Enabling Infiniband on Ubuntu 10.10

I do mention in my tagline that this blog may contain some information about computers, so you’ve been warned. The following article is quite technical, and may not be of interest to the photography contingent that might otherwise read my blog articles. I posted this because I spent several days attempting a particular task, and managed to get it down to a few simple instructions. It may be useful for others attempting the same task.

I’ve recently set up an Infiniband fabric at home, and had a lot of trouble getting it working on Ubuntu. Windows 7 was a breeze: I just installed the OFED drivers from openfabrics.org.
Here are the steps to get it working on Ubuntu 10.10. It might not seem like a lot, but it took a lot of messing about to get it down to these few steps.

Install Ubuntu. Everything below is done as root – “sudo bash”.
Add a file to /etc/udev/rules.d called, say, 99-udev-umad.rules. This will cause the umad and issm device nodes to be created under /dev/infiniband, which otherwise does not happen on Ubuntu 10.10.
Insert the following:

KERNEL=="umad*", NAME="infiniband/%k", MODE="0666"
KERNEL=="issm*", NAME="infiniband/%k", MODE="0666"
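udev only understands plain ASCII double quotes, and rules pasted from a web page can pick up typographic quotes that make the rule silently fail. The following is a minimal sketch of a check for that, assuming a grep that supports the -a option (GNU and BSD grep both do). The function name has_non_ascii is made up for this example.

```shell
# has_non_ascii FILE
# Prints every line (with its line number) that contains a byte outside
# plain ASCII text, e.g. curly quotes pasted from a web page.
# Returns 0 if such a line was found, 1 if the file is clean.
has_non_ascii() {
    # LC_ALL=C makes grep work byte by byte; -a forces text mode.
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Example (path as used in this article):
#   has_non_ascii /etc/udev/rules.d/99-udev-umad.rules && echo "retype the quotes"
```

If the function prints anything, retype the quotes on those lines by hand rather than pasting them.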

Edit /etc/modules and add the following modules:

ib_sa
ib_cm
ib_umad
ib_addr
ib_uverbs
ib_ipoib
ib_ipath
ib_qib
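After the next boot it can be handy to confirm that everything listed in /etc/modules actually loaded. Here is a small sketch that compares the list against /proc/modules. The function name missing_modules is hypothetical, and note that a driver compiled into the kernel (rather than built as a module) will not appear in /proc/modules, so it would be reported as missing even though it is present.

```shell
# missing_modules WANTED_FILE [LOADED_FILE]
# Prints each module named in WANTED_FILE (one per line, '#' comments
# allowed) that has no entry in LOADED_FILE (default: /proc/modules).
missing_modules() {
    mm_wanted=$1
    mm_loaded=${2:-/proc/modules}
    while read -r mm_mod mm_rest; do
        case $mm_mod in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # /proc/modules lines start with the module name followed by a space
        grep -q "^$mm_mod " "$mm_loaded" || echo "$mm_mod"
    done < "$mm_wanted"
}

# Example:
#   missing_modules /etc/modules
```

No output means every listed module is loaded. Not every card needs every driver in the list (ib_ipath and ib_qib are for QLogic hardware), so a missing hardware driver is not necessarily a problem.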

Next, “apt-get install opensm”. This will install the subnet manager and all the relevant dependencies, libibverbs, etc.

Then add the relevant entries for the interface to the /etc/network/interfaces file:

auto ib0
iface ib0 inet static
address 192.168.1.1
netmask 255.255.255.0

Then reboot. This will create the relevant Infiniband entries in /sys, load the IPoIB modules, and bring up the Infiniband port with an IP address.

You should now have a functioning infiniband port on your Ubuntu machine.
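One way to check the result is to read the port state that the kernel exposes under /sys/class/infiniband. This is only a sketch: the function name ib_port_states is made up for this example, and the optional argument exists purely so that the function can be pointed at a different directory.

```shell
# ib_port_states [ROOT]
# Prints one line per Infiniband port found under ROOT
# (default: /sys/class/infiniband), e.g. "mthca0 port 1: 4: ACTIVE".
ib_port_states() {
    ips_root=${1:-/sys/class/infiniband}
    for ips_f in "$ips_root"/*/ports/*/state; do
        [ -r "$ips_f" ] || continue          # no adapters found
        ips_port=${ips_f%/state}
        ips_port=${ips_port##*/}             # port number
        ips_hca=${ips_f#"$ips_root"/}
        ips_hca=${ips_hca%%/*}               # adapter name
        printf '%s port %s: %s\n' "$ips_hca" "$ips_port" "$(cat "$ips_f")"
    done
}

# Example:
#   ib_port_states
#   ip addr show ib0
```

A port normally only reaches the ACTIVE state once a subnet manager such as opensm is running somewhere on the fabric. No output at all means no adapter was registered, which usually points back at the module list.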

However, there is still some more investigation to be done. Initially, when I was mixing custom kernels with OFED drivers and stock Linux kernel drivers, netperf was showing 7 Gbps throughput. With the configuration above, it’s down at about 25 Mbps. Dreadfully slow. I’ll have to find out what optimisations (or other drivers) are needed in order to get the speed back up to 7 Gbps.

–EDIT–

Note: iperf maxed out at 1.2 Gbps, and on the current Linux install, I couldn’t get the netperf client working at all. netserver would work, but only showed a throughput of 25 Mbps from a Win7 client. HOWEVER, when I set up the RAID with 6 old 160G drives, “hdparm -t /dev/md0p1” showed 250MB/sec reads, and I got the same from the Win7 machine using Samba across the Infiniband fabric. This seems to indicate that iperf and netperf are completely unreliable for testing this type of connection. Bear in mind though that I did have netperf running on the previous Ubuntu install, but that installation was so messy I don’t know what drivers and user-space software were running. I reckon it’s the kind of thing that may be fixed in the stock Ubuntu install in the near future. For the moment, just go with real-world testing, e.g. copying large files from ramdisk to ramdisk.
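For scale, 250 MB/sec is 250 x 8 = 2000 Mbit/s, or roughly 2 Gbps, which is well above what iperf reported. A rough sketch of the copy test suggested above could look like this. The function name copy_rate and the paths in the example are placeholders, timing is only accurate to whole seconds, and the numbers are only meaningful for files large enough to take several seconds to copy.

```shell
# copy_rate SRC DST
# Copies SRC to DST and prints two numbers: the approximate rate in
# MB/s and the same rate in Mbit/s (MB/s times 8).
copy_rate() {
    cr_bytes=$(wc -c < "$1")
    cr_start=$(date +%s)
    cp "$1" "$2" && sync        # sync so cached writes are counted too
    cr_end=$(date +%s)
    cr_secs=$((cr_end - cr_start))
    [ "$cr_secs" -gt 0 ] || cr_secs=1     # avoid dividing by zero
    cr_mb=$((cr_bytes / 1000000 / cr_secs))
    echo "$cr_mb $((cr_mb * 8))"
}

# Example (hypothetical paths: a local ramdisk and a share mounted
# over the Infiniband link):
#   copy_rate /mnt/ramdisk/big.bin /mnt/remote/big.bin
```

Copying from ramdisk to ramdisk keeps disk speed out of the measurement, so the result reflects the link and the file-sharing protocol rather than the drives.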


One Comment

  1. Greg May 15, 2011 at 6:47 am #

    I used your instructions (almost) blindly and could not understand why

    opensm -o

    always failed with the error

    [B76C26C0] 0x01 -> osm_vendor_open_port: ERR 542C: umad_open_port() failed

    Verbose output told me that

    umad_open_port: open /dev/infiniband/umad0 failed

    Further investigation showed that umad device was indeed missing. I played around with the rule file but it took some time to realize that I copied the rule from the web page where quotes are encoded differently! I was u-mad!

    After I changed quotes by entering them with keyboard everything worked and SM started just fine!
