r8 - 05 Dec 2006 - 11:48:54 - AndyRabagliati

NFS Mount Problems

The reason why.

NFS is a peculiar protocol because, by default, it doesn't use the 'reliable' TCP as its transport layer. In an attempt to improve performance of the filesystem it uses UDP over IP. To complicate matters even further it uses a very large UDP packet - again to improve performance. Traditionally this was 8K in size, but on the newer Linux kernels this has been increased to 32K. This causes 'IP fragmentation' on the network, and lots of it.

Network engineers take great care in their designs to avoid IP fragmentation. It is traditionally seen as a Bad Thing because it causes all sorts of problems. The mount issue is due to these problems.

Jump down to read The Gory Details

Quick hardware fixes

  1. Make sure that your clients and your server are running at the same network speed to avoid buffer overflows in the network fabric.
  2. Get a faster network card in your clients with a bigger packet buffer and put it on the PCI bus.
  3. Change your switch(es) for one(s) with proper flow control handling for the slower Ethernet speeds.
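
To compare link speeds as suggested in the first fix, something like the following sketch may help. It assumes the `ethtool` utility is installed and reports the usual `Speed:` and `Duplex:` lines; the function name `link_summary` and the interface name `eth0` are illustrative only.

```shell
# Read `ethtool <iface>` output on stdin and print "speed duplex".
# Assumes lines of the form "Speed: 100Mb/s" and "Duplex: Full".
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/  { speed = $2 }
        /^[[:space:]]*Duplex:/ { duplex = $2 }
        END { print speed, duplex }
    '
}
```

Running `ethtool eth0 | link_summary` on the server and on a client (where the tool is available) shows at a glance whether the two ends run at different speeds.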

Sort out the client

The size of an NFS UDP transaction is controlled by the client - up to the maximum size allowed by the server. If you have access to the client's /linuxrc file then you can alter the NFS mount options to force the transaction to a lower packet size. There are two ways of doing this: Reduce the UDP packet size and Run NFS over TCP.

Reduce the UDP packet size

To get the client to negotiate a particular packet size you need to specify the correct variables on the NFS mount line within /linuxrc. There are two variables rsize and wsize which tell the client and server how big the packet should be when reading and writing. Normally they are set to the same value. You need to specify both on the mount line.

Edit /linuxrc and find the 'Mounting root filesystem' section. Change it as follows:

echo "Mounting root filesystem: ${NFS_DIR} from: ${NFS_IP}"
mount -n -o nolock,ro,rsize=8192,wsize=8192 ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=4096,wsize=4096 ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=2048,wsize=2048 ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=1024,wsize=1024 ${NFS_IP}:${NFS_DIR} /mnt

Alternatively, comment out the existing mount line and uncomment one of the other ones. (You may find that ISA-based Ethernet cards work better with a 4096 or 2048 packet size.)

If you don't specify a variable on the mount line then NFS will negotiate the maximum size that the server will support.
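
To confirm which sizes were actually negotiated, the mount options the kernel reports can be inspected once the client is up. The sketch below assumes the client exposes /proc/mounts and lists `rsize=` and `wsize=` among the options of NFS mounts; the function name `nfs_block_sizes` is made up for illustration.

```shell
# Read /proc/mounts-style lines on stdin. For each NFS mount, print the
# mount point followed by whatever rsize=/wsize= options the kernel lists.
nfs_block_sizes() {
    awk '$3 == "nfs" {
        n = split($4, opt, ",")
        out = $2
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^(rsize|wsize)=/)
                out = out " " opt[i]
        print out
    }'
}
```

On a running client, `nfs_block_sizes < /proc/mounts` should then print something like `/ rsize=8192 wsize=8192` if the new mount line took effect.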

See other comments at NFS.

Run NFS over TCP

NFS was originally designed to run over UDP. One of the reasons was problems with slow TCP implementations. Of course time moves on and with faster cards, faster machines and better algorithms TCP performance is now very good indeed. TCP is able to handle the large NFS packets in new Linux kernels much more gracefully than UDP. However NFS over TCP is still relatively new and you may encounter other NFS problems - particularly if your server is not running Linux.

To get the client to use TCP rather than UDP you need to specify the correct option proto=tcp on the NFS mount line within /linuxrc.

Edit /linuxrc and find the 'Mounting root filesystem' section. Change it as follows:

echo "Mounting root filesystem: ${NFS_DIR} from: ${NFS_IP}"
mount -n -o nolock,proto=tcp,ro ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=4096,wsize=4096 ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=2048,wsize=2048 ${NFS_IP}:${NFS_DIR} /mnt
##mount -n -o nolock,ro,rsize=1024,wsize=1024 ${NFS_IP}:${NFS_DIR} /mnt
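
Before switching the clients over, it may be worth checking that the server actually offers NFS over TCP. The sketch below assumes `rpcinfo -p` is available and prints its usual columns (program, version, protocol, port, service); 100003 is the RPC program number for NFS. The function name `nfs_offered_over` and the host name `server` are placeholders.

```shell
# Read `rpcinfo -p <host>` output on stdin. Succeed (exit 0) if the NFS
# program (100003) is registered for the transport given as $1.
nfs_offered_over() {
    awk -v proto="$1" '
        $1 == 100003 && $3 == proto { found = 1 }
        END { exit !found }
    '
}
```

For example, `rpcinfo -p server | nfs_offered_over tcp && echo "TCP available"` prints the message only when the server registers NFS on TCP.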

Patch the server

If you use a distribution (like Gentoo) with 'all in one' kernels then you will not be able to easily adjust the client mount parameters and, other than rebuilding the client system, your only option is to patch the server.

You can reduce the maximum size of the NFS UDP packet by patching the server kernel and rebuilding. This will affect all clients attaching to that server. Returning the maximum value back to 8K will often help. You can go smaller if you want, but keep the values as multiples of 1024.

The file you need to edit is include/linux/nfsd/const.h. Find the line that defines the constant NFSSVC_MAXBLKSIZE and change it as follows:

    #define NFSSVC_MAXBLKSIZE         (8*1024)

Recompile your kernel and install it in the usual fashion.
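
A quick way to see the value before and after the edit is to grep for the definition. This is only a convenience sketch; the function name `show_maxblksize` is made up, and the kernel source directory you pass in (for instance /usr/src/linux) is an assumption about your setup.

```shell
# Print the line (with its line number) that defines NFSSVC_MAXBLKSIZE
# in the kernel source tree given as $1.
show_maxblksize() {
    grep -n 'define[[:space:]]*NFSSVC_MAXBLKSIZE' "$1/include/linux/nfsd/const.h"
}
```

For example, `show_maxblksize /usr/src/linux` prints the current definition so you can confirm the change before rebuilding.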

NFS dies with "Protocol not supported"

If you have built your own kernel, you should include NFSv3 support in it:

File Systems --> Network File Systems --> NFS server support --> Provide NFSv3 server support
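
To check an existing kernel configuration for this option, a grep over the config file may be enough. The sketch below assumes the menu entry above corresponds to the symbol CONFIG_NFSD_V3 (as on 2.4/2.6-era kernels) and that you have the kernel's .config to hand; the function name `nfsd_v3_enabled` is illustrative.

```shell
# Read a kernel .config on stdin. Succeed (exit 0) if NFSv3 server
# support is built in, i.e. the line CONFIG_NFSD_V3=y is present.
nfsd_v3_enabled() {
    grep -q '^CONFIG_NFSD_V3=y'
}
```

For example, `nfsd_v3_enabled < /usr/src/linux/.config || echo "NFSv3 server support missing"`, where the path is an assumption about where your kernel tree lives.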

The Gory Details

Traditionally NFS uses a block size of 8192. Recent Linux kernels on both 2.4 and 2.6 tracks have increased the maximum block size to 32768. To get this across the network it has to be broken into chunks to fit into the ~1500 bytes available in an Ethernet packet.

TCP breaks the blocks into segments. It uses a segment size calculation and path MTU discovery to try to avoid fragmentation along the route of a packet, and is able to recover from the loss of a single segment by just retransmitting that segment. Additionally, TCP is adaptive and responds to loss by slowing down transmission of segments.

UDP, however, relies upon IP to break the block into chunks. This is 'IP fragmentation'. If one of the fragments is lost then the entire block has to be retransmitted, not just the lost fragment. UDP is not adaptive and will continue to transmit at full network speed whatever happens to the fragments en route.

At an 8K block size, six fragments have to get successfully over the network. At 32K block size 23 fragments have to get over the network successfully at first attempt. The larger block size gives more scope for problems.
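
The fragment counts can be reproduced with a little shell arithmetic. This sketch assumes a 1500-byte Ethernet MTU, a 20-byte IP header without options and an 8-byte UDP header, and it ignores the RPC and NFS headers (which do not change the result for these sizes). The function name `frag_count` is made up for illustration.

```shell
# Number of IP fragments needed to carry one UDP datagram whose payload
# is $1 bytes, over a link with MTU $2 (default 1500).
frag_count() {
    payload=$1
    mtu=${2:-1500}
    # Data per fragment: MTU minus the 20-byte IP header, rounded down to
    # a multiple of 8 because fragment offsets count 8-byte units.
    per_frag=$(( (mtu - 20) / 8 * 8 ))
    # The 8-byte UDP header is part of the data that gets fragmented.
    total=$(( payload + 8 ))
    # Integer ceiling division.
    echo $(( (total + per_frag - 1) / per_frag ))
}
```

With these assumptions `frag_count 8192` prints 6 and `frag_count 32768` prints 23, while `frag_count 1024` prints 1, which is why the smallest rsize/wsize setting avoids fragmentation altogether.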

LTSP clients tend to be older machines, possibly with ISA Ethernet Cards running at 10Mb/s Half Duplex and possibly connected to the network with cheap switches. This lends itself to a couple of transmission problems.

  1. The server is connected via a much faster network system, e.g. Gigabit Ethernet. Gigabit Ethernet is by definition 100 times faster than 10Mb/s Ethernet. It can transmit the entire 23-fragment sequence before a quarter of the first fragment has been transmitted to the client. The packet sequence has to be buffered somewhere in the switch structure to give the slower network a chance to catch up. If the buffers aren't big enough then fragments will be lost.
  2. The packet buffer on older Ethernet cards is quite small - as low as 8KB. A fair chunk of this is used by the drivers as transmission buffer space, leaving space for only about 3 Ethernet packets. It only takes a short delay in getting packets off the card and into main memory for the buffer to wrap around and a packet to be lost.
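
The speed mismatch in the first point can be put into rough numbers. The sketch below only counts payload bits at the nominal line rate (1 Mb/s is one bit per microsecond) and ignores preamble, inter-frame gaps and switch latency; the function name `wire_time_us` is illustrative.

```shell
# Microseconds needed to clock $1 bytes onto a link running at $2 Mb/s.
wire_time_us() {
    echo $(( $1 * 8 / $2 ))
}
```

Under these assumptions a burst of 23 full-size frames (23 x 1500 = 34500 bytes) takes `wire_time_us 34500 1000` = 276 microseconds on Gigabit Ethernet, while a single 1500-byte frame needs `wire_time_us 1500 10` = 1200 microseconds at 10Mb/s, so nearly the whole burst has to sit in a buffer.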

If either of these problems occurs then you will likely get a transmission loop from NFS, with the server constantly trying to transmit a large sequence of fragments across a network that cannot handle them. Everything appears to stop.
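
One way to check whether fragments really are being lost is to watch the IP reassembly counters on the receiving machine. The sketch below assumes a Linux host that exposes /proc/net/snmp in its usual layout (an `Ip:` header line naming the counters, followed by an `Ip:` line of values); the function name `ip_counter` is made up for illustration.

```shell
# Read /proc/net/snmp-style text on stdin and print the value of the
# Ip counter named in $1 (for example ReasmFails).
ip_counter() {
    awk -v name="$1" '
        # First "Ip:" line is the header: remember the column of the counter.
        $1 == "Ip:" && !seen {
            for (i = 2; i <= NF; i++) if ($i == name) col = i
            seen = 1
            next
        }
        # Second "Ip:" line holds the values.
        $1 == "Ip:" && col { print $col }
    '
}
```

If `ip_counter ReasmFails < /proc/net/snmp` on a client keeps climbing while the mount hangs, datagrams are failing to reassemble, which is consistent with fragments being dropped on the way.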

Further Information

-- NeilW - 22 Dec 2004
