Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Edward Ned Harvey
> From: Patrick Cable [mailto:p...@pcable.net]
> 
> The FPGA receives data from various "controlling
> electronics" and then buffers it and pushes it off over ethernet to
> the desktop. 

...

> The FPGA is sending data to the desktop at 940 bytes at 625Hz, which
> is how we arrive at ~4.6 or 4.7 megabits (I imagine with TCP overhead)
> a second.

So the FPGA is sending Ethernet traffic to your desktop, and then the desktop is
sending via the same Ethernet cable to the NFS server?  Although I agree the
data rate is small enough that it should be able to handle it ... Maybe you
want to try just adding a 2nd Ethernet card to the workstation.  Let the
FPGA write to the workstation via eth0, and let the workstation write to the
NFS server via eth1.

A single Ethernet interface can't read & write at the same time if it has
negotiated half duplex, so check that.  Otherwise you might just be suffering
due to lack of buffering capabilities...  Either in the workstation NIC, or
the FPGA.


___
Tech mailing list
Tech@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Brad Knowles
On Sep 27, 2010, at 11:41 AM, Patrick Cable wrote:

> The problem is space on the desktop and that the data needs to be
> centrally accessible in real time. And that they didn't have $100k for
> a NAS.

Well, you'll never get "real time" unless you use a "real" RTOS in all phases 
of the process, and NFS sure as hell ain't gonna qualify as any part of an RTOS 
solution.  Now, if "near real time" is close enough, then you may have 
sufficient "wiggle room" elsewhere in the system.

Besides, if you're reading from the local disk on the workstation fast enough, 
that file storage can be kept to a minimum and you won't need to spend much 
money on disk space to accommodate it.  You might even be able to put that on a 
high-speed SSD, or even a RAM disk (if the workstation is on battery backup).

The end result would be that you can essentially guarantee no loss of data 
between the FPGA and the workstation, and then the process of serving that data 
across the network to the file server would actually be very low overhead and 
very high speed, because you're only going to have one process writing to the 
local datastore from the FPGA (which is the slow/expensive part) and all the 
read operations across the network will be asynchronous and very low impact.

> RHEL is terribly old, but I am using 5.5 at the very least.
> I may be able to use RHEL6 on the data server; supposedly that's
> coming out next month.

It sounds to me like your problem occurs before the information gets to the 
data server, so no amount of "cure tonnage" you try to apply to the situation at 
that point is likely to have much impact.  I think you're much better off 
trying to find unconventional ways to apply a few ounces of prevention earlier 
in the pipeline.

--
Brad Knowles 


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Patrick Cable
On Mon, Sep 27, 2010 at 11:43 AM, Brad Knowles  wrote:
> Here's a question for you -- can you split these processes?  I.e., read data 
> from the FPGA and write it to local disk on the workstation, then have a 
> separate process read the data from the workstation and write that to the NFS 
> fileserver?  Heck, you could even turn the workstation into its own NFS 
> fileserver where the data is *READ* across the network by the central 
> fileserver, and then written to wherever it needs to go.

The problem is space on the desktop and that the data needs to be
centrally accessible in real time. And that they didn't have $100k for
a NAS.

> you just happen to have chosen a particular mix of software that is
> particularly poorly suited to this application.

It would seem so, yes.

> the particular age of the base OS you have installed,

RHEL is terribly old, but I am using 5.5 at the very least.
I may be able to use RHEL6 on the data server; supposedly that's
coming out next month.



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Patrick Cable
Sorry for the double email Ed, I'm having a Reply-to-all forgetful day today.

On Mon, Sep 27, 2010 at 11:25 AM, Edward Ned Harvey
 wrote:
> Let's have a little more detail ... You have an FPGA board connected to a
> workstation via ... USB I guess... and the workstation is reading data from
> the USB and writing to some file or directory. Right?

The FPGA is part of a larger control board that pushes data off
ethernet. The FPGA receives data from various "controlling
electronics" and then buffers it and pushes it off over ethernet to
the desktop. Everything is ethernet, besides the link between the CE
and the FPGA (RS422 there, I believe)

The FPGA is sending data to the desktop at 940 bytes at 625Hz, which
is how we arrive at ~4.6 or 4.7 megabits (I imagine with TCP overhead)
a second.
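
The arithmetic behind that figure can be checked with a short sketch. The
payload size and packet rate are the ones quoted above; the per-packet header
overhead is an assumption (plain IPv4 and TCP headers plus Ethernet framing),
not something stated in the thread.

```python
# Figures from the thread: 940-byte payloads sent 625 times per second.
PAYLOAD_BYTES = 940
PACKETS_PER_SECOND = 625

# Assumed per-packet overhead (not stated in the thread): 20-byte IPv4 header,
# 20-byte TCP header without options, 18 bytes of Ethernet header and FCS.
ASSUMED_OVERHEAD_BYTES = 20 + 20 + 18


def rate_megabits(bytes_per_packet: int, packets_per_second: int) -> float:
    """Return the data rate in megabits per second (1 Mbit = 10**6 bits)."""
    return bytes_per_packet * packets_per_second * 8 / 1_000_000


payload_rate = rate_megabits(PAYLOAD_BYTES, PACKETS_PER_SECOND)
wire_rate = rate_megabits(PAYLOAD_BYTES + ASSUMED_OVERHEAD_BYTES, PACKETS_PER_SECOND)

print(f"payload only:         {payload_rate:.2f} Mbit/s")
print(f"with assumed headers: {wire_rate:.2f} Mbit/s")
```

Whatever the exact overhead turns out to be, the stream stays in the single
digits of megabits per second, a small fraction of a gigabit link.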

> I have another suggestion ... If either your USB or ethernet are hogging
> channels, such as IRQ's, then only one device might be able to work at a time,
> and while you might benefit by write buffering in memory of the local disks,
> you might not have that benefit writing to the Ethernet.

This is a possibility.

> You should be able to get a little more information about the cause of the
> problem ... be it hardware or ethernet specific, or nfs-specific ... by
> trying something like a pipe to "ssh some machine 'cat > somefile'" ...
> eliminate nfs as a variable, etc.  Or try cifs.  (I think somebody suggested
> that before.)

That's also next on the list.

> It's all about what your diagnostic process is going to be... What's the
> logic you're going to follow, to isolate the cause of the problem...
>
> Or just try a few things and see if they work.   ;-)

I try to document the things I try at the very least ;)

> What kind of workstation is it?  You might benefit by going into BIOS and
> disabling all unnecessary devices.  (Sound, parallel port, etc.)  You might
> benefit by adding a newer/better usb card or ethernet card.  Which could
> possibly have some feature such as DMA which the old one could possibly be
> lacking.

It's a new Dell Precision T3500 with an X5650 and 12 gigs of ram.
Maybe I need to add a separate NIC; I think that'll be near the end of
the things that I try, but we'll see.


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Brad Knowles
On Sep 27, 2010, at 9:43 AM, Patrick Cable wrote:

> ALSO. I am a giant turkey and should clarify that the FPGA is sending
> data out at 4.76 mega*bits* a second (my engineer said bytes... iftop
> shows bits, someone else confirms). Still nothing that NFS should
> choke over, just the writes aren't fast enough.
> 
> A basic diagram of how this works -
> Controlling Electronics ---> Modem FPGA ---> desktop --NFS--> dataserver

Here's a question for you -- can you split these processes?  I.e., read data 
from the FPGA and write it to local disk on the workstation, then have a 
separate process read the data from the workstation and write that to the NFS 
fileserver?  Heck, you could even turn the workstation into its own NFS 
fileserver where the data is *READ* across the network by the central 
fileserver, and then written to wherever it needs to go.

I ask because writes to a local filesystem should be much faster (and more 
deterministic) than writes to an NFS fileserver, at least when the NFS write 
situation is less than ideal.  That, and the fact that NFS reads are pretty 
much always much, much faster (and less problematical) than NFS writes.
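
A minimal sketch of the same decoupling idea, with an in-memory queue standing
in for the local disk spool: one thread only receives, another only writes to
the slow destination, so a stall on the write side does not block the receive
side until the queue is full. The queue depth, the function names, and the
synthetic records are all hypothetical; the real receiver would read from the
FPGA socket instead.

```python
import os
import queue
import tempfile
import threading

RECORD_SIZE = 940  # bytes per record, as quoted in the thread


def drain_to_file(spool: queue.Queue, path: str) -> int:
    """Write records from the spool to path until a None sentinel arrives."""
    written = 0
    with open(path, "ab") as out:
        while True:
            record = spool.get()
            if record is None:
                break
            out.write(record)
            written += len(record)
    return written


def run_demo(path: str, records: int = 1000) -> int:
    """Feed synthetic records through the spool and return bytes written."""
    spool = queue.Queue(maxsize=10_000)  # hypothetical spool depth
    result = {}
    writer = threading.Thread(
        target=lambda: result.update(written=drain_to_file(spool, path))
    )
    writer.start()
    for _ in range(records):
        spool.put(b"\x00" * RECORD_SIZE)  # stand-in for data read from the FPGA
    spool.put(None)  # tell the writer to finish
    writer.join()
    return result["written"]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "spool.bin")
    print(run_demo(target), "bytes written to", target)
```

Pointing `path` at the NFS mount would make the writer thread the only part of
the program that ever waits on NFS.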

> Unfortunately, upgrading this NFS server won't work. We purchased an
> X86 machine full of disks because we can't really afford a netapp or
> other storage, but I don't think we're throwing anything at it that it
> can't do. They're all sequential writes.

I don't think this is a hardware problem.  I think this is a software problem, 
and you just happen to have chosen a particular mix of software that is 
particularly poorly suited to this application.

The base OS itself is only part of the problem -- the other part of the problem 
is the particular age of the base OS you have installed, because I think a lot 
of these issues may have been resolved or at least significantly improved in 
more recent releases of the same OS.

--
Brad Knowles 


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Edward Ned Harvey
> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
> Of Patrick Cable
> 
> ALSO. I am a giant turkey and should clarify that the FPGA is sending
> data out at 4.76 mega*bits* a second (my engineer said bytes... iftop
> shows bits, someone else confirms). Still nothing that NFS should
> choke over, just the writes aren't fast enough.
> 
> A basic diagram of how this works -
> Controlling Electronics ---> Modem FPGA ---> desktop --NFS-->
> dataserver

Let's have a little more detail ... You have an FPGA board connected to a
workstation via ... USB I guess... and the workstation is reading data from
the USB and writing to some file or directory. Right?

I have another suggestion ... If either your USB or ethernet are hogging
channels, such as IRQ's, then only one device might be able to work at a time,
and while you might benefit by write buffering in memory of the local disks,
you might not have that benefit writing to the Ethernet.

You should be able to get a little more information about the cause of the
problem ... be it hardware or ethernet specific, or nfs-specific ... by
trying something like a pipe to "ssh some machine 'cat > somefile'" ...
eliminate nfs as a variable, etc.  Or try cifs.  (I think somebody suggested
that before.)

It's all about what your diagnostic process is going to be... What's the
logic you're going to follow, to isolate the cause of the problem...

Or just try a few things and see if they work.   ;-)

What kind of workstation is it?  You might benefit by going into BIOS and
disabling all unnecessary devices.  (Sound, parallel port, etc.)  You might
benefit by adding a newer/better usb card or ethernet card.  Which could
possibly have some feature such as DMA which the old one could possibly be
lacking.



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-27 Thread Patrick Cable
Whoops. Reply to one != reply to all.

On Sat, Sep 25, 2010 at 9:37 AM, Giovanni Tirloni  wrote:
> What options are you using to mount the NFS share ? Depending on your safety
> requirements, you might want to try async/sync and different rsize/wsize
> values.

I'm using a rsize/wsize of 32768. The drive is mounted async.

ALSO. I am a giant turkey and should clarify that the FPGA is sending
data out at 4.76 mega*bits* a second (my engineer said bytes... iftop
shows bits, someone else confirms). Still nothing that NFS should
choke over, just the writes aren't fast enough.

A basic diagram of how this works -
Controlling Electronics ---> Modem FPGA ---> desktop --NFS--> dataserver

To clarify on James Grinter's point ("Are these dropped packets on the
NIC stats, or reported at the RPC/NFS level?") - The FPGA has some
sort of buffer that is filling with data from the controlling
electronics and dropping packets before they get sent over to the
desktop, because the desktop isn't writing fast enough. Changing the
TCP window settings on the client and host has brought that number
down *significantly*, to where the FPGA had only dropped 11 packets out
of like 4+, but we're looking for 100% retention.
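
A rough way to reason about that buffer: at a fixed input rate, its size sets
how long the desktop can stall before data is lost. The sketch below uses the
input rate from the thread; the buffer sizes are hypothetical, since the actual
FPGA buffer size is not given here.

```python
def stall_tolerance_seconds(buffer_bytes: int, bytes_per_second: float) -> float:
    """Return how long the consumer can stall before the buffer overflows."""
    return buffer_bytes / bytes_per_second


INPUT_RATE = 940 * 625  # bytes per second, from the figures in the thread

for size_kib in (64, 256, 1024):  # hypothetical buffer sizes
    tolerance = stall_tolerance_seconds(size_kib * 1024, INPUT_RATE)
    print(f"{size_kib:>5} KiB buffer absorbs a stall of {tolerance * 1000:.0f} ms")
```

If the real buffer is small, even brief pauses in the desktop's writes would be
enough to overflow it, which would fit the bursty pattern described.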

From this thread, I think my next steps are:

1) Modeling IO with iozone
2) Using wireshark to monitor window zero/full errors.
3) Tuning TCP from there.

Unfortunately, upgrading this NFS server won't work. We purchased an
X86 machine full of disks because we can't really afford a netapp or
other storage, but I don't think we're throwing anything at it that it
can't do. They're all sequential writes.

I'd prefer to stick with one OS in this lab for many reasons, but if
it turns out I have to switch for this minuscule amount of streamed
data, so be it.


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes [SEC=UNCLASSIFIED]

2010-09-26 Thread James R Grinter
The "device reports dropped packets" - I'd ask the original poster - is it 
definitely a TCP based NFS mount? Are these dropped packets on the NIC 
stats, or reported at the RPC/NFS level?

If it's any useful comparison, a completely untuned CentOS 5.4 system that I 
have - with Broadcom Gbit NIC, can write nearly 60MB/sec over a 10 second 
period to a Sun 7110 NFS server (OpenSolaris based, FWIW) without a problem. 
(I just tested by dd'ing /dev/zero to a file with a 1MB block size. No jumbo 
frames involved.)

James.


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes [SEC=UNCLASSIFIED]

2010-09-26 Thread Robinson, Greg
UNCLASSIFIED

Hi all,

Coming in late to this conversation...

It seems to me, that data going in to the workstation is ok, to local
disk, but data going out, to a NFS share is the slow part.

Have you considered adding another NIC to the workstation, and making
that new one dedicated to receiving the data, and the primary one
dedicated to delivering the data to the NFS server, as well as
everything else network related?

This would eliminate potential contention issues on the wire, and
clearly point the finger to where the issue lies.
Others have suggested good points for the NFS server to consider...

Greg.

-Original Message-
From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
Of Patrick Cable
Sent: Friday, 24 September 2010 11:18 AM
To: tech@lopsa.org
Subject: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty
writes

I have a device that sends out information at 4.7 Megabytes a second.
I have a desktop that receives the data from this device that runs Red
Hat Enterprise Linux 5.5. They are on the same switch, a 24-port Juniper
EX2200.

When I write the data to the desktop on the local filesystem, there's no
dropped information. When I write the data to an NFS share, the device
reports dropped packets.

I have tried playing with the rsize/wsize NFS parameters (8192K seems to
be the best value), and values in
/proc/sys/net/core/{r,w}mem_{default,max} and increasing the NFS daemon
count, as suggested by
http://nfs.sourceforge.net/nfs-howto/ar01s05.html. Very similar results
across the board.

The NFS server also runs RHEL5.5. It's got 11 600gb 15k SAS drives in a
hardware RAID6 array. Running 'iftop' on the machine during the data
gathering operations, I'll see bursty traffic... that is to say,
workstation -> NFS server traffic will be in the high 40mb/sec rate,
then slow down, and once it slows down the device I refer to complains
of dropped information then it'll speed up again.

I find it hard to believe that a machine on the same (recent, gigabit)
switch can't write out 4.7MB/sec. Am I wrong?

Does anyone have any NFS or TCP tuning recommendations that may be a
little more up to date than the NFS howto that was last updated in 2006?
I'm really at a loss here.

Thank you, more than a lot, in advance..

- Pat

IMPORTANT: This email remains the property of the Department of Defence
and is subject to the jurisdiction of section 70 of the Crimes Act 1914.
If you have received this email in error, you are requested to contact
the sender and delete the email.




Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-26 Thread Brandon S Allbery KF8NH
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 9/24/10 12:12 , Andrew Keen wrote:
>2. NFS over UDP probably won't help, and it has its own issues. See
>   WARNINGS under nfs(5)

The primary WARNING that needs to be in there on Linux is "Try FreeBSD
instead."  (I'd've suggested Solaris, but f*** you too, Horricle.)

- -- 
brandon s. allbery [linux,solaris,freebsd,perl]  allb...@kf8nh.com
system administrator  [openafs,heimdal,too many hats]  allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university  KF8NH
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkyfvzYACgkQIn7hlCsL25Wy5gCeMPVuvnkUHaZyM6DfiY2+MfXe
PV8An0iGzd7DqE8mCfNkMJT4kSaNN9vd
=SQBQ
-END PGP SIGNATURE-


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-26 Thread Phil Pennock
On 2010-09-25 at 02:03 -0500, Brad Knowles wrote:
> Yup.  The mmap() system call on NFS is undefined, which means that Berkeley 
> DB, and any other software that uses mmap() cannot safely be used on NFS.  
> This is supposedly fixed in the latest versions of NFS, but I have yet to 
> have that claim actually demonstrated to me.

My recollection is that it's actually the flushing behaviour of writes
to mmap()d memory that is undefined and not flushable and so can't be
combined with locks, of any kind, for protecting access to data.  You
also can't predict order-of-writes, etc.

So mmap() works just fine, as long as you're very certain that only one
host is ever going to be accessing the file at any time.  So a single
dedicated host, with manual failover to a hot standby after taking down
the normal server, would work.

I'm not saying that I recommend this, but knowing the actual limitations
can make a difference when you're trying to duct-tape your way out of a
problem.
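
For illustration, here is a minimal single-writer sketch in Python, where
`mmap.flush()` is the explicit write-back step (it maps to msync() on Unix).
The function name is hypothetical, and the sketch says nothing about what other
NFS clients would see; it only shows the single-host pattern described above.

```python
import mmap


def update_mapped(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes in an existing, non-empty file through a memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapped:  # length 0 maps the whole file
            mapped[offset:offset + len(data)] = data
            mapped.flush()  # write the dirty pages back to the file
```

The slice assignment must not change the file's length, so `offset + len(data)`
has to stay within the existing file size.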

-Phil


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-25 Thread Giovanni Tirloni
On Fri, Sep 24, 2010 at 11:27 AM, Patrick Cable  wrote:

> On Fri, Sep 24, 2010 at 12:37 AM,   wrote:
> > What's going on over the wire?
> > Something wiresharky: Statistics -> TCP stream graph ->
> > throughput graph. Might be revealing.
>
> I was running iftop to get a handle on what's going on over the wire.
> Not as in depth as wireshark, but it gives me a graph and more data
> to start with. I ran it on the desktop receiving information from the
> FPGA and the data server receiving data from the desktop.
>
> On the desktop:
> - FPGA pushes data at a constant 4.76MB/sec, desktop receives that
> just fine, no burstiness.
> - The desktop seems to push data to nfs at anywhere from the
> 14.6Mb/sec to 29Mb/sec
> - The desktop receives NFS data at about 22.7Mb/sec constant
> (sometimes it jumps to 24.7Mb/sec, but it's rare)
>
> On the server:
> - Server is sending NFS data to desktop at 22.6Mb/sec (I'm not going
> to sweat over 0.1Mb/sec difference)
> - Server is receiving NFS data with the same kind of bursts.
>
> It seems like the burstiness continues, but the dropped packet rate
> reduces dramatically if the following are set:
> /sbin/sysctl net.core.rmem_max=33554432
> /sbin/sysctl net.core.wmem_max=33554432
> /sbin/sysctl net.ipv4.tcp_rmem="4096 87380 33554432"
> /sbin/sysctl net.ipv4.tcp_wmem="4096 65536 33554432"
> /sbin/sysctl net.ipv4.tcp_no_metrics_save=1
>
> which points to issues with the TCP stack.
>
> Is NFS over UDP worth trying, or am I going to run into similar
> things, except not with TCP retransmit but with NFS UDP retransmit?
>
>
Wireshark does a good job in pointing out what's wrong with TCP sessions.
Keep an eye for window zero/full errors.

What options are you using to mount the NFS share ? Depending on your safety
requirements, you might want to try async/sync and different rsize/wsize
values.

You can also try to get some statistics (IOPS and throughput) from iozone
writing to the mount point. Remember to adjust record size and thread count
to your reality.
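
As a much cruder complement to iozone, a small probe can time individual
record-sized writes, since the worst-case stall matters more here than average
throughput. This is only a sketch: the function name is hypothetical, and the
defaults simply mirror the 940-byte, 625-per-second figures from the thread.

```python
import time


def probe_write_latency(path: str, record_size: int = 940, count: int = 625):
    """Append count records to path and return (max, mean) seconds per write."""
    record = b"\x00" * record_size
    latencies = []
    with open(path, "ab", buffering=0) as out:  # unbuffered: one write() per record
        for _ in range(count):
            start = time.perf_counter()
            out.write(record)
            latencies.append(time.perf_counter() - start)
    return max(latencies), sum(latencies) / len(latencies)
```

Running it once against a file on local disk and once against a file on the
NFS mount should show whether the NFS path has occasional long writes that the
local path does not.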

-- 
Giovanni Tirloni
gtirl...@sysdroid.com


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-25 Thread Edward Ned Harvey
> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
> Of Doug Hughes
> 
> RAID-5/6 are bad at IOPS. You get the equivalent IOPS of one disk
> (because all disks have to act in synchronicity). But, for throughput,
> they are just fine. As long as you're writing more than (and preferably
> in multiples of) the stripe width, there's no penalty with a modern
> processor. zoom zoom for large sequential throughput.

I find this a gray area.  Below are the results of benchmarks I did in
November.  As you can see, my raid5 which should perform like 4 disks ...
performed like 3.5 for reads, and 2.2 for writes.

While that is an awful lot of overhead, the raid5 & 6 were actually better
for sequential operations than the equivalent raid-10 setup.

Conclusion:  raid-10 is better for random ops, while raid5/6 is better for
sequential ops.

1 disk:  
1.0 read G/s sustained
1.0 write G/s sustained

2-disk mirror:
1.7 read G/s sustained
1.0 write G/s sustained

3-disk mirror:
2.5 read G/s sustained
1.0 write G/s sustained

2 mirrors striped:
3.3 read G/s sustained
1.5 write G/s sustained

3 mirrors striped:
5.4 read G/s sustained
1.5 write G/s sustained

raid5 (5 disks capacity of 4)
3.5 read G/s sustained
2.2 write G/s sustained

raid6 (6 disks capacity of 4)
3.4 read G/s sustained
1.9 write G/s sustained
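
To put the raid5 and raid6 numbers in proportion, the sketch below divides each
measured figure by the ideal of 4 (both arrays are listed as "capacity of 4").
Treating 4 as the ideal is the same assumption made in the text above; the
figures are copied from the list, nothing is newly measured.

```python
# Figures copied from the raid5 and raid6 entries above: (read, write).
MEASURED = {"raid5": (3.5, 2.2), "raid6": (3.4, 1.9)}
DATA_DISKS = 4  # both arrays are described as "capacity of 4"


def efficiency(measured: float, ideal: float) -> float:
    """Return measured throughput as a fraction of the ideal figure."""
    return measured / ideal


for level, (read, write) in MEASURED.items():
    print(
        f"{level}: read {efficiency(read, DATA_DISKS):.0%} of ideal, "
        f"write {efficiency(write, DATA_DISKS):.0%} of ideal"
    )
```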




Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-25 Thread Brad Knowles
On Sep 23, 2010, at 9:23 PM, Patrick Cable wrote:

> The documentation for this application mentions the following
> particular issues with NFS:
> - NFS can't make FIFOs (which is mitigated by the fact we store FIFOs
> on a local partition)

This is a valid issue.

> - "More important, SOFTWARE uses the mmap() system call on some of
> the files in the output directory, and mmap() isn't reliable on an NFS
> mounted filesystem" (mitigated by.. I'm not sure, I did some googling
> and that seemed to be fixed in later kernel versions, i.e. *not*
> RHEL3).

Yup.  The mmap() system call on NFS is undefined, which means that Berkeley DB, 
and any other software that uses mmap() cannot safely be used on NFS.  This is 
supposedly fixed in the latest versions of NFS, but I have yet to have that 
claim actually demonstrated to me.

If you need these things, then the solution is to simply not use NFS.  Try a 
different network file storage solution that might work better for you -- 
perhaps iSCSI, so that it looks like a local disk to the machine where the 
volume is being mounted?

Of course, what alternatives are available to you (e.g., iSCSI) will depend 
entirely on what your options are for the back-end fileserver platform, what 
version of what OS it runs, and if you can't get what you want with what you 
have today, then what upgrade options you have available to you.

--
Brad Knowles 


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Brad Knowles
On Sep 24, 2010, at 10:47 PM, Doug Hughes wrote:

> RAID-5/6 are bad at IOPS. You get the equivalent IOPS of one disk 
> (because all disks have to act in synchronicity). But, for throughput, 
> they are just fine. As long as you're writing more than (and preferably 
> in multiples of) the stripe width, there's no penalty with a modern 
> processor. zoom zoom for large sequential throughput.

Actually, everything depends on the underlying implementation and how much 
logic is layered on top of the physical configuration, and how much that logic 
isolates or exposes you to the quirks of the underlying technology.


I can tell you that the Linux NFS server has historically had quite a few 
issues with performance and reliability, even when used with other Linux 
clients (and no cross-platform issues to be concerned with), and when used with 
settings that are "optimal" for that particular pattern of data access.

Solutions to this particular problem space are going to depend on what options 
are available.  Can you do a forklift upgrade of the Linux NFS server for 
something that actually does reasonably well in that job, such as another piece 
of relatively generic hardware running a relatively general-purpose OS like 
Solaris or FreeBSD?  Is replacing the Linux NFS server with an appliance (e.g., 
NetApp) an option?

If the only option is what mount options you have to play with, then you may be 
able to get a certain way down the road, but you're going to reach that 
dead-end pretty quickly and you're not likely to be able to go past that point 
without looking at some significantly more expensive options.

--
Brad Knowles 


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Doug Hughes
  On 9/24/2010 9:47 PM, Tom Limoncelli wrote:
> On Fri, Sep 24, 2010 at 8:22 PM, Edward Ned Harvey  
> wrote:
>>> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
>>> Of Andrew Keen
>>>
>>> RAID6 has terrible write performance.
>>> Try RAID10.
>> Only true for random writes/reads.  Raid6 does fine for continuous IO.
> Umm... I believe that RAID6 has just as bad write performance as RAID5.
>
> RAID0 and RAID10 are really the only levels with good write performance.
>
> Tom
>
RAID-5/6 are bad at IOPS. You get the equivalent IOPS of one disk 
(because all disks have to act in synchronicity). But, for throughput, 
they are just fine. As long as you're writing more than (and preferably 
in multiples of) the stripe width, there's no penalty with a modern 
processor. zoom zoom for large sequential throughput.
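
To make "multiples of the stripe width" concrete, here is a small sketch for an
11-disk RAID6 like the one described in the original post. The 64 KiB chunk
size is purely an assumption for illustration; the real value depends on how
the controller was configured.

```python
def full_stripe_bytes(total_disks: int, parity_disks: int, chunk_bytes: int) -> int:
    """Return the amount of data held by one full stripe (parity excluded)."""
    return (total_disks - parity_disks) * chunk_bytes


def is_full_stripe_write(write_bytes: int, stripe_bytes: int) -> bool:
    """Return True when a write covers a whole number of stripes."""
    return write_bytes > 0 and write_bytes % stripe_bytes == 0


ASSUMED_CHUNK = 64 * 1024  # hypothetical chunk size, not taken from the thread
stripe = full_stripe_bytes(total_disks=11, parity_disks=2, chunk_bytes=ASSUMED_CHUNK)

print(f"full stripe: {stripe} bytes")
print("32768-byte write fills a stripe:", is_full_stripe_write(32768, stripe))
```

Writes smaller than a full stripe are the case where the controller has to
read existing data or parity before it can write, unless its cache can
coalesce them first.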


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Tom Limoncelli
On Fri, Sep 24, 2010 at 8:22 PM, Edward Ned Harvey  wrote:
>> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
>> Of Andrew Keen
>>
>> RAID6 has terrible write performance.
>> Try RAID10.
>
> Only true for random writes/reads.  Raid6 does fine for continuous IO.

Umm... I believe that RAID6 has just as bad write performance as RAID5.

RAID0 and RAID10 are really the only levels with good write performance.

Tom

-- 
http://EverythingSysadmin.com  -- my blog (new posts Mon and Wed)
http://www.TomOnTime.com -- my advice (more videos coming soon)



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Edward Ned Harvey
> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
> Of Andrew Keen
> 
> RAID6 has terrible write performance.
> Try RAID10.

Only true for random writes/reads.  Raid6 does fine for continuous IO.




[lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Andrew Keen

 Two more things:

  1. It shouldn't matter as your configuration should be more than
 sufficient to sustain 5 MB/s, but RAID6 has terrible write
 performance. Try RAID10.
  2. NFS over UDP probably won't help, and it has its own issues. See
 WARNINGS under nfs(5)

-Andy


[lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Andrew Keen

 Some suggestions.

  1. Try CIFS (via Samba on the RHEL server) instead of NFS. This may
 indicate if it is a problem with NFS or at a lower layer.
  2. Any hardware problems? Link errors or dropped packets on the
 interface? I've seen some cheap NICs that can't handle high packet
 rates.
  3. async NFS mounts are the default on Linux. This can cause problems
 if the device's software assumes that as soon as a write() call
 returns the file is updated, which is not guaranteed with the
 Linux NFS server, IIRC. Try sync mounts. Increasing the number of
 NFSd servers may increase throughput but it may also increase the
 possibility that a read() from the client is serviced before the
 write() is committed.
  4. Try creating a large file in the NFS share, creating a file system
 on it, and mounting it directly via loopback on the client.
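
On point 3, software that needs a write to be on stable storage before it
carries on can ask for that explicitly instead of relying on mount options. A
minimal sketch follows; the function name is hypothetical, and the extra
latency of `os.fsync()` on an NFS mount should be measured before using this
in the data path.

```python
import os


def durable_append(path: str, data: bytes) -> None:
    """Append data and return only after the kernel reports it flushed."""
    with open(path, "ab") as out:
        out.write(data)
        out.flush()             # push Python's userspace buffer to the kernel
        os.fsync(out.fileno())  # ask the kernel to commit the data to storage
```

Calling this for every 940-byte record would probably be far too slow; batching
many records per call is the more likely fit.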

-Andy


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Patrick Cable
On Fri, Sep 24, 2010 at 12:37 AM,   wrote:
> What's going on over the wire?
> Something wiresharky: Statistics -> TCP stream graph ->
> throughput graph. Might be revealing.

I was running iftop to get a handle on what's going on over the wire.
Not as in depth as wireshark, but it gives me a graph and more data
to start with. I ran it on the desktop receiving information from the
FPGA and the data server receiving data from the desktop.

On the desktop:
- FPGA pushes data at a constant 4.76MB/sec, desktop receives that
just fine, no burstiness.
- The desktop seems to push data to nfs at anywhere from
14.6Mb/sec to 29Mb/sec
- The desktop receives NFS data at about 22.7Mb/sec constant
(sometimes it jumps to 24.7Mb/sec, but it's rare)

On the server:
- Server is sending NFS data to desktop at 22.6Mb/sec (I'm not going
to sweat over a 0.1Mb/sec difference)
- Server is receiving NFS data with the same kind of bursts.
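
As a sanity check on the units in these figures: elsewhere in this thread
the FPGA's output is described as 940 bytes at 625Hz. A quick calculation
from just those two numbers (payload only, no protocol overhead) suggests
the ~4.7 figure is more likely megabits than megabytes per second. The
variable names are only for illustration.

```shell
#!/bin/sh
# Payload rate implied by 940-byte packets sent 625 times per second.
PACKET_BYTES=940
PACKETS_PER_SEC=625

bytes_per_sec=$((PACKET_BYTES * PACKETS_PER_SEC))
bits_per_sec=$((bytes_per_sec * 8))

echo "bytes/s: $bytes_per_sec"   # 587500, i.e. about 0.59 megabytes per second
echo "bits/s:  $bits_per_sec"    # 4700000, i.e. 4.7 megabits per second
```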

It seems like the burstiness continues, but the dropped packet rate
reduces dramatically if the following are set:
/sbin/sysctl net.core.rmem_max=33554432
/sbin/sysctl net.core.wmem_max=33554432
/sbin/sysctl net.ipv4.tcp_rmem="4096 87380 33554432"
/sbin/sysctl net.ipv4.tcp_wmem="4096 65536 33554432"
/sbin/sysctl net.ipv4.tcp_no_metrics_save=1

which points to issues with the TCP stack.
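
Note that the `/sbin/sysctl key=value` form only changes the running
kernel, so these settings are lost at reboot. Assuming they turn out to be
worth keeping, the same values could go in /etc/sysctl.conf and be loaded
with `/sbin/sysctl -p`. This is a sketch that simply repeats the values
above, not a tuning recommendation:

```
# /etc/sysctl.conf additions (same values as the commands above)
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
net.ipv4.tcp_no_metrics_save = 1
```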

Is NFS over UDP worth trying, or am I going to run into similar
things, except not with TCP retransmit but with NFS UDP retransmit?


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-24 Thread Edward Ned Harvey
> From: tech-boun...@lopsa.org [mailto:tech-boun...@lopsa.org] On Behalf
> Of Patrick Cable
> 
> I have tried playing with the rsize/wsize NFS parameters (8192K seems
> to be the best value), and values in
> /proc/sys/net/core/{r,w}mem_{default,max} and increasing the NFS
> daemon count, as suggested by
> http://nfs.sourceforge.net/nfs-howto/ar01s05.html. Very similar
> results across the board.

I think your biggest factor is going to be sync/async.
If you use sync, then the client waits for acknowledgement of every packet.
Depending on the size of your packet, it may dramatically hurt your
performance.

Try async and see if it helps.



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread cpolish
On Thu, Sep 23, 2010 at 10:34:54PM -0400, Patrick Cable wrote:
> On Thu, Sep 23, 2010 at 10:25 PM, Doug Hughes  wrote:
> > There's always strace..
> 
> There is. I've never straced a java app before, though - which is what
> most of this app is written in. It's also separated into about 20
> smaller applications, so figuring out what does what is a task.
> 
> These are good starting points.

What's going on over the wire?
Something wiresharky: Statistics -> TCP stream graph ->
throughput graph. Might be revealing.
-- 
Charles Polisher



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Andrew Hume

as laurie anderson would say

and when strace is gone, there's always force.
and when force is gone, there's always mom.
(hi mom!)

On Sep 23, 2010, at 10:34 PM, Patrick Cable wrote:

> On Thu, Sep 23, 2010 at 10:25 PM, Doug Hughes  wrote:
> > There's always strace..
>
> There is. I've never straced a java app before, though - which is what
> most of this app is written in. It's also separated into about 20
> smaller applications, so figuring out what does what is a task.
>
> These are good starting points.


--
Andrew Hume  (best -> Telework) +1 732-886-1886
and...@research.att.com  (Work) +1 973-360-8651
AT&T Labs - Research; member of USENIX and LOPSA





Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Daniel Pittman
Patrick Cable  writes:

> I have a device that sends out information at 4.7 Megabytes a second.
> I have a desktop that receives the data from this device that runs Red Hat
> Enterprise Linux 5.5. They are on the same switch, a 24-port Juniper EX2200.
>
> When I write the data to the desktop on the local filesystem, there's no
> dropped information. When I write the data to an NFS share, the device
> reports dropped packets.

[...]

> I find it hard to believe that a machine on the same (recent, gigabit)
> switch can't write out 4.7MB/sec. Am I wrong?

No, it should be fine doing that.  One test that might be illuminating, if
painful, would be to try running 'while sleep 1; do sync; done' while doing
the capture and see if that will smooth things out.

While I would have hoped more modern systems have improved things, it used to
be back in the bad old days when I was working with DV capture on Linux that
the system had plenty of average bandwidth to write the stream, but would
batch work until the bursts were blocking long enough to drop frames.

That hack was my cheap test for figuring out that having the kernel / app
flushing more often, so lowering the peak requirement, would fix things.

If it does, having your application flush the output while writing should help
sort things out.  (Traditionally, fsync from another thread works...)
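
A minimal sketch of running that test only for the duration of a capture,
so the sync loop does not linger afterwards. `with_periodic_sync` is a
made-up helper name, and `./capture` in the usage comment stands in for
whatever actually starts the acquisition.

```shell
#!/bin/sh
# Run a command while a background loop calls sync once a second,
# then stop the loop and pass back the command's exit status.
with_periodic_sync() {
    # Background loop: ask the kernel to flush dirty pages every second.
    ( while sleep 1; do sync; done ) >/dev/null 2>&1 &
    sync_pid=$!

    # Run the wrapped command and remember its exit status.
    status=0
    "$@" || status=$?

    # Stop the loop; ignore errors if it has already gone away.
    kill "$sync_pid" 2>/dev/null || true
    wait "$sync_pid" 2>/dev/null || true
    return "$status"
}

# Usage: with_periodic_sync ./capture
```

If the drops go away with this wrapper in place, that would point at
write batching rather than raw bandwidth.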

Regards,
Daniel
-- 
✣ Daniel Pittman✉ dan...@rimspace.net☎ +61 401 155 707
   ♽ made with 100 percent post-consumer electrons



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Patrick Cable
On Thu, Sep 23, 2010 at 10:25 PM, Doug Hughes  wrote:
> There's always strace..

There is. I've never straced a java app before, though - which is what
most of this app is written in. It's also separated into about 20
smaller applications, so figuring out what does what is a task.

These are good starting points.


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Patrick Cable
On Thu, Sep 23, 2010 at 10:20 PM, Nicholas Tang  wrote:
> A couple of quick suggestions:
>
> 1.) Turn on jumbo frames on the switch and the servers
> 2.) Turn atime off on the receiving (writing) end - the nfs server

These were my next thoughts.

I don't have an aversion to jumbo frames everywhere, other than the
NFS server also runs other services (dns, a very very small LDAP) and
I don't know if there will be a problem with clients expecting a
smaller frame size trying to connect to a machine with a large frame
size?

> Also, so I understand correctly, you've got 3 machines:
>
> - data writer
> - desktop
> - nfs server
>
> The desktop mounts an nfs volume. The data writer sends data to the
> desktop, which writes it to the nfs volume.

Correct.

> If that's correct, then:
>
> 1.) Why not write directly to the nfs server and cut out the middle
> man?  This could be a matter of bad drivers on the desktop or whatever
> else.  You could accomplish this by having the data writing box send
> data straight to the nfs server, or by mounting that nfs mount on the
> data writer and having it write "locally" to that mount.  You could
> still mount that on the desktop for reading/ access/ editing.

Well, I wanted to separate out what machines do what.

What's running on this desktop now is supposed to move to a dedicated
server that writes the data to NFS and also provides live streams out
to any other desktops (there are four) that could ask for it.

But, if the desktop can't do it, I have a problem believing the server
will be any better off. Similar processing power, network ports, etc.

> 2.) If you can't do that, why not write to local disk on the desktop
> and then do an async move/ copy to the nfs volume?

Breaks the architecture. The idea is for all this stuff to be over on
the NFS server because the individual desktops don't have the storage
space to make this work.

> I've always found NFS performance on linux to be lackluster, although
> your numbers seem unusually bad.

This is what I've heard, though the "unusually bad" part seems.. bad.



Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Nicholas Tang
If it's ext3, try playing w/ different journaling settings... that can
have a big impact on write performance depending on the type of writes
coming in.

Nicholas

On Thu, Sep 23, 2010 at 10:23 PM, Patrick Cable  wrote:
> On Thu, Sep 23, 2010 at 9:53 PM, Doug Hughes  wrote:
>> What size chunks is the application writing? How many files? What size
>> files? What is the back-end filesystem behind NFS?
>
> I will try to find this out tomorrow (though what I can say right now
> is that it's an ext3 filesystem).
> I'm not sure how much we'll know because the particular software that
> generates this data is... special.
>
> The documentation for this application mentions the following
> particular issues with NFS:
>  - NFS can't make FIFOs (which is mitigated by the fact we store FIFOs
> on a local partition)
>  - "More important, SOFTWARE uses the mmap() system call on some of
> the files in the output directory, and mmap() isn't reliable on an NFS
> mounted filesystem" (mitigated by.. I'm not sure, I did some googling
> and that seemed to be fixed in later kernel versions, i.e. *not*
> RHEL3).


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Doug Hughes
  On 9/23/2010 10:23 PM, Patrick Cable wrote:
> On Thu, Sep 23, 2010 at 9:53 PM, Doug Hughes  wrote:
>> What size chunks is the application writing? How many files? What size
>> files? What is the back-end filesystem behind NFS?
> I will try to find this out tomorrow (though what I can say right now
> is that it's an ext3 filesystem).
> I'm not sure how much we'll know because the particular software that
> generates this data is... special.
>

There's always strace..


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Patrick Cable
On Thu, Sep 23, 2010 at 9:53 PM, Doug Hughes  wrote:
> What size chunks is the application writing? How many files? What size
> files? What is the back-end filesystem behind NFS?

I will try to find this out tomorrow (though what I can say right now
is that it's an ext3 filesystem).
I'm not sure how much we'll know because the particular software that
generates this data is... special.

The documentation for this application mentions the following
particular issues with NFS:
 - NFS can't make FIFOs (which is mitigated by the fact we store FIFOs
on a local partition)
 - "More important, SOFTWARE uses the mmap() system call on some of
the files in the output directory, and mmap() isn't reliable on an NFS
mounted filesystem" (mitigated by.. I'm not sure, I did some googling
and that seemed to be fixed in later kernel versions, i.e. *not*
RHEL3).


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Nicholas Tang
A couple of quick suggestions:

1.) Turn on jumbo frames on the switch and the servers
2.) Turn atime off on the receiving (writing) end - the nfs server

Also, so I understand correctly, you've got 3 machines:

- data writer
- desktop
- nfs server

The desktop mounts an nfs volume. The data writer sends data to the
desktop, which writes it to the nfs volume.

If that's correct, then:

1.) Why not write directly to the nfs server and cut out the middle
man?  This could be a matter of bad drivers on the desktop or whatever
else.  You could accomplish this by having the data writing box send
data straight to the nfs server, or by mounting that nfs mount on the
data writer and having it write "locally" to that mount.  You could
still mount that on the desktop for reading/ access/ editing.

2.) If you can't do that, why not write to local disk on the desktop
and then do an async move/ copy to the nfs volume?

I've always found NFS performance on linux to be lackluster, although
your numbers seem unusually bad.

Nicholas

On Thu, Sep 23, 2010 at 9:47 PM, Patrick Cable  wrote:
> I have a device that sends out information at 4.7 Megabytes a second.
> I have a desktop that receives the data from this device that runs Red
> Hat Enterprise Linux 5.5. They are on the same switch, a 24-port
> Juniper EX2200.
>
> When I write the data to the desktop on the local filesystem, there's
> no dropped information. When I write the data to an NFS share, the
> device reports dropped packets.
>
> I have tried playing with the rsize/wsize NFS parameters (8192K seems
> to be the best value), and values in
> /proc/sys/net/core/{r,w}mem_{default,max} and increasing the NFS
> daemon count, as suggested by
> http://nfs.sourceforge.net/nfs-howto/ar01s05.html. Very similar
> results across the board.
>
> The NFS server also runs RHEL5.5. It's got 11 600gb 15k SAS drives in
> a hardware RAID6 array. Running 'iftop' on the machine during the data
> gathering operations, I'll see bursty traffic... that is to say,
> workstation -> NFS server traffic will be in the high 40mb/sec rate,
> then slow down, and once it slows down the device I refer to complains
> of dropped information then it'll speed up again.
>
> I find it hard to believe that a machine on the same (recent, gigabit)
> switch can't write out 4.7MB/sec. Am I wrong?
>
> Does anyone have any NFS or TCP tuning recommendations that may be a
> little more up to date than the NFS howto that was last updated in
> 2006? I'm really at a loss here.
>
> Thank you, more than a lot, in advance..
>
> - Pat


Re: [lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Doug Hughes
  On 9/23/2010 9:47 PM, Patrick Cable wrote:
> I have a device that sends out information at 4.7 Megabytes a second.
> I have a desktop that receives the data from this device that runs Red
> Hat Enterprise Linux 5.5. They are on the same switch, a 24-port
> Juniper EX2200.
>
> When I write the data to the desktop on the local filesystem, there's
> no dropped information. When I write the data to an NFS share, the
> device reports dropped packets.
>
> I have tried playing with the rsize/wsize NFS parameters (8192K seems
> to be the best value), and values in
> /proc/sys/net/core/{r,w}mem_{default,max} and increasing the NFS
> daemon count, as suggested by
> http://nfs.sourceforge.net/nfs-howto/ar01s05.html. Very similar
> results across the board.
>
> The NFS server also runs RHEL5.5. It's got 11 600gb 15k SAS drives in
> a hardware RAID6 array. Running 'iftop' on the machine during the data
> gathering operations, I'll see bursty traffic... that is to say,
> workstation ->  NFS server traffic will be in the high 40mb/sec rate,
> then slow down, and once it slows down the device I refer to complains
> of dropped information then it'll speed up again.
>
> I find it hard to believe that a machine on the same (recent, gigabit)
> switch can't write out 4.7MB/sec. Am I wrong?
>
> Does anyone have any NFS or TCP tuning recommendations that may be a
> little more up to date than the NFS howto that was last updated in
> 2006? I'm really at a loss here.
>
> Thank you, more than a lot, in advance..

What size chunks is the application writing? How many files? What size
files? What is the back-end filesystem behind NFS?

Have you tried simulating the same load directly on the server?



[lopsa-tech] The FPGA and the NFS mount: A tale of bursty writes

2010-09-23 Thread Patrick Cable
I have a device that sends out information at 4.7 Megabytes a second.
I have a desktop that receives the data from this device that runs Red
Hat Enterprise Linux 5.5. They are on the same switch, a 24-port
Juniper EX2200.

When I write the data to the desktop on the local filesystem, there's
no dropped information. When I write the data to an NFS share, the
device reports dropped packets.

I have tried playing with the rsize/wsize NFS parameters (8192K seems
to be the best value), and values in
/proc/sys/net/core/{r,w}mem_{default,max} and increasing the NFS
daemon count, as suggested by
http://nfs.sourceforge.net/nfs-howto/ar01s05.html. Very similar
results across the board.

The NFS server also runs RHEL5.5. It's got 11 600gb 15k SAS drives in
a hardware RAID6 array. Running 'iftop' on the machine during the data
gathering operations, I'll see bursty traffic... that is to say,
workstation -> NFS server traffic will be in the high 40mb/sec rate,
then slow down, and once it slows down the device I refer to complains
of dropped information then it'll speed up again.

I find it hard to believe that a machine on the same (recent, gigabit)
switch can't write out 4.7MB/sec. Am I wrong?

Does anyone have any NFS or TCP tuning recommendations that may be a
little more up to date than the NFS howto that was last updated in
2006? I'm really at a loss here.

Thank you, more than a lot, in advance..

- Pat