Re: [Lustre-discuss] How to achieve 20GB/s file system throughput?
On 07/23/2010 10:25 PM, henry...@dell.com wrote:
> Hello,
>
> One of my customer want to set up HPC with thousands of compute nodes.
> The parallel file system should have 20GB/s throughput. I am not sure
> whether lustre can make it. How many IO nodes needed to achieve this
> target?

I hate to say "it depends," but it does in fact depend on many things.
What type of IO is the customer doing: large-block sequential spread out
over many nodes (parallel IO), small-block random, or a mixture?

It is possible to achieve 20GB/s, and quite a bit more, using Lustre.
Whether that 20GB/s is meaningful to their code(s) is a different
question. It would be 20GB/s in aggregate, over possibly many compute
nodes doing IO.

> My assumption is 100 or more IO nodes(rack servers) are needed.

Hmmm ... if you can sustain 500+ MB/s per OST, you would need about 40
OSTs, and each OSS can handle several OSTs. There are efficiency losses
you should be aware of, but 20GB/s, by some agreed-upon mechanism of
measuring it, should be possible with a realistic number of units.
Don't forget to count efficiency losses in the design.

100 IO nodes ... I presume you mean OSSes? If your units are slower,
then yes, you will need more of them to reach this performance. You
would also need a well-designed and correctly functioning Infiniband
fabric in addition to addressing the other issues. We've found that
Lustre is ... very sensitive ... to the Infiniband implementation.

Regards,

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
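[Editor's note: the back-of-envelope sizing in Joe's reply can be sketched as a quick calculation. This is only a sketch; the 500 MB/s-per-OST figure comes from the thread, while the 0.8 efficiency derate and the 3-OSTs-per-OSS layout are illustrative assumptions, not measured numbers.]

```python
import math

def size_lustre(target_gbps, per_ost_mbps, osts_per_oss, efficiency=0.8):
    """Rough OST/OSS counts for a target aggregate throughput.

    `efficiency` derates the raw per-OST number to account for the
    losses Joe warns about; 0.8 is an assumed figure, not a measurement.
    """
    effective_mbps = per_ost_mbps * efficiency
    osts = math.ceil(target_gbps * 1000 / effective_mbps)
    osses = math.ceil(osts / osts_per_oss)
    return osts, osses

# 20 GB/s target, 500 MB/s per OST, 3 OSTs per OSS:
print(size_lustre(20, 500, 3))       # (50, 17) with the 0.8 derate
print(size_lustre(20, 500, 3, 1.0))  # (40, 14) at raw per-OST speed
```

Note how the efficiency derate alone moves the design from 40 to 50 OSTs, which is why Joe repeats "count efficiency losses in the design."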
Re: [Lustre-discuss] How to achieve 20GB/s file system throughput?
On Saturday, July 24, 2010, henry...@dell.com wrote:
> Hello,
>
> One of my customer want to set up HPC with thousands of compute nodes.
> The parallel file system should have 20GB/s throughput. I am not sure
> whether lustre can make it. How many IO nodes needed to achieve this
> target?
>
> My assumption is 100 or more IO nodes(rack servers) are needed.

I'm a bit prejudiced, of course, but with DDN storage that would be
quite simple. With the older DDN S2A 9900 you can get 5GB/s per
controller pair, and with the newer SFA1 you can get 6.5 to 7GB/s per
controller pair (we are still tuning it). Each controller pair (a
"couplet" in DDN terms) usually has 4 servers connected and fits into a
single rack in a 300-drive configuration. So you can get 20GB/s with 3
or 4 racks and 12 or 16 OSS servers, which is well below your 100 IO
nodes ;)

Cheers,
Bernd

--
Bernd Schubert
DataDirect Networks
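[Editor's note: Bernd's rack count works out as follows. A sketch only; the per-couplet rates are the figures quoted in his post, and the 4-servers-per-couplet, one-rack-per-couplet layout is taken from his description.]

```python
import math

def couplet_sizing(target_gbps, per_couplet_gbps, servers_per_couplet=4):
    """Couplets (one rack each) and OSS servers needed for a target
    aggregate rate, using the per-couplet figures quoted in the post."""
    couplets = math.ceil(target_gbps / per_couplet_gbps)
    return couplets, couplets * servers_per_couplet

print(couplet_sizing(20, 5.0))  # (4, 16): older S2A couplets
print(couplet_sizing(20, 7.0))  # (3, 12): tuned SFA couplets
```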
Re: [Lustre-discuss] How to achieve 20GB/s file system throughput?
Maybe check out http://www.terascala.com. They provide a Lustre
appliance, and claim the RTS1000 delivers 20TB per enclosure with
2GB/s of throughput.

regards

On Fri, Jul 23, 2010 at 10:25 PM, wrote:
> Hello,
>
> One of my customer want to set up HPC with thousands of compute nodes. The
> parallel file system should have 20GB/s throughput. I am not sure whether
> lustre can make it. How many IO nodes needed to achieve this target?
>
> My assumption is 100 or more IO nodes(rack servers) are needed.
>
> Thanks in advance!
>
> Henry Xu,
> System Consultant

--
Hung-Sheng Tsao, Ph.D.
laot...@gmail.com
http://laotsao.wordpress.com
9734950840
Re: [Lustre-discuss] How to achieve 20GB/s file system throughput?
Hate to reply to myself ... this is not an advertisement.

On 07/23/2010 10:50 PM, Joe Landman wrote:
> On 07/23/2010 10:25 PM, henry...@dell.com wrote:
[...]
> It is possible to achieve 20GB/s, and quite a bit more, using Lustre.
> As to whether or not that 20GB/s is meaningful to their code(s), thats a
> different question. It would be 20GB/s in aggregate, over possibly many
> compute nodes doing IO.

I should point out that we have customers with 20GB/s maximum
theoretical configs (best-case scenarios) built on our siCluster
(http://scalableinformatics.com/sicluster), using 8 IO units. Their
write patterns and Infiniband configurations don't seem to allow
achieving this in practice; simple benchmark tests (mixtures of LLNL
mpi-io, io-bm, iozone, ...) show sustained results north of 12GB/s for
them.

Again, to set expectations: most users' codes never utilize storage
systems very effectively, so you might design a 20GB/s storage system
and find the IO actually being done doesn't get much above 500 MB/s for
single threads.

>> My assumption is 100 or more IO nodes(rack servers) are needed.
>
> Hmmm ... If you can achieve 500+ MB/s per OST, then you would need about
> 40 OSTs. You can have each OSS handle several OSTs. There are
> efficiency losses you should be aware of, but 20GB/s using some
> mechanism to measure this, should be possible with a realistic number of
> units. Don't forget to count efficiency losses in the design.

We do this in 8 machines (theoretical max performance), and could put
it all in a single rack. We prefer to break it out among more IO nodes,
say 16-24 smaller nodes, with 2-3 OSTs per OSS (i.e. per IO node).

My point is to make sure your customer understands the efficiency
issues: simple Fortran writes from a single thread aren't going to run
at 20GB/s. That is, much like a compute cluster, a storage cluster has
an aggregate bandwidth that a single node or single reader/writer
cannot achieve on its own.
Regards,

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
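[Editor's note: Joe's aggregate-vs-single-writer point can be illustrated with one more quick calculation. A sketch only; the 500 MB/s single-stream figure and the 12 GB/s sustained figure are the ones quoted in his messages.]

```python
import math

def streams_to_saturate(aggregate_gbps, per_stream_mbps):
    """How many concurrent writers a storage cluster needs before its
    aggregate bandwidth is actually reachable by the workload."""
    return math.ceil(aggregate_gbps * 1000 / per_stream_mbps)

# A 20 GB/s design with single-thread writes at ~500 MB/s needs on
# the order of 40 concurrent writers before the design rate matters:
print(streams_to_saturate(20, 500))  # 40
# Even at the 12 GB/s the benchmarks sustained, roughly 24 writers:
print(streams_to_saturate(12, 500))  # 24
```

This is the quantitative version of "a single reader/writer cannot achieve the aggregate bandwidth on its own": a lightly parallel workload simply cannot tell a 12 GB/s system from a 20 GB/s one.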