Re: [Gluster-users] What NAS device(s) do you use? And why?
On Dec 13, 2010, at 11:28 AM, Rudi Ahlers wrote:

> On Mon, Dec 13, 2010 at 2:52 AM, Marc Villemade wrote:
>> On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
>>> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
>>>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>>>> [..]
>>> [...]
>>
>> At Scality, we have developed such an object store, which scales smoothly up
>> to petabytes with off-the-shelf servers logically brought together in a ring.
>> While other solutions' performance usually degrades with time, our
>> performance is similar to a high-end SAN from the start and stays roughly
>> the same as we scale up to petabytes.
>
> [...]
>
> Thanks. It's the first time I hear of the term "Object Storage". In all
> honesty, from a technical viewpoint, how does this differ from NAS /
> SANs?

Hey Rudi,

[Once again, disclaimer: I work for Scality, an object store platform developer]

Sure, let me try and explain.

I guess the core difference between object storage and NAS/SAN is that there is no filesystem involved (whether we're talking about a server-side (NAS) or client-side (SAN) managed filesystem). This means that none of the limitations inherent to filesystems apply: number of inodes, number of files in a directory, etc.

In a nutshell, object storage is a system where stored data is referenced by a key that is assigned to the object at creation and used for subsequent retrievals. There are no folders or file paths in its core concept. Objects are usually replicated to ensure reliability and availability, with metadata attached to the objects for many uses (replication and retention policy, tiering, keyword tagging ...).

Object storage is sometimes referred to as cloud storage as well. It is true that the cloud storage services (à la Amazon, Rackspace in the US or Dunkel/ScaleUp in Europe) basically store objects, but the difference is that the underlying storage is not necessarily "object".
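To make the "key instead of path" idea concrete, here is a minimal sketch. It assumes a plain in-memory dictionary standing in for the storage nodes, and the names `FlatObjectStore`, `put`, `get` and `metadata` are hypothetical, chosen for illustration only. It is not the API of Scality or of any other product.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    payload: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory object store: one flat key space, no directories."""

    def __init__(self):
        self._objects = {}  # key -> StoredObject

    def put(self, payload, metadata=None):
        # The key is assigned when the object is created and handed back
        # to the caller, who uses it for every later retrieval.
        key = uuid.uuid4().hex
        self._objects[key] = StoredObject(payload, dict(metadata or {}))
        return key

    def get(self, key):
        return self._objects[key].payload

    def metadata(self, key):
        return self._objects[key].metadata


store = FlatObjectStore()
# The metadata fields below are made up; a real system defines its own.
key = store.put(b"mail message body", {"replicas": 3, "tags": ["email"]})
print(key, store.metadata(key))
```

The caller never names a folder or a path: the returned key is the only handle on the object, and policy information such as a replica count travels with the object as metadata.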
Object storage is also somewhat closely related to CAS (content-addressable storage), which is mostly used for fixed-content storage, so it is very popular for archival and for storage needing high levels of compliance with government regulations. Objects in CAS are addressed through a hash of their payload (hence the name), which makes it hard to have modifiable content, as the address would change with each modification.

For unstructured data, the fastest-growing kind of data in the world right now, object storage is perfect as it maps really easily onto these datasets' needs:
- "unstructured" storage (objects are not necessarily linked to each other, although they can be through metadata tagging),
- when correctly implemented, object storage should be a much more scalable system than regular filesystems, so for exploding datasets it makes much more sense.

Now, why should it be a much more scalable system, you might ask? :-D

Without considering the economic aspect, object storage technologies are devoid of volume management, as the store should be a flat addressable space with virtually no limits, and without the filesystem limits, growing to billions of objects is possible (whereas storing billions of objects on a NAS/SAN without losing performance might prove difficult). It also depends on the technology: some have an object location database that creates a bottleneck and lowers the scalability and reliability of the system.

Then, there is the economic aspect. Depending on the technology, off-the-shelf servers and disks can be used, which makes it easy to set up a new service in a competitive market, or to move a large existing dataset to object storage without investing millions and millions of $$.

Object storage is perfect for unstructured data and for other applications (email, backup, media, archiving ...). It is not a good fit for relational databases, for example. But, as I said earlier, the fastest-growing datasets these days are in the unstructured data realm.
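The point about CAS addressing can be illustrated with a short sketch. SHA-256 is an assumption here (the message only says "a hash of their payload"), and `ContentAddressedStore` is a hypothetical name for a toy in-memory store, not a real product's interface.

```python
import hashlib


def content_address(payload: bytes) -> str:
    # Assumption: SHA-256 as the addressing hash; real CAS products
    # choose their own hash function.
    return hashlib.sha256(payload).hexdigest()


class ContentAddressedStore:
    """Toy CAS: the address of a blob is derived from its bytes."""

    def __init__(self):
        self._blobs = {}  # address -> payload

    def put(self, payload: bytes) -> str:
        address = content_address(payload)
        self._blobs[address] = payload
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


cas = ContentAddressedStore()
v1 = cas.put(b"contract, version 1")
v2 = cas.put(b"contract, version 2")
# An edited payload lands at a different address, so "modifying in place"
# really means storing a new object and updating whoever held the old address.
print(v1 != v2)
# Storing identical bytes again yields the same address.
print(cas.put(b"contract, version 1") == v1)
```

This is why CAS suits fixed content: the address doubles as an integrity check, but any reference to a document has to be re-pointed whenever the document changes.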
Depending on the type of data, access patterns and applications one needs to use, object storage is usually the way to go to control costs while having a reliable and durable storage environment when hitting hundreds of terabytes and more.

Over here at Scality, our object storage platform has all these characteristics (no volume management, elastic growth, no central database ...). Our key differentiation from other people in the space is that we bring roughly the same performance as SAN/NAS systems along with all the object storage advantages. And then some ...

If you want more information, let me know ;)

There, I hope this helps you understand a bit more about object storage. Sorry I carried on so much ;)

Happy holidays everyone!

Cheers
-Marc Villemade
http://j.mp/e1pjfo

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
Re: [Gluster-users] What NAS device(s) do you use? And why?
On Mon, Dec 13, 2010 at 2:52 AM, Marc Villemade wrote:
> On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
>> [...]
>
> [Disclaimer: I work for Scality]
>
> [...]
>
> At Scality, we have developed such an object store, which scales smoothly up
> to petabytes with off-the-shelf servers logically brought together in a ring.
> While other solutions' performance usually degrades with time, our performance
> is similar to a high-end SAN from the start and stays roughly the same as we
> scale up to petabytes.
>
> [...]

Thanks. It's the first time I hear of the term "Object Storage". In all honesty, from a technical viewpoint, how does this differ from NAS / SANs?

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532
Re: [Gluster-users] What NAS device(s) do you use? And why?
On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>> [..]
>>
>> --
>
> True, I fully agree with you on that point. I myself don't like vendor
> lock-ins. But I do like the simplicity that off-the-shelf NAS devices
> offer, as opposed to a DIY one. And they often offer more tools than a
> normal Linux box with iSCSI as well.

Hi all,

[Disclaimer: I work for Scality]

Rudi, first and foremost, I think you have to pinpoint the main characteristic you want to improve in your system. It sounds like it is scalability. Beyond NAS, there is an array (no pun intended) of choices for you to consider (scale-out NAS, SANs, clustered file systems, dispersed storage ...). But to figure out the best system for you, one needs to take a hard look at the kind of data you're storing and at the access patterns and requirements for it.

For large systems (meaning > 100 TB and/or lots of files, of whichever size), getting rid of the filesystem layer is sometimes very efficient. Hence, one should think about object storage as a solution. Object stores bring high reliability and durability (through self-replication and self-healing) with cost-efficiency, using commodity hardware. It is the technology used by lots of public cloud vendors to offer their service cheaply and still be profitable. Those guys inherently need to be able to scale almost infinitely (look at Amazon or Rackspace).

If you are dealing with unstructured data, as opposed to a relational database for example, and if there are millions and millions of objects that you need to access quickly, object storage might be for you.

Rudi, what do you actually store? For what kind of service is the storage layer used? Are you storing emails/backups or hosting your employees' file sharing service?
Basically, if you're storing any kind of data whose size you know is likely to grow massively and you want to be able to:
1 - allow it
2 - afford it
3 - operate it at the lowest possible cost
then I would strongly suggest you look into object storage technology (there are a couple of open-source options as well as vendor solutions).

At Scality, we have developed such an object store, which scales smoothly up to petabytes with off-the-shelf servers logically brought together in a ring. While other solutions' performance usually degrades with time, our performance is similar to a high-end SAN from the start and stays roughly the same as we scale up to petabytes.

We gracefully scale in capacity and performance by just adding nodes to the system (without service interruption), so we're never limited by a box design (either in maximum number of drives or in network capacity).

If you feel like you could use object storage to store your data, please have a look at our technology and get in touch - http://bit.ly/fY6eMm

Happy Holidays!

-Marc Villemade
http://linkd.in/heve30
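The message does not say how the ring works internally, so the following is only a generic consistent-hashing sketch of how servers "brought together in a ring" can locate objects without a central database. It should not be read as a description of Scality's design. The class name `HashRing`, the node names, the use of MD5 for placement and the 64 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    # Map any string to a point on the ring. MD5 is used only for
    # placement here, not for security (an illustrative choice).
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Generic consistent-hash ring with virtual points per node."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _position(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key: str) -> str:
        if not self._points:
            raise ValueError("ring has no nodes")
        # Owner is the first ring point clockwise from the key, wrapping around.
        index = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add_node("node-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

In this scheme any client can compute an object's location from its key alone, and adding a node only reassigns keys to the new node; keys never shuffle between the existing nodes. That property is what makes growing the system without a service interruption plausible.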
Re: [Gluster-users] What NAS device(s) do you use? And why?
On 11/12/2010 16:17, Rudi Ahlers wrote:
> If you use any NAS (or SAN) devices, what do you use? And I'm referring
> more to larger-scale network storage than your home PC or home theater
> system.
>
> We've had very good experiences with our NetGear ReadyNAS devices but
> I'm in the market for something new. The NetGears aren't the cheapest
> ones around but they do what it says on the box. My only real gripe
> with them is the lack of decent scalability.
>
> TheCus devices seem to be rather powerful as well, and you can stack
> up to 5 units together. But that's where the line stops.

You said no HTPC systems and then listed a couple?

I would have thought that at the 100TB level you would want to have the experience to manage the machine in house anyway? You want to be 100% comfortable that when that machine goes down you can rescue it...

So I would suggest a Norco or Supermicro case - these go up to 30-36 drives per physical box. Then choose your favourite distro and get super comfortable with the ins and outs of LVM, Linux RAID and iSCSI. Break it, fix it, break it...

There is a growing amount of support for RAID6 as being far more "reliable" than RAID10 for a given set of parameters (and a given definition of "reliable"). RAID10 is capable of far more IOPS though, so pick your poison... I definitely buy the double parity argument though, so try and gain it somehow... (The issue in practice seems to be that the first redundant drive feels like "protection", but once it has failed it's ever so easy to have some kind of tiny error during recovery, e.g. an unscrubbed array, unplugging the wrong drive, gremlins, a second drive failure, etc.)

I think you can buy a well-supported Supermicro box, with support from a well-supported enterprise distro, and still spend less than on a mid-spec NAS at the level you are aiming at?
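The RAID6 versus RAID10 trade-off above can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic only: it assumes two-way mirrors for RAID10, a single RAID6 group across all drives (very wide for 36 drives; real builds usually split into several groups), and a hypothetical 2 TB drive size. It ignores rebuild times and unrecoverable read errors, which matter a lot in practice.

```python
def raid10_usable(drives: int, drive_tb: float) -> float:
    """RAID10 with two-way mirrors: half the raw capacity is usable."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    return drives / 2 * drive_tb


def raid6_usable(drives: int, drive_tb: float) -> float:
    """Single RAID6 group: two drives' worth of capacity goes to parity."""
    if drives < 4:
        raise ValueError("RAID6 needs at least 4 drives")
    return (drives - 2) * drive_tb


def raid10_second_failure_loss_probability(drives: int) -> float:
    """Chance that a second, uniformly random drive failure hits the
    mirror partner of the drive that has already failed."""
    return 1 / (drives - 1)


drives, drive_tb = 36, 2.0  # hypothetical chassis and drive size
print("RAID10 usable TB:", raid10_usable(drives, drive_tb))
print("RAID6 usable TB:", raid6_usable(drives, drive_tb))
print("RAID10 loss chance on 2nd failure:",
      raid10_second_failure_loss_probability(drives))
```

RAID6 survives any two drive failures, whereas RAID10 with one drive already down loses data if the next failure happens to be that drive's mirror partner. That is the "double parity argument" in a nutshell; the price is slower writes and longer rebuilds.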
However, I would 100% concede that above the level of NAS boxes using off-the-shelf Linux software there is a potentially large performance gap, e.g. a NetApp box should blow away your Linux box (caveat - I don't own a NetApp box...)

Remember also that at this kind of storage level you need to be really sure what your goals are. It's not so hard to get 100TB in a single chassis, but getting it "reliable" and "fast" (choose your own definition) is a tradeoff and much harder.

Good luck - I love hearing about these larger projects, please send some feedback on your choices?

Ed W
Re: [Gluster-users] What NAS device(s) do you use? And why?
On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>
> [...]
>
>> I'm now looking for something that could scale beyond 100TB on one
>> device (not necessarily one unit though) and find it frustrating that
>> most NASes come in 1U or 2U at most.
>
> Not meant as a commercial. See http://scalableinformatics.com/sicluster
>
> Scales through many PB. 100TB isn't an issue, we have many in the field of
> this size or larger (and smaller). Runs Gluster 3.x (3.1.x recommended).
>
>> Maybe I'm just not shopping around enough, or maybe I prefer
>> well-known brands, I don't know.
>
> The "well known" brands have well known limitations. You have to decide if
> brand name is more important than meeting the specific needs.
>
> --

True, I fully agree with you on that point. I myself don't like vendor lock-ins. But I do like the simplicity that off-the-shelf NAS devices offer, as opposed to a DIY one. And they often offer more tools than a normal Linux box with iSCSI as well.

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532
Re: [Gluster-users] What NAS device(s) do you use? And why?
On 12/11/2010 11:17 AM, Rudi Ahlers wrote:

[...]

> I'm now looking for something that could scale beyond 100TB on one
> device (not necessarily one unit though) and find it frustrating that
> most NASes come in 1U or 2U at most.

Not meant as a commercial. See http://scalableinformatics.com/sicluster

Scales through many PB. 100TB isn't an issue, we have many in the field of this size or larger (and smaller). Runs Gluster 3.x (3.1.x recommended).

> Maybe I'm just not shopping around enough, or maybe I prefer
> well-known brands, I don't know.

The "well known" brands have well known limitations. You have to decide if brand name is more important than meeting the specific needs.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
[Gluster-users] What NAS device(s) do you use? And why?
If you use any NAS (or SAN) devices, what do you use? And I'm referring more to larger-scale network storage than your home PC or home theater system.

We've had very good experiences with our NetGear ReadyNAS devices but I'm in the market for something new. The NetGears aren't the cheapest ones around but they do what it says on the box. My only real gripe with them is the lack of decent scalability.

TheCus devices seem to be rather powerful as well, and you can stack up to 5 units together. But that's where the line stops.

I'm now looking for something that could scale beyond 100TB on one device (not necessarily one unit though) and find it frustrating that most NASes come in 1U or 2U at most.

Maybe I'm just not shopping around enough, or maybe I prefer well-known brands, I don't know.

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532