Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-15 Thread Marc Villemade


Sent from my iPad

On Dec 13, 2010, at 11:28 AM, Rudi Ahlers wrote:

> On Mon, Dec 13, 2010 at 2:52 AM, Marc Villemade wrote:
>> 
>> On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
>> 
>>> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
>>>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>>>>
>>>> [..]
>>>>
>>>> --
>>> 
>>> [...]
>> 
>> At Scality, we have developed such an object store which scales smoothly up
>> to petabytes with off-the-shelf servers logically brought together in a ring.
>> While other solutions' performance usually degrades with time, our
>> performance is similar to a high-end SAN from the start and stays roughly
>> the same as we scale up to petabytes.
>> 
> [...]
> 
> Thanx, it's the first time I hear of the term "Object Storage". In all
> honesty, from a technical viewpoint, how does this differ from NAS /
> SANs?
> 

Hey Rudi,

[Once again, disclaimer: I work for Scality, an object store platform developer]

Sure, let me try and explain.

I guess the core difference between object storage and NAS/SAN is that there is
no filesystem involved, whether we're talking about a server-side (NAS) or
client-side (SAN) managed filesystem. This means that none of the limitations
inherent to filesystems apply: number of inodes, number of files in a
directory, etc.

In a nutshell, object storage is a system where stored data is referenced by a
key that is assigned to the object at creation and used for subsequent
retrievals. There are no folders or file paths in the core concept.
Objects are usually replicated to ensure reliability and availability, and
metadata attached to the objects serves many uses (replication and retention
policy, tiering, keyword tagging ...).
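
To make the idea concrete, here is a minimal in-memory sketch in Python. It is
only an illustration of the concept and not how any real product works: the
names FlatObjectStore, StoredObject, put, get and metadata are made up for the
example, and replication is not modelled at all.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    payload: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy object store (hypothetical): one flat key space, no folders or paths."""

    def __init__(self):
        self._objects = {}

    def put(self, payload, **metadata):
        # The key is assigned by the store when the object is created.
        key = uuid.uuid4().hex
        self._objects[key] = StoredObject(payload, dict(metadata))
        return key

    def get(self, key):
        # Retrieval needs nothing but the key.
        return self._objects[key].payload

    def metadata(self, key):
        return self._objects[key].metadata


store = FlatObjectStore()
key = store.put(b"hello", replicas=3, tags=["mail"])
print(key, store.get(key), store.metadata(key))
```

The caller keeps the returned key; there is no directory to list and no path
to resolve, only a lookup by key.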

Object storage is sometimes referred to as cloud storage as well. It is true
that cloud storage services (a la Amazon, Rackspace in the US or
Dunkel/ScaleUp in Europe) basically store objects, but the underlying storage
behind such a service is not necessarily "object".
Object storage is also somewhat closely related to CAS (content-addressable
storage), which is mostly used for fixed-content storage and is therefore very
popular for archival and for storage that needs a high level of compliance with
government regulations. Objects in CAS are addressed through a hash of their
payload (hence the name), which makes modifiable content hard to support, as
the address would change with each modification.
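
The CAS addressing problem can be sketched in a few lines of Python. This is
an assumption-laden toy: SHA-256 is picked only for illustration (real CAS
products choose their own hash), and ContentAddressedStore is a made-up name.

```python
import hashlib


class ContentAddressedStore:
    """Toy CAS (hypothetical): the address is a hash of the payload."""

    def __init__(self):
        self._blobs = {}

    def put(self, payload: bytes) -> str:
        # SHA-256 is an assumption for this sketch.
        address = hashlib.sha256(payload).hexdigest()
        self._blobs[address] = payload
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


cas = ContentAddressedStore()
a1 = cas.put(b"contract v1")
a2 = cas.put(b"contract v2")   # "modifying" the content yields a new address
print(a1 != a2)                # the old address still points at the old bytes
```

Storing the same bytes twice gives the same address, which suits fixed
content, while any edit produces a different address that callers must learn.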

For unstructured data, the fastest-growing kind of data in the world right now,
object storage is perfect as it maps really easily onto these datasets'
needs:
- "unstructured" storage (objects are not necessarily linked to each
other, although they can be through metadata tagging),
- when correctly implemented, object storage should be a much more
scalable system than regular filesystems, so for exploding datasets it makes
much more sense.

Now, why should it be a much more scalable system, you might ask ? :-D

Without considering the economic aspect, object storage technologies are devoid
of volume management, as the store should be a flat addressable space with
virtually no limits. Without the filesystem limits, growing to billions of
objects is possible (whereas storing billions of objects on a NAS/SAN without
losing performance might prove difficult). It also depends on the technology:
some have an object location database that creates a bottleneck and lowers the
scalability and reliability of the system.
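
One well-known way to avoid a central location database is to compute an
object's location from its key, for example with consistent hashing on a ring.
The Python sketch below shows that generic technique only; it is not a
description of Scality's or anyone else's actual implementation, and HashRing,
the node names and the vnodes parameter are all made up for the example.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a point on the ring (here: a 256-bit integer)."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    """Generic consistent-hash ring (illustrative, not a vendor algorithm)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node gets several points so load spreads more evenly.
        for i in range(self._vnodes):
            point = _position(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def locate(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.locate(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.locate(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved after adding node-d")
```

Any client can call locate() on its own, so no lookup service sits in the data
path, and adding a node only moves the keys that the new node takes over.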

Then, there is the economic aspect. Depending on the technology, off-the-shelf
servers and disks can be used, which makes it easy to set up a new service in a
competitive market, or to move to object storage with a large existing dataset
without investing millions and millions of $$.

Object storage is perfect for unstructured data and for other applications
(email, backup, media, archiving..). It is not a good fit for relational
databases, for example. But, as I said earlier, the fastest-growing datasets
these days are in the unstructured data realm.

Depending on the type of data, access patterns and applications one needs to
use, object storage is usually the way to go to control costs while having a
reliable and durable storage environment when hitting hundreds of terabytes and
more.

Over here at Scality, our object storage platform has all these characteristics
(no volume management, elastic growth, no central database ..). Our key
differentiator from others in the space is that we bring roughly the same
performance as SAN/NAS systems together with all the object storage advantages.
And then some .. If you want more information, let me know ;)

There, I hope this helps you understand a bit more about object storage. Sorry
I carried on so much ;)

Happy holidays everyone !

Cheers

-Marc Villemade
http://j.mp/e1pjfo
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-13 Thread Rudi Ahlers
On Mon, Dec 13, 2010 at 2:52 AM, Marc Villemade wrote:
>
> On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:
>
>> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
>>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>
>>> [..]
>>>
>>> --
>>
>>
>> True, I fully agree with you on that point. I myself don't like vendor
>> lock-ins. But, I do like the simplicity that off-the-shelf NAS devices
>> offer, as opposed to a DIY one. And they often offer more tools than a
>> normal Linux box with iSCSI as well.
>>
>
> hi all,
>
> [Disclaimer: I work for Scality]
>
> Rudi, first and foremost, I think you have to pinpoint the main
> characteristic you want to improve on your system. It sounds like it is
> scalability. Beyond NAS, there is an array (no pun intended) of choices for
> you to consider (scale-out NAS, SANs, clustered file system, dispersed
> storage ...).
> But to figure out the best system for you, one needs to take a hard look at 
> the kind of data you're storing and at the access patterns and requirements 
> for those.
>
> For large systems (meaning > 100 TB and/or lots of files (of whichever
> size)), getting rid of the filesystem layer is sometimes very efficient.
> Hence, one should think about object storage as a solution. Object stores
> bring high reliability and durability (through self-replication and
> self-healing) together with cost-efficiency, using commodity hardware. It is
> the technology used by lots of public cloud vendors to offer their service
> for cheap and still be profitable. Those guys inherently need to be able to
> scale almost infinitely (look at Amazon or Rackspace).
>
> If you are dealing with unstructured data as opposed to a relational database 
> for example; if there are millions and millions of objects that you need to 
> access quickly, object storage might be for you.
>
> Rudi, what do you actually store? For what kind of service is the storage
> layer used? Are you storing emails/backups or hosting your employees' file
> sharing service? Basically, if you're storing any kind of data whose size
> you know is likely to grow massively and you want to be able to:
>        1 - allow it
>        2 - afford it
>        3 - operate it at the lowest possible cost
> then I would strongly suggest you look into object storage technology
> (there are a couple of open-source options as well as vendor solutions).
>
> At Scality, we have developed such an object store which scales smoothly up
> to petabytes with off-the-shelf servers logically brought together in a ring.
> While other solutions' performance usually degrades with time, our performance
> is similar to a high-end SAN from the start and stays roughly the same as we
> scale up to petabytes.
>
> We gracefully scale in capacity and performance by just adding nodes to the
> system (without service interruption), so we're never limited by a box design
> (either in maximum number of drives or by network capacity).
>
> If you feel like you could use object storage to store your data, please have 
> a look at our technology and get in touch - http://bit.ly/fY6eMm
>
> Happy Holidays !
>
> -Marc Villemade
> http://linkd.in/heve30
>
>




Thanx, it's the first time I hear of the term "Object Storage". In all
honesty, from a technical viewpoint, how does this differ from NAS /
SANs?

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532


Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-12 Thread Marc Villemade

On Dec 11, 2010, at 5:34 PM, Rudi Ahlers wrote:

> On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
>> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:

>> [..]
>> 
>> --
> 
> 
> True, I fully agree with you on that point. I myself don't like vendor
> lock-ins. But, I do like the simplicity that off-the-shelf NAS devices
> offer, as opposed to a DIY one. And they often offer more tools than a
> normal Linux box with iSCSI as well.
> 

hi all,

[Disclaimer: I work for Scality]

Rudi, first and foremost, I think you have to pinpoint the main characteristic
you want to improve on your system. It sounds like it is scalability. Beyond
NAS, there is an array (no pun intended) of choices for you to consider
(scale-out NAS, SANs, clustered file system, dispersed storage ...).
But to figure out the best system for you, one needs to take a hard look at the
kind of data you're storing and at the access patterns and requirements for
those.

For large systems (meaning > 100 TB and/or lots of files (of whichever size)),
getting rid of the filesystem layer is sometimes very efficient. Hence, one
should think about object storage as a solution. Object stores bring high
reliability and durability (through self-replication and self-healing)
together with cost-efficiency, using commodity hardware. It is the technology
used by lots of public cloud vendors to offer their service for cheap and
still be profitable. Those guys inherently need to be able to scale almost
infinitely (look at Amazon or Rackspace).

If you are dealing with unstructured data as opposed to a relational database 
for example; if there are millions and millions of objects that you need to 
access quickly, object storage might be for you.

Rudi, what do you actually store? For what kind of service is the storage
layer used? Are you storing emails/backups or hosting your employees' file
sharing service? Basically, if you're storing any kind of data whose size you
know is likely to grow massively and you want to be able to:
1 - allow it
2 - afford it
3 - operate it at the lowest possible cost
then I would strongly suggest you look into object storage technology (there
are a couple of open-source options as well as vendor solutions).

At Scality, we have developed such an object store which scales smoothly up to
petabytes with off-the-shelf servers logically brought together in a ring.
While other solutions' performance usually degrades with time, our performance
is similar to a high-end SAN from the start and stays roughly the same as we
scale up to petabytes.

We gracefully scale in capacity and performance by just adding nodes to the
system (without service interruption), so we're never limited by a box design
(either in maximum number of drives or by network capacity).

If you feel like you could use object storage to store your data, please have a 
look at our technology and get in touch - http://bit.ly/fY6eMm

Happy Holidays !

-Marc Villemade
http://linkd.in/heve30






Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-12 Thread Ed W

On 11/12/2010 16:17, Rudi Ahlers wrote:

> If you use any NAS (or a SAN) devices, what do you use? And I'm
> referring more to larger scale network storage than your home PC or
> home theater system.
>
> We've had very good experiences with our NetGear ReadyNAS devices but
> I'm in the market for something new. The NetGears aren't the cheapest
> ones around but they do what it says on the box. My only real gripe
> with them is the lack of decent scalability.
>
> TheCus devices seem to be rather powerful as well, and you can stack
> up to 5 units together. But that's where the line stops.


You said no HTPC systems and then listed a couple?

I would have thought at the 100TB level you would want to have the 
experience to manage the machine in house anyway?  You want to be 100% 
comfortable that when that machine goes down you can rescue it...


So I would suggest a Norco or Supermicro case - these go up to 30-36
drives per physical box.  Then choose your favourite distro and get
super comfortable with the ins and outs of LVM, Linux RAID and iSCSI.
Break it, fix it, break it, ...


There is a growing amount of support for RAID6 as being far more
"reliable" than RAID10 for a given set of parameters (and given
definition of "reliable").  RAID10 is capable of far more IOPS though,
so pick your poison...  I definitely buy the double parity argument
though, so try and gain it somehow...  (The issue in practice seems to
be that the first drive feels like "protection", but once it's failed
it's ever so easy to have some kind of tiny error during recovery, e.g.
unscrubbed array, unplug wrong drive, gremlin, second drive failure, etc.)
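
The double parity argument can be put into numbers with a small Python
sketch. It rests on simplifying assumptions that are not from this thread: a
12-drive array, RAID10 laid out as fixed mirror pairs (drives 0+1, 2+3, ...),
and only whole-drive failures counted, ignoring rebuild time and read errors.

```python
from itertools import combinations


def raid6_survives(failed, n_drives):
    # Double parity: any two drives may fail.
    return len(failed) <= 2


def raid10_survives(failed, n_drives):
    # Assumed layout: drives 2k and 2k+1 mirror each other.
    failed = set(failed)
    return not any({2 * k, 2 * k + 1} <= failed for k in range(n_drives // 2))


def survival_fraction(survives, n_drives, n_failures):
    """Fraction of all n_failures-drive failure sets the array survives."""
    cases = list(combinations(range(n_drives), n_failures))
    return sum(survives(c, n_drives) for c in cases) / len(cases)


n = 12
raid6_two = survival_fraction(raid6_survives, n, 2)
raid10_two = survival_fraction(raid10_survives, n, 2)
raid6_usable = n - 2      # two drives' worth of parity
raid10_usable = n // 2    # half the drives hold mirror copies
print(raid6_two, raid10_two, raid6_usable, raid10_usable)
```

Under these assumptions RAID6 survives every two-drive failure, while RAID10
loses data whenever the second failure hits the mirror partner of the first,
which is 1 case in n - 1. RAID6 also leaves more usable drives (n - 2 against
n / 2), at the cost of the IOPS mentioned above.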


I think you can buy a well supported Supermicro box with support from a
well supported enterprise distro and still spend less than a mid-spec
NAS at the level you are aiming at?  However, I would 100% concede that
above the level of NAS boxes using off-the-shelf Linux software there is
a potentially large performance gap, e.g. a NetApp box should blow away
your Linux box (caveat - don't own a NetApp box...)


Remember also that at this kind of storage level you need to be really
sure what your goals are.  It's not so hard to get 100TB in a single
chassis, but getting it "reliable" and "fast" (choose your own
definition) is a tradeoff and much harder.


Good luck - I love hearing about these larger projects, please send some 
feedback on your choices?


Ed W




Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-11 Thread Rudi Ahlers
On Sat, Dec 11, 2010 at 6:27 PM, Joe Landman wrote:
> On 12/11/2010 11:17 AM, Rudi Ahlers wrote:
>
> [...]
>
>> I'm now looking for something that could scale beyond 100TB on one
>> device (not necessarily one unit though) and find it frustrating that
>> most NAS's come in 1U or 2U at most.
>
> Not meant as a commercial.  See http://scalableinformatics.com/sicluster
>
> Scales through many PB.  100TB isn't an issue, we have many in the field of
> this size or larger (and smaller).  Runs Gluster 3.x (3.1.x recommended).
>
>> Maybe I'm just not shopping around enough, or maybe I prefer well-known
>> brands, I don't know.
>
> The "well known" brands have well known limitations.  You have to decide if
> brand name is more important than meeting the specific needs.
>
>
> --


True, I fully agree with you on that point. I myself don't like vendor
lock-ins. But, I do like the simplicity that off-the-shelf NAS devices
offer, as opposed to a DIY one. And they often offer more tools than a
normal Linux box with iSCSI as well.


-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532


Re: [Gluster-users] What NAS device(s) do you use? And why?

2010-12-11 Thread Joe Landman

On 12/11/2010 11:17 AM, Rudi Ahlers wrote:

[...]


> I'm now looking for something that could scale beyond 100TB on one
> device (not necessarily one unit though) and find it frustrating that
> most NAS's come in 1U or 2U at most.


Not meant as a commercial.  See http://scalableinformatics.com/sicluster

Scales through many PB.  100TB isn't an issue, we have many in the field 
of this size or larger (and smaller).  Runs Gluster 3.x (3.1.x 
recommended).



> Maybe I'm just not shopping around enough, or maybe I prefer well-known
> brands, I don't know.


The "well known" brands have well known limitations.  You have to decide 
if brand name is more important than meeting the specific needs.



--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
   http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


[Gluster-users] What NAS device(s) do you use? And why?

2010-12-11 Thread Rudi Ahlers
If you use any NAS (or a SAN) devices, what do you use? And I'm
referring more to larger scale network storage than your home PC or
home theater system.

We've had very good experiences with our NetGear ReadyNAS devices but
I'm in the market for something new. The NetGears aren't the cheapest
ones around but they do what it says on the box. My only real gripe
with them is the lack of decent scalability.

TheCus devices seem to be rather powerful as well, and you can stack
up to 5 units together. But that's where the line stops.

I'm now looking for something that could scale beyond 100TB on one
device (not necessarily one unit though) and find it frustrating that
most NAS's come in 1U or 2U at most.

Maybe I'm just not shopping around enough, or maybe I prefer well-known
brands, I don't know.


-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532