Re: maximum storage per node
> Does anyone have opinions on the maximum amount of data reasonable to store on one Cassandra node?

With spinning disk and 1GbE networking, the rule of thumb was 300GB to 500GB per node. With SSD or very fast local disk, 10GbE networking, optionally JBOD, Cassandra 1.2, and vnodes, people are talking about multiple TBs per node.

> If there are limitations, what are the reasons for it?

The main issues are:

* As discussed, potentially very long compactions.
* As discussed, repair taking a very long time to calculate the Merkle trees.
* Potentially taking a very long time to rebuild a new node after one fails completely. vnodes address this by increasing the number of nodes that can stream data to the one bootstrapping.

This is really something that has to fit into your operations.

Hope that helps.

Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com
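For reference, a minimal sketch of the vnodes setting Aaron mentions. In Cassandra 1.2 this is the `num_tokens` option in cassandra.yaml; 256 is the commonly cited value for vnode deployments (treat the exact number as an assumption to tune, not a rule):

```yaml
# cassandra.yaml -- enable virtual nodes so a bootstrapping
# replacement node can stream ranges from many peers at once
# instead of from a handful of neighbors.
num_tokens: 256
# initial_token:   # leave unset when using vnodes
```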
Re: maximum storage per node
I don't think it is a good idea to put multiple instances on the same machine; you may lose multiple instances at the same time if the machine goes down. You can also specify multiple storage directories in 1.2.

I am also not sure bootstrapping will be a big problem, since the number of keys you will store is relatively small.

Why didn't you partition your data according to time instead of writing your own compactor?

Cem
RE: maximum storage per node
Do you have some fairly complex queries to run against your data? Or do you just need to store large pieces of data? (In which case Object Storage like OpenStack Swift could be more appropriate, IMHO.)
Re: maximum storage per node
On Fri, Jul 26, 2013 at 4:23 AM, Romain HARDOUIN romain.hardo...@urssaf.fr wrote:

> Do you have some fairly complex queries to run against your data? Or your need is just to store large pieces of data? (In which case Object Storage like OpenStack Swift could be more appropriate IMHO)

Or distributed blob storage like MogileFS.

https://code.google.com/p/mogilefs/

=Rob
maximum storage per node
Does anyone have opinions on the maximum amount of data reasonable to store on one Cassandra node? If there are limitations, what are the reasons for them?

Thanks,
Anne
Re: maximum storage per node
Between 500GB and 1TB is recommended, but it also depends on your hardware, traffic characteristics, and requirements. Can you give some details on that?

Best Regards,
Cem
RE: maximum storage per node
We're storing fairly large files (about 1MB apiece) for a few months and then deleting the oldest to make space for new ones. We have large requirements (maybe up to 100 TB), so having a 1TB limit would be unworkable.

What is the reason for the limit? Does something fail after that?

If there are hardware issues, what's recommended?

BTW, we're using Cassandra 1.2.

Anne
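To make the scale concrete, here is a rough back-of-envelope node count for 100 TB under the per-node limits discussed in this thread. The replication factor of 3 is an assumption for illustration, not something Anne stated:

```python
# Rough cluster sizing for ~100 TB of data under the per-node
# rules of thumb from this thread. RF=3 is assumed.
def nodes_needed(total_tb, per_node_gb, replication_factor=3):
    total_gb = total_tb * 1024 * replication_factor
    # Round up: a partial node still needs a whole machine.
    return -(-total_gb // per_node_gb)

for per_node in (500, 1024):  # 500 GB vs 1 TB per node
    print(per_node, nodes_needed(100, per_node))
# 500 GB/node -> 615 nodes; 1 TB/node -> 300 nodes
```

Either way it is a large cluster, which is why the per-node ceiling matters so much here.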
RE: maximum storage per node
Issues with large data nodes would be:

* Nodetool repair will be impossible to run.
* Your read I/O will suffer, since you will almost always go to disk (each read will take 3 IOPS worst case).
* Bootstrapping the node in case of failure will take days/weeks.
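A quick sanity check on the bootstrap point: the idealized time to restream a failed node's data over a given link, assuming the network is the only bottleneck at 50% sustained utilization (both optimistic assumptions; real bootstraps also pay for stream throttling and compaction):

```python
# Idealized time to restream a failed node's data set.
# Real bootstraps are slower: throttling, compaction, CPU.
def stream_hours(data_tb, link_gbps, utilization=0.5):
    data_bits = data_tb * 8 * 1024**4        # TiB -> bits
    rate = link_gbps * 1e9 * utilization     # usable bits/sec
    return data_bits / rate / 3600

print(round(stream_hours(5, 1), 1))    # 5 TB over 1 GbE:  ~24.4 h
print(round(stream_hours(5, 10), 1))   # 5 TB over 10 GbE: ~2.4 h
```

Even under these generous assumptions, a multi-TB node on 1GbE takes the better part of a day to rebuild, which is the operational cost behind the "days/weeks" warning once throttling is factored in.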
Re: maximum storage per node
You will suffer from long compactions if you are planning to get rid of old records by TTL.

Best Regards,
Cem
RE: maximum storage per node
I actually wrote my own compactor that deals with this problem.

Anne
Re: maximum storage per node
Try putting multiple instances per machine, with each instance mapped to its own disk. This might not work with vnodes.