[Please feel free to correct me on anything or suggest other workarounds
that could be employed now to help.]
Hello,
This is purely theoretical, as I don't have a big working cluster yet
and am still in the planning stages, but from what I understand, while
Cass scales well horizontally, EACH node does not handle a data store
in the terabyte range well... for understandable reasons such as simple
hardware and bandwidth limitations. But, looking forward and pushing
the envelope, I think there might be ways to at least manage these
issues until broadband speeds and disk and memory technology catch up.
The biggest issues with big data clusters that I am currently aware of are:
> disk I/O problems during major compactions and repairs.
> bandwidth limitations while commissioning new nodes.
Here are a few ideas I've thought of:
1.) Load-balancing:
During a major compaction, repair, or other similarly severe
performance-impacting process, allow the node to broadcast that it is
temporarily unavailable so requests for data can be sent to other nodes
in the cluster. The node could still "wake up" and pause or cancel its
compaction in the case of a failed node, whereby there are no other
nodes that can provide the requested data. The node would be considered
"degraded" by other nodes, rather than down. (As a matter of fact, a
general load-balancing scheme could be devised if each node broadcast
its current load level and maybe even the hop count between data centers.)
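To make the routing idea concrete, here is a minimal sketch of how a coordinator could pick a replica using those broadcast load levels and "degraded" flags. All of the names, the load values, and the dict shape are made up for illustration; this is not an existing Cassandra API, just the decision rule described above:

```python
def pick_replica(replicas):
    """Prefer healthy replicas, ordered by advertised load; fall back
    to a "degraded" replica only if no healthy one holds the data (a
    degraded, compacting node still beats an unavailable one)."""
    healthy = [r for r in replicas if not r["degraded"]]
    pool = healthy or replicas  # degraded nodes are the last resort
    if not pool:
        raise RuntimeError("no replica available for this key")
    return min(pool, key=lambda r: r["load"])

replicas = [
    {"name": "node-a", "load": 0.80, "degraded": False},
    {"name": "node-b", "load": 0.20, "degraded": True},  # mid-compaction
    {"name": "node-c", "load": 0.55, "degraded": False},
]
print(pick_replica(replicas)["name"])  # prints "node-c"
```

Note that node-b has the lowest load but is skipped because it advertised itself as degraded; it would only be chosen if node-a and node-c were both gone.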
2.) Data Transplants:
Since commissioning a new node that is due to receive data in the TB
range could take days or weeks of data transfer, it would be much more
efficient to just courier the data. Perhaps the SSTables (maybe from a
snapshot) could be transplanted from one production node into a new node
to help jump-start the bootstrap process. The new node could sort
things out during the bootstrapping phase so that it ends up balanced
correctly, as if it had started out with no data as usual. If this
could cut the bandwidth in half, that would be a great benefit.
However, this would work well mostly if the transplanted data came from
a keyspace that used a random partitioner; coming from an ordered
partitioner may not be so helpful if the rows in the transplanted data
would never be used on the new node.
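The "sorting things out" step is essentially token-range filtering: with the random partitioner (which tokenizes keys with MD5), the new node would keep only the transplanted rows whose tokens fall in its assigned arc of the ring and drop the rest, much like `nodetool cleanup` drops rows a node no longer owns. A sketch, with the key set and range endpoints invented for illustration:

```python
import hashlib

RING = 2 ** 127  # RandomPartitioner's MD5 token space

def token(key: bytes) -> int:
    # MD5-based token, as the random partitioner computes it
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING

def keep_for_range(keys, start, end):
    """Rows the new node keeps from transplanted SSTables: those whose
    token lands in its (start, end] arc of the ring."""
    def in_range(t):
        if start < end:
            return start < t <= end
        return t > start or t <= end  # the arc wraps past token 0
    return [k for k in keys if in_range(token(k))]

keys = [b"alice", b"bob", b"carol", b"dave"]
mid = RING // 2
print(keep_for_range(keys, 0, mid))  # whichever keys hash into the lower half
```

Splitting the ring at any point partitions the transplanted rows cleanly between the donor's range and the new node's range, which is why this works for a random partitioner but not for an ordered one (there the donor's rows may all lie outside the new node's range).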
3.) Strategic Partitioning:
Of course, there are surely other issues to contend with, such as RAM
requirements for caching purposes. That may be managed by a partition
strategy that allows certain nodes to specialize in a certain subset of
the data, such as geographically or whatever the designer chooses.
Replication would still be done as usual but this may help the cache to
be better utilized by allowing it to focus on the subset of data that
comprises the majority of the node's data versus a random sampling of
the entire cluster. IOW, while a node may specialize in a certain
subset and also contain replicated rows from outside that subset, it
will still only (mostly) be queried for data from within its subset,
and thus the cache will contain mostly data from this special subset,
which could increase the hit rate of the cache.
This may not be a huge help for TB-sized data nodes, since even 32 GB
of RAM would still be relatively tiny in comparison to the data size,
but I include it just in case it spurs other ideas. Also, I do not know
how Cass decides on which node to query for data in the first place...
maybe not the best idea.
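A back-of-envelope version of the cache argument, with all numbers invented for illustration: under a crude uniform-access model, the hit rate is roughly the fraction of the queried working set that fits in cache, so shrinking the working set a node is actually queried for raises the hit rate proportionally.

```python
def est_hit_rate(cache_gb, working_set_gb):
    """Crude estimate assuming uniform access: the fraction of the
    working set that fits in cache. Real workloads with skewed access
    do better, but the comparison between the two cases still holds."""
    return min(1.0, cache_gb / working_set_gb)

# 32 GB of cache on a node holding 2 TB:
generalist = est_hit_rate(32, 2000)  # queried uniformly across all 2 TB
specialist = est_hit_rate(32, 200)   # queries focus on a 200 GB hot subset
print(generalist, specialist)        # ~1.6% vs ~16%
```

A 10x smaller effective working set gives a 10x better hit rate in this toy model, though as noted above, both numbers stay small when the node holds terabytes.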
4.) Compressed Columns:
Some sort of data compression of certain columns could be very helpful
especially since text can be compressed to less than 50% if the
conditions are right. Overall native disk compression will not help the
bandwidth issue since the data would be decompressed before transit. If
the data was stored compressed, then Cass could even send the data to
the client compressed so as to offload the decompression to the client.
Likewise, during node commissioning, the data would never have to be
decompressed, saving on CPU and BW. Alternatively, a client could tell
Cass to decompress the data before transmit if needed. This, combined
with idea #2 (transplants), could help speed up new-node bootstrapping,
but only when a large portion of the data consists of very large column
values and thus compression is practical and efficient. Of course, the
client could handle all the compression today without Cass even knowing
about it, so building this into Cass would be just a convenience, but
still nice to have, nonetheless.
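As a sanity check on the "less than 50%" claim, here is the client-side version that works today, using zlib as a stand-in for whatever codec would actually be chosen. The sample value is invented; the point is that repetitive, text-like column values compress far below half size, while already-random binary data would not:

```python
import zlib

# A repetitive, text-like column value (the favourable case):
value = b"2011-03-14T09:26:53Z INFO user=12345 action=login ok\n" * 100

compressed = zlib.compress(value)
ratio = len(compressed) / len(value)
print(f"compressed to {ratio:.1%} of original size")
```

Storing `compressed` as the column value means the server, the inter-node streams, and the wire to the client all carry the small form, with decompression deferred to whichever client reads it back.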
5.) Postponed Major Compactions:
The option to postpone auto-triggered major compactions until a
pre-defined time of day or week or until staff can do it manually.
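The policy itself is tiny; a hypothetical sketch, where the 02:00-06:00 window is an invented example of the "pre-defined time of day":

```python
from datetime import datetime

def may_run_major_compaction(now, window=(2, 6)):
    """Hypothetical policy: auto-triggered major compactions may only
    start between 02:00 and 06:00 local time; outside that window the
    trigger is deferred (or left for staff to run by hand)."""
    start, end = window
    return start <= now.hour < end

print(may_run_major_compaction(datetime(2011, 3, 14, 3, 0)))   # True
print(may_run_major_compaction(datetime(2011, 3, 14, 14, 0)))  # False
```

Combined with idea #1, a node entering the window could also broadcast "degraded" before it starts, so the two mechanisms reinforce each other.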
6.?) Finally, some have suggested just using more nodes with less data
storage each, which may solve most if not all of these problems. But
I'm still fuzzy on that. The trade-offs would be more infrastructure
and maintenance costs, a higher chance that some server will fail...
maybe higher bandwidth between nodes due to a larger cluster? I need
more clarity on this alternative. Imagine a total data size of 100 TB
and the choice between 200 nodes or 50. What is the cost of more
nodes, all things being equal?
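To put rough numbers on that 100 TB / 200-vs-50 question: per-node data, the time to re-stream one node's data, and expected failures per year. The 1 Gbit/s usable NIC bandwidth and 3% annual failure rate are assumptions I've picked purely for illustration:

```python
TOTAL_TB = 100.0
NIC_GBIT = 1.0  # assumed usable streaming bandwidth per node (Gbit/s)
AFR = 0.03      # assumed annual failure rate per server (illustrative)

def per_node_tb(nodes):
    return TOTAL_TB / nodes

def restream_hours(nodes):
    # hours to re-stream one node's data over the assumed NIC
    gbit = per_node_tb(nodes) * 8000  # 1 TB = 8000 Gbit (decimal units)
    return gbit / NIC_GBIT / 3600

def failures_per_year(nodes):
    return nodes * AFR

for n in (50, 200):
    print(f"{n} nodes: {per_node_tb(n)} TB/node, "
          f"~{restream_hours(n):.1f} h to restore one node, "
          f"~{failures_per_year(n):.1f} failures/yr expected")
```

Under these assumptions, the 200-node cluster fails four times as often but each failure is four times cheaper to recover from (0.5 TB vs 2 TB to re-stream), so the total re-streamed bytes per year come out the same; what actually differs is the per-incident recovery window versus the hardware and operations overhead of 150 extra boxes. I'd still welcome real-world numbers here.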
Please contribute additional ideas and strategies/patterns for the
benefit of all!
Thanks for listening and keep up the good work guys!