Thanks! && Thanks!
On Mon, Aug 18, 2014 at 1:14 PM, Josh Elser <[email protected]> wrote:

> I think Billie's project is one of our "examples" --
> http://accumulo.apache.org/1.6/examples/dirlist.html
>
> On 8/18/14, 1:05 PM, Adam Fuchs wrote:
>
>> Joe,
>>
>> I would say that a rule of thumb would be tens of megabytes for a single
>> cell. There are two limits that affect this:
>>
>> 1) Amount of memory used: This includes ingesting into the batchwriter,
>> buffering in the in-memory maps, scanning RFiles, and preparing query
>> responses. At any given point, there could be a few copies of the cell
>> hanging out in memory, so you don't want to pack things too tightly. If
>> you have ridiculous amounts of memory then you can squeeze in some pretty
>> large docs.
>> 2) Message size for client/server communication: This is limited to 1G by
>> default, but can be increased if needed. A single key/value pair will not
>> be fragmented across these message frames.
>>
>> Whether to store bigger files in fragmented cells or as references to HDFS
>> files typically has to do with security and lifecycle management. If you
>> want cell-level security and encryption protection, you'll probably want
>> to go with a fragmented key/value approach. If you want to keep all of
>> your data in one spot for easier management you might also prefer to
>> fragment the files in Accumulo. Otherwise sticking it in HDFS and storing
>> a reference is a pretty simple and good solution.
>>
>> Billie did a project a while ago to fragment and store larger files in
>> Accumulo. I'm not sure what happened with that, but it might be out there
>> somewhere for you to use.
>>
>> Cheers,
>> Adam
>>
>> On Mon, Aug 18, 2014 at 11:36 AM, Joe Stein <[email protected]> wrote:
>>
>>> Hi, for Accumulo is there a recommended max for column value size? So if
>>> we want to store files, at what point do we have to split the file into
>>> parts or (rather) just store it in HDFS with a reference path to it?
>>>
>>> /*******************************************
>>> Joe Stein
>>> Founder, Principal Consultant
>>> Big Data Open Source Security LLC
>>> http://www.stealth.ly
>>> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>>> ********************************************/
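For anyone reading this thread later, the fragmented key/value approach Adam describes can be sketched roughly as below. This is only an illustration in plain Python, not Accumulo client code and not necessarily how the dirlist example lays out its data. The names `fragment`, `reassemble`, and `MAX_CELL_BYTES` are hypothetical, and the 10 MiB chunk size is an assumption loosely based on the "tens of megabytes" rule of thumb.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: 10 MiB per cell, picked to stay under the "tens of megabytes"
# rule of thumb from the thread. Tune for your own memory budget.
MAX_CELL_BYTES = 10 * 1024 * 1024

# A (row, column_qualifier, value) triple standing in for one key/value pair.
Cell = Tuple[str, str, bytes]


def fragment(row: str, data: bytes, chunk_size: int = MAX_CELL_BYTES) -> Iterator[Cell]:
    """Yield one cell per chunk of `data`.

    The qualifier is a zero-padded chunk index, so chunks sort in their
    original order under lexicographic key ordering.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield row, f"{index:08d}", data[offset:offset + chunk_size]


def reassemble(cells: Iterable[Cell]) -> bytes:
    """Concatenate chunk values in qualifier order to rebuild the file."""
    return b"".join(value for _, _, value in sorted(cells, key=lambda c: c[1]))
```

Each yielded triple would become one mutation written to the table. Files small enough to fit in a single chunk come out as exactly one cell, so the same code path covers both cases.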
