RE: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-27 Thread Wei Tan
Image :) Best regards, Wei - Wei Tan, PhD Research Staff Member IBM T. J. Watson Research Center http://researcher.ibm.com/person/us-wtan From: Vladimir Rodionov vrodio...@carrieriq.com To: user@hbase.apache.org user@hbase.apache.org, Date: 02/27/2014

Re: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-26 Thread Wei Tan
I am thinking of storing medium-sized objects (~1M) using HBase. The advantage of using HBase alone, rather than HBase (storing pointers) + HDFS, is in my mind data locality. When I want to run analytics, I will access these objects using an HBase scan, and HBase stores KVs in a sequential manner. If I

RE: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-26 Thread Vladimir Rodionov
What type of analytics are you going to do on medium sized objects (1M)? Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Wei Tan [w...@us.ibm.com] Sent: Wednesday, February

Re: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-25 Thread shashwat shriparv
Yes, you can certainly use HBase for this. You can either 1. keep the different fields of the mail in different columns of a column family, with the attachment as a binary array in its own column, or 2. keep the whole message in columns in HBase and, if the attachments are large enough, put them on HDFS and keep some reference to it in

Re: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-25 Thread Ameya Kanitkar
The only other thing I'd add is that, by default, HBase caps the size of the data per column at 10 MB (I think). You can change that with this setting in hbase-site.xml: hbase.client.keyvalue.maxsize. A value of -1 means no cap. You can use other numbers to set an appropriate cap for your use case. Ameya On Tue,

Re: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-25 Thread Ted Yu
Minor: Value 0 also means no cap - see HTable#validatePut(): if (maxKeyValueSize > 0) { ... if (kv.getLength() > maxKeyValueSize) { throw new IllegalArgumentException("KeyValue size too large"); } On Tue, Feb 25, 2014 at 11:52 AM, Ameya Kanitkar
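To make the cap semantics from the two messages above concrete, here is a minimal stand-alone sketch. It is not HBase's source: the class name KeyValueSizeCheck and the method validate are hypothetical, and the only behavior it mirrors is the quoted check, where a cap of 0 or any negative value (such as -1) disables the limit.

```java
// Stand-alone sketch of the size check quoted above; NOT HBase's actual code.
// KeyValueSizeCheck and validate are hypothetical names used for illustration.
final class KeyValueSizeCheck {

    // Throws if kvLength exceeds the cap. A cap of 0 or below disables the check,
    // which is why both 0 and -1 mean "no cap".
    static void validate(long kvLength, int maxKeyValueSize) {
        if (maxKeyValueSize > 0 && kvLength > maxKeyValueSize) {
            throw new IllegalArgumentException("KeyValue size too large");
        }
    }

    public static void main(String[] args) {
        long fiveMb = 5L * 1024 * 1024;
        validate(fiveMb, 10 * 1024 * 1024); // under a 10 MB cap: accepted
        validate(fiveMb, 0);                // 0 disables the cap
        validate(fiveMb, -1);               // -1 disables the cap
        try {
            validate(fiveMb, 1024 * 1024);  // over a 1 MB cap: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this reading, a 5 MB attachment passes a 10 MB cap unchanged, so the setting only matters if values may grow past the configured limit.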

RE: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-25 Thread Vladimir Rodionov
Usually, it is not advisable to store such large values in HBase (to avoid excessive IO during compaction). Keep them in separate files in HDFS and store only references in HBase. To overcome the inherent limit on the number of files in the NameNode (NN), you can pack several values into a single file (you will
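The packing idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: BlobPacker and Ref are hypothetical names, and an in-memory buffer stands in for the shared container file. In a real deployment the container would be an HDFS file, and each reference (file name, offset, length) would be stored in an HBase cell.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: packs many values into one container and hands back
// (offset, length) references. An in-memory buffer stands in for an HDFS file.
final class BlobPacker {

    // Reference to one packed value; this is what would be kept in HBase.
    static final class Ref {
        final long offset;
        final int length;

        Ref(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    private final ByteArrayOutputStream container = new ByteArrayOutputStream();

    // Appends a value to the end of the container and returns its reference.
    Ref append(byte[] value) {
        long offset = container.size();
        container.write(value, 0, value.length);
        return new Ref(offset, value.length);
    }

    // Reads a value back using only its reference (random access by offset).
    byte[] read(Ref ref) {
        byte[] all = container.toByteArray();
        int start = (int) ref.offset;
        return Arrays.copyOfRange(all, start, start + ref.length);
    }
}
```

Because each reference carries its own offset and length, a single attachment can be fetched without scanning the rest of the container, which matches the random-access requirement in the original question.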

Re: Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-25 Thread Upendra Yadav
I have come to the same conclusion as you suggest (keep them in separate files in HDFS and store only references in HBase). I will try packing several attachments into a single file. And thanks a lot... On Wed, Feb 26, 2014 at 1:45 AM, Vladimir Rodionov vrodio...@carrieriq.com wrote: Usually, it is not

Is HBase is feasible for storing 4-5 MB of data as cell value

2014-02-24 Thread Upendra Yadav
I have to use HBase and have mixed types of data. Some of it is 1-4 KB in size (mail headers) and the rest is around 5 MB (attachments...). We also need only random access to any of the data. Is HBase feasible for storing this type of data? What should my schema design be? Will I have to go with 2 different tables -