How can I set column information when I use YCSB to test HBase?

2013-01-18 Thread yonghu
Dear all, I read the information at https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload For example, I can indicate the column family name when I issue the command line as java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workload

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-18 Thread Ted Yu
To my knowledge, CDH-4.1.2 is based on HBase 0.92.x. It looks like you were using the patch from HBASE-6509, which was integrated to trunk only. Please confirm. Copying Alex, who wrote the patch. Cheers On Fri, Jan 18, 2013 at 3:28 PM, Eugeny Morozov wrote: > Hi, folks! > > HBase, Hadoop, etc version is

Re: Reagrding HBase Hadoop multiple scan objects issue

2013-01-18 Thread Doug Meil
Hi there- You probably want to review this section of the RefGuide: http://hbase.apache.org/book.html#mapreduce re: "it's inefficient to have one scan object to scan everything." It is. But in the MapReduce case, there is a Map-task for each input split (see the RefGuide for details), and th

Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-18 Thread Eugeny Morozov
Hi, folks! HBase, Hadoop, etc version is CDH-4.1.2. I'm using a custom FuzzyRowFilter, which I got from http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/ and suddenly, after quite a time, we found that it starts losing data. Basically the i

Reagrding HBase Hadoop multiple scan objects issue

2013-01-18 Thread Xu, Leon
Hi HBase users, I am currently trying to set up a denormalization map-reduce job for my HBase table. Since our table contains a large volume of data, it's inefficient to have one scan object to scan everything. We only need to process those records that have changes. I am planning to have mul

Spring for hadoop

2013-01-18 Thread Panshul Whisper
Hello, I was wondering if anyone is using Spring for Hadoop to execute map reduce jobs or to perform HBase operations on a Hadoop cluster using Spring Data for Hadoop. Please suggest a working example, as I am unable to find any working sample and the Spring Data documentation is of no use for begin

Re: Hbase heap size

2013-01-18 Thread lars hofhansl
That is true. Mind telling us more about your setup? I think that would be interesting knowledge. -- Lars From: Adrien Mogenet To: user@hbase.apache.org Sent: Friday, January 18, 2013 12:28 PM Subject: Re: Hbase heap size On Fri, Jan 18, 2013 at 3:24 AM, la

Re: Hbase heap size

2013-01-18 Thread Varun Sharma
I meant controlling compaction activity by emitting fewer hfiles but of larger size. On Fri, Jan 18, 2013 at 12:28 PM, Adrien Mogenet wrote: > On Fri, Jan 18, 2013 at 3:24 AM, lars hofhansl wrote: > > > - The largest useful region size is 20G (at least that is the current > > common tribal knowl

Re: Loading data, hbase slower than Hive?

2013-01-18 Thread Doug Meil
Hi there, See this section of the HBase RefGuide for information about bulk loading. http://hbase.apache.org/book.html#arch.bulk.load On 1/18/13 12:57 PM, "praveenesh kumar" wrote: >Hey, >Can someone throw some pointers on what would be the best practice for >bulk >imports in hbase ? >Th

Re: Loading data, hbase slower than Hive?

2013-01-18 Thread praveenesh kumar
Hey, Can someone throw some pointers on what would be the best practice for bulk imports in hbase ? That would be really helpful. Regards, Praveenesh On Thu, Jan 17, 2013 at 11:16 PM, Mohammad Tariq wrote: > Just to add to whatever all the heavyweights have said above, your MR job > may not be

Re: Hbase heap size

2013-01-18 Thread Varun Sharma
Thanks, Lars! In my case, the amount of data on disk is a lot lower so I can do with fewer regions. Nevertheless, even if I set the flush cache too large - the memstore lowerLimit and memstore upperLimit will cause flushes before we need a lot of heap to support all the memstores. But then probabl

Re: Slow start of HBase operations with YCSB, possibly because of zookeeper ?

2013-01-18 Thread Ted Yu
Thanks for sharing, Akshay. I think the solution should be part of hbase reference guide. On Fri, Jan 18, 2013 at 7:55 AM, Akshay Singh wrote: > I found the problem, so I thought I would post it here for future > reference. > > The problem was IPv6 enabled network. Though IPv6 in HDFS ( > HADOO

Re: Slow start of HBase operations with YCSB, possibly because of zookeeper ?

2013-01-18 Thread Akshay Singh
I found the problem, so I thought I would post it here for future reference. The problem was an IPv6-enabled network. Though IPv6 was already disabled in HDFS (HADOOP_OPTS=-Djava.net.preferIPv4Stack=true) and in HBase (-Djava.net.preferIPv4Stack=true), for some of the machines in cluster IP

Re: How to de-nomarlize for this situation in HBASE Table

2013-01-18 Thread Doug Meil
Hi there, I'd recommend reading the Schema Design chapter in the RefGuide because there are some good tips and hard-learned lessons. http://hbase.apache.org/book.html#schema Also, all your examples use composite row keys (not a surprise, a very common pattern) and one thing I would like to dra

RE: Hbase heap size

2013-01-18 Thread Chalcy Raja
Looking forward to the blog! Thanks, Chalcy -Original Message- From: lars hofhansl [mailto:la...@apache.org] Sent: Thursday, January 17, 2013 9:24 PM To: user@hbase.apache.org Subject: Re: Hbase heap size You'll need more memory then, or more machines with not much disk attached. You