Using HBase data
Hi, I am starting to think about a new project using Hadoop and HBase as my persistent store, but I am quite confused as to how to use the HBase data. 1. Can HBase data be used in web applications, meaning retrieving the data and showing it on a web page? Can somebody please suggest how HBase data is used by other companies? Some use case links would certainly be helpful. Regards, Shashi
Re: Re: replication verifyrep
Hi Jean-Daniel,

thank you for your answer; it brings some light into the darkness.

> You can see the bad rows listed in the user logs for your MR job.

Which log do you mean? The output from the command line? There I only see the counts of GOOD and BAD rows. Are the bad rows, the ones that were not replicated, listed in that log?

Regards
Hansi

Sent: Monday, 14 April 2014 at 19:25
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Subject: Re: replication verifyrep

Yeah, you should use endtime; it was fixed as part of https://issues.apache.org/jira/browse/HBASE-10395. You can see the bad rows listed in the user logs for your MR job.

J-D

On Mon, Apr 14, 2014 at 3:06 AM, Hansi Klose hansi.kl...@web.de wrote:

Hi,

I wrote a little script which should monitor the running replication. The script is triggered by cron and executes the following command, with the current timestamp as endtime and starttime = endtime - 10800000 milliseconds, so the time frame is 3 hours.

    hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397228401927 --families=t 1 tablename 21

After some runs the script found some BADROWS:

    14/04/11 17:04:05 INFO mapred.JobClient: BADROWS=176
    14/04/11 17:04:05 INFO mapred.JobClient: GOODROWS=2

I executed the same command 20 minutes later in the shell and got:

    hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397228401927 --families=t 1 tablename 21
    14/04/11 17:21:03 INFO mapred.JobClient: BADROWS=178

After that I ran the command with the same start time and the current timestamp as end time, so the time frame is larger but has the same start. And now I got:

    hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397230074876 --families=t 1 tablename 21
    14/04/11 17:28:28 INFO mapred.JobClient: GOODROWS=184

Is there something wrong with the command? In our metrics I could not see that there was an issue at that time.

We are a little bit confused about the endtime. All the documentation talks about stoptime, but we found that the job configuration has no parameter called stoptime. We found verifyrep.startTime, which holds the value of the starttime in our command, and verifyrep.endTime, which is always 0 when we use stoptime in the command. So we decided to use endtime.

Even the code at http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.html uses:

    static long endTime = Long.MAX_VALUE;

Which name is the right one, endtime or stoptime? We use CDH 4.2.0.

Regards
Hansi
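The three-hour window from the original message can be sketched as follows. This is only an illustration of the timestamp arithmetic, assuming epoch milliseconds and the --endtime flag discussed in this thread; the helper name verifyrep_args is hypothetical, and the jar path, family, and the positional arguments 1 and tablename are copied from the command quoted above.

```python
import time

# 3 hours expressed in milliseconds: 3 * 60 * 60 * 1000 = 10,800,000
THREE_HOURS_MS = 3 * 60 * 60 * 1000


def now_ms():
    """Current wall-clock time in epoch milliseconds."""
    return int(time.time() * 1000)


def verifyrep_args(end_ms, window_ms=THREE_HOURS_MS,
                   families="t", peer_id="1", table="tablename"):
    """Build the argument list for one verifyrep run (hypothetical helper).

    The window starts window_ms before end_ms. Values for families,
    peer_id and table are taken from the command in the thread.
    """
    start_ms = end_ms - window_ms
    return [
        "hadoop", "jar", "/usr/lib/hbase/hbase.jar", "verifyrep",
        "--starttime=%d" % start_ms,
        "--endtime=%d" % end_ms,
        "--families=%s" % families,
        peer_id, table,
    ]


if __name__ == "__main__":
    # Reproduces the timestamps used in the thread.
    print(" ".join(verifyrep_args(1397228401927)))
```

A cron-driven script could pass now_ms() as end_ms and hand the resulting list to subprocess.run; whether the most recent edits have already been shipped to the peer at that moment is a separate question that this sketch does not address.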
Weird behavior splitting regions
I have a table in HBase of around 96 GB, which I created with 4 regions of about 30 GB each. After some time, the table started to split because the max region size is 1 GB (I just realized that; I'm going to change it or create more pre-splits).

There are two things that I don't understand. How is it creating the splits? Right now I have 130 regions and growing. The problem is the size of the new regions:

    1.7 M    /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
    1.6 G    /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
    3.1 G    /hbase/filters/58b50df089bd9d4d1f079f53238e060d
    2.5 M    /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
    1.9 G    /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
    883.4 M  /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
    1.7 M    /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
    632.4 M  /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2

Some of the new regions are just a few KBytes! Why are they so small? When does HBase decide to split? It started to split two hours after I created the table. I create the table and insert the data once; I don't insert new data or modify it afterwards.

Another interesting point is why there are major compactions:

    2014-04-15 11:33:47,400 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/.tmp/df90c260cb4e4256a153dd178244f04c to hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/d/df90c260cb4e4256a153dd178244f04c
    2014-04-15 11:33:47,407 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROWCOL (CompoundBloomFilter) metadata for df90c260cb4e4256a153dd178244f04c
    2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.Store: *Completed major compaction of 1 file*(s) in d of filters,51,1397554175140.ef994715505054299ede8c48c600cea4. into df90c260cb4e4256a153dd178244f04c, size=789.1 M; total size for store is 789.1 M
    2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=filters,51,1397554175140.ef994715505054299ede8c48c600cea4., storeName=d, fileCount=1, fileSize=1.5 G, priority=6, time=414761474510060; duration=7sec

I thought a major compaction just happened once a day and compacted many files per region. The data is always the same here; I don't inject new data.

I'm working with 0.94.6 CDH44. I'm going to change the size of the regions, but I would like to understand why these things happen.

Thank you.
Re: Weird behavior splitting regions
On Tue, Apr 15, 2014 at 3:15 PM, Guillermo Ortiz konstt2...@gmail.com wrote:

> Some of the new regions are just a few KBytes! Why are they so small? When does HBase decide to split? It started to split two hours after I created the table.

When HBase does a split, it doesn't actually split at the disk/file level. It's just a metadata operation which creates new regions that contain reference files that still point to the old HFiles. That is the reason you find KB-sized regions.

> I thought a major compaction just happened once a day and compacted many files per region. The data is always the same here; I don't inject new data.

IIRC, sometimes minor compactions get promoted to major compactions based on some criteria, but I'll leave it for others to answer!

-- Bharath Vissapragada http://www.cloudera.com
Re: Weird behavior splitting regions
The default split policy in HBase 0.94.x is IncreasingToUpperBoundRegionSplitPolicy rather than ConstantSizeRegionSplitPolicy, which was the default in older versions of HBase. Please refer to the link below to understand how IncreasingToUpperBoundRegionSplitPolicy works:

http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/

Check the auto-splitting section. Hope this answers your question.

Thanks
Divye Sheth

On Tue, Apr 15, 2014 at 3:36 PM, Bharath Vissapragada bhara...@cloudera.com wrote:
> [...]
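As a rough illustration of the policy Divye refers to, the auto-splitting section of the linked article describes the 0.94 split threshold as min(R^2 * hbase.hregion.memstore.flush.size, hbase.hregion.max.filesize), where R is the number of regions of the table hosted on the same region server. The sketch below only evaluates that formula; the 128 MB flush size is an assumed default, the 1 GB maximum is the value mentioned in this thread, and the function name split_threshold is hypothetical.

```python
MB = 1024 * 1024
GB = 1024 * MB


def split_threshold(regions_on_server, flush_size=128 * MB, max_file_size=1 * GB):
    """Size at which a region would be split under the formula
    min(R^2 * flush_size, max_file_size).

    flush_size is an assumed default; max_file_size is the 1 GB
    limit mentioned in the thread.
    """
    return min(max_file_size, flush_size * regions_on_server ** 2)


if __name__ == "__main__":
    for r in range(1, 5):
        print("R=%d -> split at %d MB" % (r, split_threshold(r) // MB))
```

Under these assumptions the threshold is 128 MB with one region on a server, 512 MB with two, and is capped at 1 GB from three regions onwards, so a table whose regions are far larger than the cap can be expected to keep splitting for a while.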
Re: Using HBase data
Please take a look at https://m.facebook.com/UsingHbase?id=191660807562816refsrc=https%3A%2F%2Fwww.facebook.com%2FUsingHbase

On Apr 14, 2014, at 11:20 PM, Shashidhar Rao raoshashidhar...@gmail.com wrote:
> [...]
Re: Weird behavior splitting regions
I read the article; that's why I asked the question, because I didn't understand the result I got.

Oh, yes!! That's true, so silly of me. I think some of the files are pretty small because the table has two families and one of them is much smaller than the other. So it has been split many times. The big regions end up with a size close to 1 GB, but the smaller regions end up pretty small because they have been split many times.

What I don't know is why HBase decides to split the table so late: not when I create the pre-split table, but two hours later or so. Anyway, that's my error; I'm just curious about it.

2014-04-15 12:17 GMT+02:00 divye sheth divs.sh...@gmail.com:
> [...]
All regions stay on two nodes out of 18 nodes
I am using HDP 2.0.6 on a cluster with 18 nodes (region servers). One of my HBase tables has 50 regions, but I found that all 50 regions stay on just two nodes instead of being spread evenly across the 18 nodes. I did not pre-create splits, so this table was gradually split into 50 regions by itself. I'd like to know why all the regions stay on just two nodes rather than on all 18 nodes of the cluster, and how to spread the regions evenly across all the region servers. Thanks.
Re: All regions stay on two nodes out of 18 nodes
Check if the HBase balancer is on:

    $hbase_shell balance_switch true

Run the balancer from the HBase shell:

    $hbase_shell balancer

If the above command returns false, check for any regions in transition on the HMaster UI, or check the HMaster logs.

Thanks
Divye Sheth

On Tue, Apr 15, 2014 at 5:10 PM, Tao Xiao xiaotao.cs@gmail.com wrote:
> [...]
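To judge whether the balancer run changed anything, it can help to compare the observed per-server region counts of the table with what an even spread would look like. The sketch below is plain arithmetic, not a description of how the HBase balancer decides; the function names and the sample assignment data are hypothetical, and the real region-to-server mapping would have to be collected separately, for example from the HMaster UI.

```python
from collections import Counter


def regions_per_server(assignments):
    """Count regions per server from (region_name, server_name) pairs."""
    return Counter(server for _region, server in assignments)


def even_spread_range(num_regions, num_servers):
    """Smallest and largest per-server count in a perfectly even spread."""
    low = num_regions // num_servers
    high = low + (1 if num_regions % num_servers else 0)
    return low, high


if __name__ == "__main__":
    # Hypothetical data: 50 regions packed onto two of 18 servers.
    sample = [("region-%02d" % i, "rs-a" if i % 2 else "rs-b") for i in range(50)]
    print(dict(regions_per_server(sample)))
    print(even_spread_range(50, 18))
```

For the numbers in this thread, 50 regions over 18 region servers would mean 2 or 3 regions per server if the table were spread evenly.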
Re: All regions stay on two nodes out of 18 nodes
Is the load balancer enabled? Can you grep for this table in the master log and pastebin what you find?

Cheers

On Apr 15, 2014, at 4:40 AM, Tao Xiao xiaotao.cs@gmail.com wrote:
> [...]
Re: hbase exception: Could not reseek StoreFileScanner
bq. HFileScanner for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75

You can find, from the master log, which region server hosted 6b879cb43205cdae084a280c38fab34a. Then you can check the region server log on that server.

On Mon, Apr 14, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:

Where do I find the server log? I mean, there are many region servers. Should I check them one by one?

On Tue, Apr 15, 2014 at 12:31 PM, lars hofhansl la...@apache.org wrote:

Thanks. Was there anything in the server logs at that time? The client does not report the full stack trace. I have not seen this one before. I assume HDFS was running at the time...

-- Lars

From: Li Li fancye...@gmail.com
To: user@hbase.apache.org; lars hofhansl la...@apache.org
Sent: Monday, April 14, 2014 9:09 PM
Subject: Re: hbase exception: Could not reseek StoreFileScanner

Version 0.94.11, r1513697, Wed Aug 14 04:54:46 UTC 2013

On Tue, Apr 15, 2014 at 12:03 PM, lars hofhansl la...@apache.org wrote:

Hi Li, please always tell us which version of HBase/Hadoop you are using and what it is that you were trying to do. Thanks.

-- Lars

From: Li Li fancye...@gmail.com
To: user@hbase.apache.org
Sent: Monday, April 14, 2014 5:32 PM
Subject: hbase exception: Could not reseek StoreFileScanner

Mon Apr 14 23:54:40 CST 2014, org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], firstKey=\xE82\x14\xFF/\xF04\xA4\xBC\xB0X\xEB\xB4\xE9\xD1\x11\x93h\xD3\xAA\xC4\xAB\x99\xC3\x09\x874\x16VZ\x05\x10/cf:an/1397117856840/Put, lastKey=\xF0\x1F\xA7\xF7u\x9E.\xB2\x8EZ\xD5\xEB\xD6h\x03 W\x0F\x8A\xA0\x9B\x0A\xE8\xEC\x9ELu5o\xFE\x03\xCE/cf:an/1397131734218/Put, avgKeyLen=48, avgValueLen=14, entries=3712302, length=260849569, cur=\xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/1397454753471/Maximum/vlen=0/ts=0] to key \xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/LATEST_TIMESTAMP/Maximum/vlen=0/ts=0

Mon Apr 14 23:55:50 CST 2014, org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], firstKey=\xE82\x14\xFF/\xF04\xA4\xBC\xB0X\xEB\xB4\xE9\xD1\x11\x93h\xD3\xAA\xC4\xAB\x99\xC3\x09\x874\x16VZ\x05\x10/cf:an/1397117856840/Put, lastKey=\xF0\x1F\xA7\xF7u\x9E.\xB2\x8EZ\xD5\xEB\xD6h\x03 W\x0F\x8A\xA0\x9B\x0A\xE8\xEC\x9ELu5o\xFE\x03\xCE/cf:an/1397131734218/Put, avgKeyLen=48, avgValueLen=14, entries=3712302, length=260849569, cur=\xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/1397454753471/Maximum/vlen=0/ts=0] to key \xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/LATEST_TIMESTAMP/Maximum/vlen=0/ts=0

    at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:188)
    at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:918)
    ...

14-04-14 23:58:14,993 ERROR Thread-8 com.founder.extractor.ExtractWorker Failed after attempts=14, exceptions:
Mon Apr 14 23:51:52 CST 2014, org.apache.hadoop.hbase.client.HTable$9@17b86244, java.io.IOException: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true]
Re: Re: replication verifyrep
On Tue, Apr 15, 2014 at 12:17 AM, Hansi Klose hansi.kl...@web.de wrote:

> Hi Jean-Daniel, thank you for your answer; it brings some light into the darkness.

You're welcome!

> > You can see the bad rows listed in the user logs for your MR job.
> Which log do you mean? The output from the command line? There I only see the counts of GOOD and BAD rows. Are the bad rows, the ones that were not replicated, listed in that log?

You started VerifyReplication via hadoop jar, so it's a MapReduce job. Go to your JobTracker's web UI; you should see your jobs there. Then check out one of them, click on one of the completed maps, and look for the log. The bad rows are listed in that output.

J-D
RE: endpoint coprocessor
Thanks Yu. I have added the coprocessor below to my table and tried to invoke it using a Java client, but it fails with the error below, even though I can see the coprocessor in the describe table output.

    Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.UnknownProtocolException): org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered coprocessor service found for name AggregateService in region test3,,1397469869214.c73698dce0d5b91d29d42a9f9e194965.
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:5070)

From describe table:

    'test3', {TABLE_ATTRIBUTES => {coprocessor$1 => 'hdfs://xxx.com:8020/user///hbase-server-0.98.1-hadoop2.jar|org.apache.hadoop.hbase.coprocessor.AggregateImplementation||'}, {NAME => 'cf'

Thanks,
Chandra

-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Thursday, April 10, 2014 5:36 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: endpoint coprocessor

Here is a reference implementation for aggregation:

http://search-hadoop.com/c/HBase:hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java||Hbase+aggregation+endpoint

You can find it in the HBase source code.

Cheers

On Apr 10, 2014, at 4:29 AM, Bogala, Chandra Reddy chandra.bog...@gs.com wrote:

Hi,

I am planning to write an endpoint coprocessor to calculate TOP N results for my use case. I got confused by the old and new APIs. I followed the links below and tried to implement it, but it looks like the APIs have changed a lot. I don't see many of these classes in the HBase jars. We are using HBase 0.96. Can anyone point me to the latest documentation/APIs and, if possible, sample code to calculate top N?

https://blogs.apache.org/hbase/entry/coprocessor_introduction
https://www.youtube.com/watch?v=xHvJhuGGOKc

Thanks,
Chandra
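The coprocessor$1 table attribute shown in the describe output packs several fields into one pipe-separated string. As a minimal sketch, assuming the layout is jar path, class name, priority, and optional arguments in that order, the value can be taken apart like this to check which class is actually registered; the helper name parse_coprocessor_spec is hypothetical and is not part of any HBase API.

```python
def parse_coprocessor_spec(spec):
    """Split a coprocessor table attribute value into its fields.

    Assumed layout: jar_path|class_name|priority|args, where priority
    and args may be empty and the args field may be missing entirely.
    """
    parts = spec.split("|", 3)
    if len(parts) < 3:
        raise ValueError("expected at least 'path|class|priority': %r" % spec)
    jar_path, class_name, priority = parts[0], parts[1], parts[2].strip()
    args = parts[3] if len(parts) == 4 else ""
    return {
        "jar_path": jar_path,
        "class_name": class_name,
        "priority": int(priority) if priority else None,
        "args": args,
    }


if __name__ == "__main__":
    # Attribute value copied from the describe output in the message above.
    spec = ("hdfs://xxx.com:8020/user///hbase-server-0.98.1-hadoop2.jar"
            "|org.apache.hadoop.hbase.coprocessor.AggregateImplementation||")
    for field, value in parse_coprocessor_spec(spec).items():
        print(field, "=", value)
```

For the value in this thread, the priority and arguments fields come out empty, and the class name is the AggregateImplementation class that Ted pointed to.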
Re: HBase atomic append functionality (not just client)
Hmm... Wouldn't MVCC prevent seeing a partial append? Append is just a put at the end, the way it is currently implemented.

On Mon, Apr 14, 2014 at 10:41 AM, Vladimir Rodionov vrodio...@carrieriq.com wrote:

From HRegion.java:

    Appends performed are done under row lock but reads do not take locks out so this can be seen partially complete by gets and scans.

Appends are partially atomic (you can get partial reads, but you will never get corrupted writes), and they are implemented on the server side.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com

From: GSK Chaitanya [gskchaitany...@gmail.com]
Sent: Monday, April 14, 2014 10:05 AM
To: user@hbase.apache.org; d...@hbase.apache.org
Subject: HBase atomic append functionality (not just client)

Mighty HBase users and developers,

I have a few questions and I'd really appreciate it if someone could clarify them.

1) I want to know if HBase inherently supports *atomic append* functionality like *get* and *put*. For my work, I would be using OpenTSDB, which is a layer on top of AsyncHBase, and AsyncHBase doesn't work with the HBase client (which supports *atomic append*).

2) If I understand correctly, the atomic append of the HBase client internally does a get and a put instead of actually appending to the end of the cell. If that's the case, I wonder how this functionality is of much use in terms of performance. In our case, we would like a very lightweight append functionality. I'd like to know if there are any plans to add this feature to HBase main in the near future.

Thanks,
Chaitanya
Re: All regions stay on two nodes out of 18 nodes
The command balance_switch true returns true, but the command balancer returns false. I checked the HMaster UI and found some regions of other tables in transition, but none of this table.

This table's name is E_MP_DAY_READ. I grepped for it in the master log and found only the following lines:

    2014-04-15 15:50:59,925 INFO [MASTER_SERVER_OPERATIONS-b03:6-1] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,160001123745_2014-01-25:00:00:00,1395753408476.ba5c8291f8dad37d5b9621b7334c17b0. because it has been opened in a04.jsepc.com,60020,1397548219084
    2014-04-15 15:50:59,926 INFO [MASTER_SERVER_OPERATIONS-b03:6-1] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,37915618_2014-03-13:00:00:00,1395994146202.ec4e397baffd1cc40bdc18ce0ab2f28a. because it has been opened in a04.jsepc.com,60020,1397548219084
    2014-04-15 15:50:59,926 INFO [MASTER_SERVER_OPERATIONS-b03:6-1] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,300013608840_2014-02-21:00:00:00,1395749573711.744bab52befec279a7ee97497801e10f. because it has been opened in a04.jsepc.com,60020,1397548219084
    2014-04-15 15:50:59,937 INFO [MASTER_SERVER_OPERATIONS-b03:6-2] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,30497780_2014-01-23:00:00:00,1395746363941.79b831e698053b1005f7a97c9f2a6ddc. because it has been opened in a04.jsepc.com,60020,1397548219084
    2014-04-15 15:50:59,938 INFO [MASTER_SERVER_OPERATIONS-b03:6-2] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,38188567_2014-03-04:00:00:00,1395756104426.eb1806c2dc5833152b6b5e7b5e4a88b8. because it has been opened in a04.jsepc.com,60020,1397548219084
    2014-04-15 15:50:59,940 INFO [MASTER_SERVER_OPERATIONS-b03:6-2] handler.ServerShutdownHandler: Skip assigning region E_MP_DAY_READ,300016987143_2014-01-21:00:00:00,1395986789897.e4d143865d354bdc2a427c1f00df6ad7. because it has been opened in a04.jsepc.com,60020,1397548219084

So few log lines about it; isn't that strange?

BTW, I was able to spread the regions of this table evenly across the whole cluster after I shut down the two region servers where its regions originally resided.

2014-04-15 19:47 GMT+08:00 Ted Yu yuzhih...@gmail.com:
> [...]