Re: Bulk-loading HFiles after table split (on ACL enabled cluster)

2014-09-17 Thread Matteo Bertozzi
yeah, in a non secure cluster you have to do the chmod manually. there was discussion to implement something like the SecureBulkLoadEndPoint even for the unsecure setup, but at the moment there is no jira/patch available. (the SecureBulkLoadEndPoint is basically doing a chmod 777 before starting the b

Re: Performance oddity between AWS instance sizes

2014-09-17 Thread Ted Yu
bq. there's almost no activity on either side During this period, can you capture stack trace for the region server and pastebin the stack ? Cheers On Wed, Sep 17, 2014 at 3:21 PM, Josh Williams wrote: > Hi, everyone. Here's a strange one, at least to me. > > I'm doing some performance profil

Performance oddity between AWS instance sizes

2014-09-17 Thread Josh Williams
Hi, everyone. Here's a strange one, at least to me. I'm doing some performance profiling, and as a rudimentary test I've been using YCSB to drive HBase (originally 0.98.3, recently updated to 0.98.6.) The problem happens on a few different instance sizes, but this is probably the closest compari

Re: is it a gud way to store a map object in hbase column

2014-09-17 Thread Ted Yu
bq. storing the map object makes task easy. The above makes write(s) easy. But when you query, do you always need all the key-value pairs in this map object ? Cheers On Wed, Sep 17, 2014 at 1:38 PM, yeshwanth kumar wrote: > hi i have a huge map object, which comes from the solr query results.

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Vinay Gupta
Actually we did not test data consistency issues in 0.94. So they might as well be there. We don’t plan to turn the nonce feature off, btw. Increasing hbase.rpc.timeout seems to solve this problem. My guess is client doesn't retry so often when we increase this value. Another config which is su

Re: Does hbase master preserve the disable/enable load balancer property between master restarts?

2014-09-17 Thread Bryan Beaudreault
It depends on your version. See https://issues.apache.org/jira/browse/HBASE-6260 So in 0.94.x your option is to increase hbase.balancer.period to Integer.MAX_VALUE In 0.96 and 0.98, it should be supported On Wed, Sep 17, 2014 at 3:43 PM, Gomathivinayagam Muthuvinayagam < sankarm...@gmail.com> w

Does hbase master preserve the disable/enable load balancer property between master restarts?

2014-09-17 Thread Gomathivinayagam Muthuvinayagam
Hello, Does hbase master preserve the disable/enable load balancer property between master restarts? Thanks & Regards,

is it a gud way to store a map object in hbase column

2014-09-17 Thread yeshwanth kumar
hi i have a huge map object, which comes from the solr query results. map contains around 400-500 key-value pairs is it a gud way to store the entire map as a value in the column. is there any particular things like column value size, i need to take care of or shud i store it in different columns

Re: Bulk-loading HFiles after table split (on ACL enabled cluster)

2014-09-17 Thread Daisy Zhou
Thanks for the response, Matteo. My HBase is not a secure HBase, I only have ACL enabled on HDFS. I did try adding the SecureBulkLoadEndpoint coprocessor to my HBase cluster, but I think it does something different, and it didn't help. I normally have to chmod -R a+rwx the hfile directory in ord

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
Thanks Esteban for the suggestion. For case 2) KeyPrefixRegionSplitPolicy won't be enough I think as we're constantly adding new types so the #types is unknown at the beginning, and when there's a new type of data, it will add pre-splits [type|00, type|01, ..., type|FF] to the table. Data is inges

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Esteban Gutierrez
Thanks Jianshi for that helpful information, I think for use case 1) it depends on the data ingestion rate when the regions need to split. The synchronous split operation makes some sense there if you want the regions to contain specific time ranges and/or number of records. For use case 2) I th

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
Hi Esteban, Two reasons to split dynamically, 1) I have a column family that stores timeseries data for mapreduce tasks, and the rowkey is monotonically increasing to make scanning easier. 2) (a better reason), I'm storing multiple types of data in the same table, and I have about 500TB of data

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Esteban Gutierrez
Jianshi, The retry is not an expected behavior that the client should be doing. In fact you don't want your clients to issue admin operations to the cluster ;) Shahab's option is the best alternative by polling when the number of regions has changed in the table you want to modify the splits dyna

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
You rock Ted, I would also add synchronous addSplits as well, there's no good reason multiple splits have to be done sequentially. I also checked createTable, and I traced the code here and lost track... executeCallable(new MasterCallable(getConnection()) { @Override public Void cal

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Ted Yu
Jianshi: See HBASE-11608 Add synchronous split bq. createTable does something special? Yes. See this in HBaseAdmin: public void createTable(final HTableDescriptor desc, byte [][] splitKeys) On Wed, Sep 17, 2014 at 10:58 AM, Jianshi Huang wrote: > I see Shahab, async makes sense, but I prefe

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
Yes Esteban, there're very practical reasons to do the pre-split dynamically. Jianshi On Thu, Sep 18, 2014 at 1:41 AM, Esteban Gutierrez wrote: > Hi Jianshi, > > Is there any reason why you need to split dynamically the table? Users > usually pre-split their tables with a specific number of spl

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
I see Shahab, async makes sense, but I prefer that the HBase client does the retry for me, and let me specify a timeout parameter. One question, does that mean adding multiple splits into one region has to be done sequentially? How can I add region splits in parallel? Does createTable do somethi

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Esteban Gutierrez
Hi Jianshi, Is there any reason why you need to split dynamically the table? Users usually pre-split their tables with a specific number of splits or they pick a region split policy that fits their needs: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/DelimitedKeyPrefixRegi

Fwd: Fine tuning HBase for bulkload

2014-09-17 Thread Poonam Ligade
-- Forwarded message -- From: Poonam Ligade Date: Wed, Sep 17, 2014 at 9:00 PM Subject: Re: Fine tuning HBase for bulkload To: u...@phoenix.apache.org How to disable WAL using configuration property?? Instead of that i changed deferred log interval to hbase.regionserver.optional

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Ted Yu
bq. Earlier client (0.94) didn't complain about it. Did you observe any data loss (w.r.t. Increments) in 0.94 when the servers were loaded ? As Anoop said, it is not recommended to turn off this feature in 0.98 On Wed, Sep 17, 2014 at 12:34 AM, Vin Gup wrote: > Yes possibly. Why would that be

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Anoop John
Yes that is also possible.. So in such a case this new behavior is reporting the issue clearly. In the past the retry op would have silently succeeded giving a wrong result overall!!! -Anoop- On Wed, Sep 17, 2014 at 7:14 PM, Vin Gup wrote: > Ok. I will try with your suggestions but I see this erro

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Vin Gup
Ok. I will try with your suggestions but I see this error even with batches with no row key duplicates. I still suspect that client is timing out and retrying too often and needs to back off as the region server is heavily loaded. -Vinay > On Sep 17, 2014, at 3:14 AM, Anoop John wrote: > >

Data migration from hbase 0.90 to hbase 0.98

2014-09-17 Thread Y. Dong
Hello there, I’m wondering if anyone knows how to move tables in hbase 0.90 to hbase 0.98? I did export on hbase 0.90 and import to hbase 0.98. However it throws exception like java.lang.Exception: java.io.IOException: keyvalues=NONE read 2 bytes, should read 143121 at org.apache.hado

Re: Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Shahab Yunus
Split is an async operation. When you call it, and the call returns, it does not mean that the region has been created yet. So either you wait for a while (using Thread.sleep) or check for the number of regions in a loop and until they have increased to the value you want and then access the regio
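The polling approach described in this reply could be sketched roughly as follows. This is only an illustration, not code from the thread: the class SplitWaiter, the method waitForRegionCount and its parameters are hypothetical names, and the region count is passed in as a plain IntSupplier so the sketch does not depend on any HBase class. In a real client the supplier would presumably wrap whatever call returns the table's current regions in your HBase version.

```java
import java.util.function.IntSupplier;

public class SplitWaiter {

    /**
     * Polls regionCount until it reports at least `expected` regions,
     * or until timeoutMs has elapsed. Returns true if the expected count
     * was seen in time, false on timeout or if the thread is interrupted.
     * All names here are hypothetical; this is not an HBase API.
     */
    public static boolean waitForRegionCount(IntSupplier regionCount, int expected,
                                             long timeoutMs, long pollMs) {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            if (regionCount.getAsInt() >= expected) {
                return true;            // the split has become visible
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;           // gave up waiting
            }
            try {
                Thread.sleep(pollMs);   // back off before asking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Fake region counter: reports 1 region twice, then 2 regions.
        int[] calls = {0};
        IntSupplier fake = () -> ++calls[0] >= 3 ? 2 : 1;
        boolean visible = waitForRegionCount(fake, 2, 1000, 10);
        System.out.println("split visible: " + visible);
    }
}
```

Compared with a fixed Thread.sleep, a bounded poll returns as soon as the new regions appear and still gives the caller a definite timeout.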

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Anoop John
This is an improvement (rather an issue fix) done from 0.98+ versions. This is for non-idempotent operations (like increment) which HBase clients might retry on failure. Such retry can give wrong results (possibly incrementing 2 times for one increment op) Can you change your application side cod
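To illustrate why a retried increment is a problem and how a nonce lets the server notice it, here is a minimal sketch. It is not HBase's implementation: NonceCounter and its methods are hypothetical, and the real server-side logic (the ServerNonceManager seen in the region server logs elsewhere in this thread) is more involved. The assumption is simply that each logical operation carries one nonce and that a retry reuses it.

```java
import java.util.HashSet;
import java.util.Set;

public class NonceCounter {
    // Nonces of operations that have already been applied.
    private final Set<Long> seenNonces = new HashSet<>();
    private long value;

    /**
     * Applies the increment once per nonce. A second call with the same
     * nonce (for example a client retry after a timeout) is rejected
     * instead of being applied again.
     */
    public synchronized long increment(long nonce, long delta) {
        if (!seenNonces.add(nonce)) {
            throw new IllegalStateException("conflict: nonce " + nonce + " already used");
        }
        value += delta;
        return value;
    }

    public synchronized long get() {
        return value;
    }

    public static void main(String[] args) {
        NonceCounter counter = new NonceCounter();
        counter.increment(42L, 1);
        try {
            counter.increment(42L, 1);   // simulated retry of the same operation
        } catch (IllegalStateException e) {
            System.out.println("retry rejected: " + e.getMessage());
        }
        System.out.println("value = " + counter.get());
    }
}
```

Without the nonce check, the retried call would silently increment the counter a second time, which is the wrong-result case described above.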

Error during HBaseAdmin.split: Exception: org.apache.hadoop.hbase.NotServingRegionException, What does that mean?

2014-09-17 Thread Jianshi Huang
I constantly get the following errors when I tried to add splits to a table. org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region grapple_vertices,cust|rval#7eb7cffca280|1636500018299

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Vin Gup
Yes possibly. Why would that be a problem? Earlier client (0.94) didn't complain about it. Thanks, -Vinay > On Sep 17, 2014, at 12:16 AM, Anoop John wrote: > > You have more than one increment for the same key in one batch? > > On Wed, Sep 17, 2014 at 12:33 PM, Vinay Gupta > wrote: > >> Als

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Anoop John
You have more than one increment for the same key in one batch? On Wed, Sep 17, 2014 at 12:33 PM, Vinay Gupta wrote: > Also the regionserver keeps throwing exceptions like > > 2014-09-17 06:56:07,151 DEBUG [RpcServer.handler=10,port=60020] > regionserver.ServerNonceManager: Conflict detected by

Re: HBase 0.98.1 batch Increment throws OperationConflictException

2014-09-17 Thread Vinay Gupta
Also the regionserver keeps throwing exceptions like 2014-09-17 06:56:07,151 DEBUG [RpcServer.handler=10,port=60020] regionserver.ServerNonceManager: Conflict detected by nonce: [43871278468062569 89:2793719453824938427], [state 0, hasWait false, activity 06:55:41.091] 2014-09-17 06:56:07,151 DE