Re: client Table instance, confused with autoFlush

2015-05-13 Thread Solomon Duskis
BufferedMutator is the preferred alternative to autoflush starting in
HBase 1.0.  Get a connection via ConnectionFactory, then call
connection.getBufferedMutator(tableName).  It's the same functionality as
autoflush under the covers.
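
A minimal sketch of that pattern, assuming HBase 1.0+ on the classpath (the
table, column family, and row keys below are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedMutatorExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         BufferedMutator mutator =
             connection.getBufferedMutator(TableName.valueOf("my_table"))) {
      for (int i = 0; i < 1000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
        mutator.mutate(put);  // buffered client-side, like autoflush=false
      }
      mutator.flush();        // send anything still sitting in the buffer
    }                         // close() also flushes
  }
}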

On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com wrote:

 Please take a look at https://issues.apache.org/jira/browse/HBASE-12728

 Cheers

 On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak serega.shey...@gmail.com
 wrote:

  Hi, in 0.94 we could use autoFlush method for HTable.
  Now HTable shouldn't be used, we refactoring code for Table
 
  Here is a note:
  http://hbase.apache.org/book.html#perf.hbase.client.autoflush
  When performing a lot of Puts, make sure that setAutoFlush is set to
 false
  on your Table
  
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
  
   instance
 
  What is the right way to set autoFlush for Table instance? Can't find
  method/example to do this?
 



Re: client Table instance, confused with autoFlush

2015-05-13 Thread Ted Yu
Please take a look at https://issues.apache.org/jira/browse/HBASE-12728

Cheers

On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak serega.shey...@gmail.com
wrote:

 Hi, in 0.94 we could use autoFlush method for HTable.
 Now HTable shouldn't be used, we refactoring code for Table

 Here is a note:
 http://hbase.apache.org/book.html#perf.hbase.client.autoflush
 When performing a lot of Puts, make sure that setAutoFlush is set to false
 on your Table
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
 
  instance

 What is the right way to set autoFlush for Table instance? Can't find
 method/example to do this?



client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Hi, in 0.94 we could use the autoFlush method on HTable.
Now HTable shouldn't be used, so we are refactoring our code to use Table.

Here is a note from
http://hbase.apache.org/book.html#perf.hbase.client.autoflush:
"When performing a lot of Puts, make sure that setAutoFlush is set to false
on your Table
(http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html)
instance."

What is the right way to set autoFlush on a Table instance? I can't find a
method or example for this.


Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
We are using CDH 5.4; it's on the 0.98 version

2015-05-13 16:49 GMT+03:00 Solomon Duskis sdus...@gmail.com:

 BufferedMutator is the preferred alternative for autoflush starting in
 HBase 1.0.  Get a connection via ConnectionFactory, then
 connection.getBufferedMutator(tableName).  It's the same functionality as
 autoflush under the covers.

 On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com wrote:

  Please take a look at https://issues.apache.org/jira/browse/HBASE-12728
 
  Cheers
 
  On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak 
 serega.shey...@gmail.com
  wrote:
 
   Hi, in 0.94 we could use autoFlush method for HTable.
   Now HTable shouldn't be used, we refactoring code for Table
  
   Here is a note:
   http://hbase.apache.org/book.html#perf.hbase.client.autoflush
   When performing a lot of Puts, make sure that setAutoFlush is set to
  false
   on your Table
   
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
   
instance
  
   What is the right way to set autoFlush for Table instance? Can't find
   method/example to do this?
  
 



Re: client Table instance, confused with autoFlush

2015-05-13 Thread Solomon Duskis
The docs you referenced are for 1.0.  Table and BufferedMutator were
introduced in 1.0.  In 0.98, you should continue using HTable and
autoflush.
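
For 0.98, a minimal sketch of the HTable/autoflush pattern the book describes
(table and column family names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AutoFlushExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "my_table");
    try {
      table.setAutoFlush(false);                  // buffer Puts in the client write buffer
      table.setWriteBufferSize(2 * 1024 * 1024);  // optional: 2 MB buffer
      for (int i = 0; i < 1000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
        table.put(put);
      }
      table.flushCommits();                       // send whatever is still buffered
    } finally {
      table.close();                              // close() also flushes
    }
  }
}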

On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak serega.shey...@gmail.com
wrote:

 We are using CDH 5.4, it's on .0.98 version

 2015-05-13 16:49 GMT+03:00 Solomon Duskis sdus...@gmail.com:

  BufferedMutator is the preferred alternative for autoflush starting in
  HBase 1.0.  Get a connection via ConnectionFactory, then
  connection.getBufferedMutator(tableName).  It's the same functionality as
  autoflush under the covers.
 
  On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com wrote:
 
   Please take a look at
 https://issues.apache.org/jira/browse/HBASE-12728
  
   Cheers
  
   On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak 
  serega.shey...@gmail.com
   wrote:
  
Hi, in 0.94 we could use autoFlush method for HTable.
Now HTable shouldn't be used, we refactoring code for Table
   
Here is a note:
http://hbase.apache.org/book.html#perf.hbase.client.autoflush
When performing a lot of Puts, make sure that setAutoFlush is set to
   false
on your Table

  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html

 instance
   
What is the right way to set autoFlush for Table instance? Can't find
method/example to do this?
   
  
 



Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
But HTable is deprecated in 0.98 ...?

2015-05-13 17:35 GMT+03:00 Solomon Duskis sdus...@gmail.com:

 The docs you referenced are for 1.0.  Table and BufferedMutator were
 introduced in 1.0.  In 0.98, you should continue using HTable and
 autoflush.

 On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak serega.shey...@gmail.com
 wrote:

  We are using CDH 5.4, it's on .0.98 version
 
  2015-05-13 16:49 GMT+03:00 Solomon Duskis sdus...@gmail.com:
 
   BufferedMutator is the preferred alternative for autoflush starting in
   HBase 1.0.  Get a connection via ConnectionFactory, then
   connection.getBufferedMutator(tableName).  It's the same functionality
 as
   autoflush under the covers.
  
   On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com wrote:
  
Please take a look at
  https://issues.apache.org/jira/browse/HBASE-12728
   
Cheers
   
On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak 
   serega.shey...@gmail.com
wrote:
   
 Hi, in 0.94 we could use autoFlush method for HTable.
 Now HTable shouldn't be used, we refactoring code for Table

 Here is a note:
 http://hbase.apache.org/book.html#perf.hbase.client.autoflush
 When performing a lot of Puts, make sure that setAutoFlush is set
 to
false
 on your Table
 
   
  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
 
  instance

 What is the right way to set autoFlush for Table instance? Can't
 find
 method/example to do this?

   
  
 



Re: client Table instance, confused with autoFlush

2015-05-13 Thread Shahab Yunus
Until you move to HBase 1.*, you should use HTableInterface. The autoFlush
methods and semantics are, as far as I understand, the same, so you should
not have a problem.
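
A compact sketch of that route, which also avoids the deprecated HTable
constructors (the table name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HTableInterfaceAutoFlush {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HConnection connection = HConnectionManager.createConnection(conf);
    HTableInterface table = connection.getTable("my_table");
    try {
      table.setAutoFlush(false);  // same buffering semantics as on HTable
      Put put = new Put(Bytes.toBytes("row-1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      table.put(put);
      table.flushCommits();
    } finally {
      table.close();
      connection.close();
    }
  }
}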

Regards,
Shahab

On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak serega.shey...@gmail.com
wrote:

 But HTable is deprecated in 0.98 ...?

 2015-05-13 17:35 GMT+03:00 Solomon Duskis sdus...@gmail.com:

  The docs you referenced are for 1.0.  Table and BufferedMutator were
  introduced in 1.0.  In 0.98, you should continue using HTable and
  autoflush.
 
  On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak 
 serega.shey...@gmail.com
  wrote:
 
   We are using CDH 5.4, it's on .0.98 version
  
   2015-05-13 16:49 GMT+03:00 Solomon Duskis sdus...@gmail.com:
  
BufferedMutator is the preferred alternative for autoflush starting
 in
HBase 1.0.  Get a connection via ConnectionFactory, then
connection.getBufferedMutator(tableName).  It's the same
 functionality
  as
autoflush under the covers.
   
On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com wrote:
   
 Please take a look at
   https://issues.apache.org/jira/browse/HBASE-12728

 Cheers

 On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak 
serega.shey...@gmail.com
 wrote:

  Hi, in 0.94 we could use autoFlush method for HTable.
  Now HTable shouldn't be used, we refactoring code for Table
 
  Here is a note:
  http://hbase.apache.org/book.html#perf.hbase.client.autoflush
  When performing a lot of Puts, make sure that setAutoFlush is
 set
  to
 false
  on your Table
  

   
  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
  
   instance
 
  What is the right way to set autoFlush for Table instance? Can't
  find
  method/example to do this?
 

   
  
 



Re: Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Here is my hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///usr/local/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/hbase/zookeeperdata</value>
  </property>
</configuration>


And hive-site.xml

<configuration>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/guava-11.0.2.jar,file:///usr/local/hbase/lib/hbase-client-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-common-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-protocol-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-server-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-shell-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-thrift-0.98.2-hadoop2.jar</value>
  </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/usr/local/hive/mydir</value>
    <description>Scratch space for Hive jobs</description>
  </property>

</configuration>



Hadoop classpath


/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*::/usr/local/hbase/conf/hbase-site.xml:/contrib/capacity-scheduler/*.jar


On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed ibrar.ah...@gmail.com wrote:

 Hi,

 I am creating a table using hive and getting this error.

 [127.0.0.1:1] hive CREATE TABLE hbase_table_1(key int, value string)
STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (hbase.columns.mapping =
 :key,cf1:val)
TBLPROPERTIES (hbase.table.name = xyz);



 [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
 Can't get the locations
 at
 org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
 at
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
 at
 org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
 at
 org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
 at
 org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
 at
 org.apache.hadoop.hbase.client.ClientScanner.init(ClientScanner.java:134)
 at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
 at com.sun.proxy.$Proxy7.createTable(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
 at
 org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
 

Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
This issue looks like it comes from some missing settings. What did you do
for your Hive/HBase integration? Can you give some information about your
cluster?

BTW, in [1] someone had the same issue. Maybe it will help you.

[1] 
http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3cce01cda1.9221%25sanjay.subraman...@wizecommerce.com%3E

2015-05-13 22:28 GMT+03:00 Ibrar Ahmed ibrar.ah...@gmail.com:
 Hi,

 I am creating a table using hive and getting this error.

 [127.0.0.1:1] hive CREATE TABLE hbase_table_1(key int, value string)
STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (hbase.columns.mapping =
 :key,cf1:val)
TBLPROPERTIES (hbase.table.name = xyz);



 [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
 Can't get the locations
 at
 org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
 at
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
 at
 org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
 at
 org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
 at
 org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
 at
 org.apache.hadoop.hbase.client.ClientScanner.init(ClientScanner.java:134)
 at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
 at com.sun.proxy.$Proxy7.createTable(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
 at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
 at
 org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
 at
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
 at
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 )


 Any help/clue can help.



-- 
Talat UYARER
Websitesi: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304


Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Esteban Gutierrez
rahul,

You might want to look into your MR counters too: if your tasks are spilling
too much to disk, or the shuffle phase is too large, that might cause lots of
contention. Also look into the OS/drive settings (write cache off or
irqbalance off). As Michael said, high CPU might not be a bad thing; it
depends on what the CPU cycles are being used for, e.g. user time vs. system
time.

cheers,
esteban.


--
Cloudera, Inc.


On Wed, May 13, 2015 at 10:52 AM, Michael Segel michael_se...@hotmail.com
wrote:

 So …

 First, you’re wasting money on 10K drives. But that could be your
 company’s standard.

 Yes, you’re going to see red.

 24 / 12 , so is that 12 physical cores  or 24 physical cores?

 I suspect those are dual chipped w 6 physical cores per chip.
 That’s 12 cores to 12 disks, which is ok.

 The 40 or 20 cores to 12 drives… that’s going to cause you trouble.

 Note: Seeing high levels of CPU may not be a bad thing.

 7-8 mappers per node?  Not a lot of work for the number of cores…



  On May 13, 2015, at 12:31 PM, rahul malviya malviyarahul2...@gmail.com
 wrote:
 
  *How many mapper/reducers are running per node for this job?*
  I am running 7-8 mappers per node. The spike is seen in mapper phase so
 no
  reducers where running at that point of time.
 
  *Also how many mappers are running as data local mappers?*
  How to determine this ?
 
 
  * You load/data equally distributed?*
  Yes as we use presplit hash keys in our hbase cluster and data is pretty
  evenly distributed.
 
  Thanks,
  Rahul
 
 
  On Wed, May 13, 2015 at 10:25 AM, Anil Gupta anilgupt...@gmail.com
 wrote:
 
  How many mapper/reducers are running per node for this job?
  Also how many mappers are running as data local mappers?
  You load/data equally distributed?
 
  Your disk, cpu ratio looks ok.
 
  Sent from my iPhone
 
  On May 13, 2015, at 10:12 AM, rahul malviya 
 malviyarahul2...@gmail.com
  wrote:
 
  *The High CPU may be WAIT IOs,  which would mean that you’re cpu is
  waiting
  for reads from the local disks.*
 
  Yes I think thats what is going on but I am trying to understand why it
  happens only in case of snapshot MR but if I run the same job without
  using
  snapshot everything is normal. What is the difference in snapshot
 version
  which can cause such a spike ? I looking through the code for snapshot
  version if I can find something.
 
  cores / disks == 24 / 12 or 40 / 12.
 
  We are using 10K sata drives on our datanodes.
 
  Rahul
 
  On Wed, May 13, 2015 at 10:00 AM, Michael Segel 
  michael_se...@hotmail.com
  wrote:
 
  Without knowing your exact configuration…
 
  The High CPU may be WAIT IOs,  which would mean that you’re cpu is
  waiting
  for reads from the local disks.
 
  What’s the ratio of cores (physical) to disks?
  What type of disks are you using?
 
  That’s going to be the most likely culprit.
  On May 13, 2015, at 11:41 AM, rahul malviya 
  malviyarahul2...@gmail.com
  wrote:
 
  Yes.
 
  On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com
 wrote:
 
  Have you enabled short circuit read ?
 
  Cheers
 
  On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
  malviyarahul2...@gmail.com
  wrote:
 
  Hi,
 
  I have recently started running MR on hbase snapshots but when the
 MR
  is
  running there is pretty high CPU usage on datanodes and I start
  seeing
  IO
  wait message in datanode logs and as soon I kill the MR on Snapshot
  everything come back to normal.
 
  What could be causing this ?
 
  I am running cdh5.2.0 distribution.
 
  Thanks,
  Rahul
 
 
 




HBase MapReduce in Kerberized cluster

2015-05-13 Thread Edward C. Skoviak
I'm attempting to write a Crunch pipeline to read various rows from a table
in HBase and then do processing on these results. I am doing this from a
cluster deployed using CDH 5.3.2 running Kerberos and YARN.

I was hoping to get an answer on what is considered the best approach to
authenticating to HBase within the MapReduce task execution context. I've
perused various posts and documentation, and it seems that TokenUtil was, at
least at one point, the right approach; however, I notice it has now been
moved into the hbase-server package (instead of hbase-client). Is there a
better way to retrieve and pass an HBase delegation token to the MR job
launched by my pipeline?

Thanks,
Ed Skoviak
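
A common approach (sketched here; how it plugs into a Crunch pipeline is left
as an assumption) is to let TableMapReduceUtil obtain the delegation token on
the client that configures the job, so the map tasks don't need Kerberos
credentials of their own; it handles the TokenUtil work internally:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class SecureHBaseJobSetup {
  public static Job createJob() throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "hbase-read-job");
    // Adds an HBase delegation token to job.getCredentials() before submission,
    // so tasks can authenticate to HBase without their own Kerberos ticket.
    TableMapReduceUtil.initCredentials(job);
    return job;
  }
}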


Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread ramkrishna vasudevan
I think related to this would be HBASE-12790, where we would do round-robin
scheduling and thus help the shorter scans also get a time slice in the
execution cycle.

On Thu, May 14, 2015 at 7:12 AM, Matteo Bertozzi theo.berto...@gmail.com
wrote:

 @nick we have already something like that, which is HBASE-10993
 and it is basically reordering requests based on how many scan.next you
 did.
 (see the picture)
 http://blog.cloudera.com/wp-content/uploads/2014/11/hbase-multi-f2.png
 the problem is that we can't eject requests in execution and we are not
 heavy enough on removing request from the queue and send a retry to the
 client in case someone with more priority is in.

 Matteo


 On Wed, May 13, 2015 at 6:38 PM, Nick Dimiduk ndimi...@gmail.com wrote:

  I guess what I'm thinking of is more about scheduling than
  quota/throttling. I don't want my online requests to sit in a queue
 behind
  MR requests while the MR work build up to it's quota amount. I want a
  scheduler to do time-slicing of operations, with preferential treatment
  given to online work over long scan (analytical) work. For example, all
  scan RPC's known to cover lots of Cells get de-prioritized vs gets
 and
  short scans. Maybe this is synthesized with an RPC annotation marking it
 as
  long vs short -- MR scans are marked long. I'm not sure, and I need
  to look more closely at recent scan improvements. IIRC, there's a
 heartbeat
  now, which maybe is a general mechanism allowing for long operations to
 not
  stomp over short operations. Heartbeat implies the long-running scan is
  coming up for air from time to time, allowing itself to be interrupted
 and
  defer to higher priority work. This isn't preemption, but does allow for
 an
  upper bound on how long the next queued task waits.
 
  On Wed, May 13, 2015 at 6:11 PM, Matteo Bertozzi 
 theo.berto...@gmail.com
  wrote:
 
   @nick what would you like to have? a match on a Job ID or something
 like
   that?
   currently only user/table/namespace are supported,
   but group support can be easily added.
   not sure about a job-id or job-name since we don't have that info on
 the
   scan.
  
   On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk ndimi...@gmail.com
  wrote:
  
Sorry. Yeah, sure, I can ask over there.
   
The throttle was set by user in these tests.  You cannot directly
 throttle a specific job, but do have the option to set the throttle
 for a table or a namespace.  That might be sufficient for you to
 achieve your objective (unless those jobs are run by one user and
 access the same table.)
   
   
Maybe running as different users is the key, but this seems like a
 very
important use-case to support -- folks doing aggregate analysis
concurrently on an online table.
   
On Wed, May 13, 2015 at 5:53 PM, Stack st...@duboce.net wrote:
   
 Should we add in your comments on the blog Govind: i.e. the answers
  to
 Nicks' questions?
 St.Ack

 On Wed, May 13, 2015 at 5:48 PM, Govind Kamat gka...@cloudera.com
 
wrote:

This is a great demonstration of these new features, thanks
 for
  pointing it
out Stack.
   
I'm curious: what percentile latencies are this reported? Does
  the
non-throttled user see significant latency improvements in the
  95,
 99pct
when the competing, scanning users are throttled? MB/s and
 req/s
   are
managed at the region level? Region server level? Aggregate?
 
  The latencies reported in the post are average latencies.
 
  Yes, the non-throttled user sees an across-the-board improvement
 in
  the 95th and 99th percentiles, in addition to the improvement in
  average latency.  The extent of improvement is significant as
 well
   but
  varies with the throttle pressure, just as in the case of the
  average
  latencies.
 
  The total throughput numbers (req/s) are aggregate numbers
 reported
   by
  the YCSB client.
 
These throttle points are by user? Is there a way for us to
 say
   all
 MR
jobs are lower priority than online queries?
   
 
  The throttle was set by user in these tests.  You cannot directly
  throttle a specific job, but do have the option to set the
 throttle
  for a table or a namespace.  That might be sufficient for you to
  achieve your objective (unless those jobs are run by one user and
  access the same table.)
 
  Govind
 
 
Thanks,
Nick
   
On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net
  wrote:
   
 .. by our Govind.

 See here:

 

   
  
 
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature

 St.Ack

 

   
  
 



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread anil gupta
Inline.

On Wed, May 13, 2015 at 10:31 AM, rahul malviya malviyarahul2...@gmail.com
wrote:

 *How many mapper/reducers are running per node for this job?*
 I am running 7-8 mappers per node. The spike is seen in mapper phase so no
 reducers where running at that point of time.

 *Also how many mappers are running as data local mappers?*
 How to determine this ?

On the counters web page of your job, look for the "Data-local map tasks"
counter.



 * You load/data equally distributed?*
 Yes as we use presplit hash keys in our hbase cluster and data is pretty
 evenly distributed.

 Thanks,
 Rahul


 On Wed, May 13, 2015 at 10:25 AM, Anil Gupta anilgupt...@gmail.com
 wrote:

  How many mapper/reducers are running per node for this job?
  Also how many mappers are running as data local mappers?
  You load/data equally distributed?
 
  Your disk, cpu ratio looks ok.
 
  Sent from my iPhone
 
   On May 13, 2015, at 10:12 AM, rahul malviya 
 malviyarahul2...@gmail.com
  wrote:
  
   *The High CPU may be WAIT IOs,  which would mean that you’re cpu is
  waiting
   for reads from the local disks.*
  
   Yes I think thats what is going on but I am trying to understand why it
   happens only in case of snapshot MR but if I run the same job without
  using
   snapshot everything is normal. What is the difference in snapshot
 version
   which can cause such a spike ? I looking through the code for snapshot
   version if I can find something.
  
   cores / disks == 24 / 12 or 40 / 12.
  
   We are using 10K sata drives on our datanodes.
  
   Rahul
  
   On Wed, May 13, 2015 at 10:00 AM, Michael Segel 
  michael_se...@hotmail.com
   wrote:
  
   Without knowing your exact configuration…
  
   The High CPU may be WAIT IOs,  which would mean that you’re cpu is
  waiting
   for reads from the local disks.
  
   What’s the ratio of cores (physical) to disks?
   What type of disks are you using?
  
   That’s going to be the most likely culprit.
   On May 13, 2015, at 11:41 AM, rahul malviya 
  malviyarahul2...@gmail.com
   wrote:
  
   Yes.
  
   On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com
 wrote:
  
   Have you enabled short circuit read ?
  
   Cheers
  
   On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
   malviyarahul2...@gmail.com
   wrote:
  
   Hi,
  
   I have recently started running MR on hbase snapshots but when the
 MR
   is
   running there is pretty high CPU usage on datanodes and I start
  seeing
   IO
   wait message in datanode logs and as soon I kill the MR on Snapshot
   everything come back to normal.
  
   What could be causing this ?
  
   I am running cdh5.2.0 distribution.
  
   Thanks,
   Rahul
  
  
 




-- 
Thanks  Regards,
Anil Gupta


Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Hi,

I am creating a table using hive and getting this error.

[127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    TBLPROPERTIES ("hbase.table.name" = "xyz");



[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
Can't get the locations
at
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at
org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
at
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
at
org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
at
org.apache.hadoop.hbase.client.ClientScanner.init(ClientScanner.java:134)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
at
org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
at
org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
at
org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
at
org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
)


Any help/clue can help.


Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
Your ZooKeeper is managed by HBase. Could you check your
hbase.zookeeper.quorum setting? It should be the same as the ZooKeeper
quorum that HBase uses.

Talat

2015-05-13 23:03 GMT+03:00 Ibrar Ahmed ibrar.ah...@gmail.com:
 Here is my hbase-site.xml

 configuration
   property
 namehbase.rootdir/name
 valuefile:///usr/local/hbase/value
   /property
   property
 namehbase.zookeeper.property.dataDir/name
 value/usr/local/hbase/zookeeperdata/value
   /property
 /configuration


 And hive-site.xml

 configuration
  property
 namehive.aux.jars.path/name

 valuefile:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/guava-11.0.2.jar,file:///usr/local/hbase/lib/hbase-client-0.98.2-
 hadoop2.jar,file:///usr/local/hbase/lib/hbase-common-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-protocol-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-server-0.98.2-hadoop2.jar,file:///usr
 /local/hbase/lib/hbase-shell-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-thrift-0.98.2-hadoop2.jar/value
   /property

 property
namehbase.zookeeper.quorum/name
valuezk1,zk2,zk3/value
 /property


 property
 namehive.exec.scratchdir/name
 value/usr/local/hive/mydir/value
 descriptionScratch space for Hive jobs/description
 /property

 /configuration



 Hadoop classpath


 /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*::/usr/local/hbase/conf/hbase-site.xml:/contrib/capacity-scheduler/*.jar


 On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed ibrar.ah...@gmail.com wrote:

 Hi,

 I am creating a table using hive and getting this error.

 [127.0.0.1:1] hive CREATE TABLE hbase_table_1(key int, value string)
STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (hbase.columns.mapping =
 :key,cf1:val)
TBLPROPERTIES (hbase.table.name = xyz);



 [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
 Can't get the locations
 at
 org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
 at
 org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
 at
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
 at
 org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
 at
 org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
 at
 org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
 at
 org.apache.hadoop.hbase.client.ClientScanner.init(ClientScanner.java:134)
 at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
 at
 org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
 at com.sun.proxy.$Proxy7.createTable(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
 at
 org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at 

Troubles with HBase 1.1.0 RC2

2015-05-13 Thread James Estes
I saw the vote thread for RC2, so tried to build my project against it.

My build fails when I depend on 1.1.0. I created a bare bones project
to show the issue I'm running into:
https://github.com/housejester/hbase-deps-test

To be clear, it works in 1.0.0 (and I did add the repository).

Further, we have a coprocessor and when I stand up a 1.1.0 HBase and
call my endpoint, I get:

! Caused by: java.lang.NoSuchMethodError:
org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/HRegion;

The same coprocessor works under 1.0.0.

So, it looks like RegionCoprocessorEnvironment.getRegion() has been removed?

The Audience annotation is:
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC)
@InterfaceStability.Evolving

Since it is Evolving it is allowed to change in a breaking way. I'm
trying to think about how I migrate. I guess I deploy a new coproc
that uses whatever the new method is, and then in my client, detect at
runtime which HBase version I'm talking to and use that to determine
which coprocessor to hit?

Thanks,
James


Re: Troubles with HBase 1.1.0 RC2

2015-05-13 Thread Andrew Purtell
 So, it looks like RegionCoprocessorEnvironment.getRegion() has been
removed?

No, the signature has changed, basically s/HRegion/Region/. HRegion is an
internal, low level implementation type. Has always been. We have replaced
it with Region, an interface that contains a subset of HRegion we feel we
can support for coprocessor source and binary compatibility longer term.
This work was done on HBASE-12972 if you're curious to know more about it.
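
In practice the coprocessor-side change is usually just the declared type; a
sketch (assuming env is the RegionCoprocessorEnvironment handed to the
coprocessor):

import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.Region;

class RegionAccessExample {
  void useRegion(RegionCoprocessorEnvironment env) {
    // 1.0.x: HRegion region = env.getRegion();
    Region region = env.getRegion();  // 1.1.0: returns the Region interface
    byte[] startKey = region.getRegionInfo().getStartKey();  // same calls as before
  }
}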

 I guess I deploy a new coproc that uses whatever the new method is, and
then in my client, detect at runtime which HBase version I'm talking to and
use that to determine which coprocessor to hit?

Coprocessors are server side extensions. These API changes will require you
to modify the code you plan to deploy on the server. I don't think any
client side changes are needed. Unless your coprocessor implements an
Endpoint and _you_ are changing your RPC message formats, a 1.0.x client
shouldn't care whether it is talking to a 1.0.x server or a 1.1.x server,
running your coprocessor or not.



On Wed, May 13, 2015 at 3:00 PM, James Estes james.es...@gmail.com wrote:

 I saw the vote thread for RC2, so tried to build my project against it.

 My build fails when I depend on 1.1.0. I created a bare bones project
 to show the issue I'm running into:
 https://github.com/housejester/hbase-deps-test

 To be clear, it works in 1.0.0 (and I did add the repository).

 Further, we have a coprocessor and when I stand up a 1.1.0 HBase and
 call my endpoint, I get:

 ! Caused by: java.lang.NoSuchMethodError:

 org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/HRegion;

 The same coprocessor works under 1.0.0.

 So, it looks like RegionCoprocessorEnvironment.getRegion() has been
 removed?

 The Audience annotation is:
 @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC)
 @InterfaceStability.Evolving

 Since it is Evolving it is allowed to change in a breaking way. I'm
 trying to think about how I migrate. I guess I deploy a new coproc
 that uses whatever the new method is, and then in my client, detect at
 runtime which HBase version I'm talking to and use that to determine
 which coprocessor to hit?

 Thanks,
 James




-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)


Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
This is a great demonstration of these new features, thanks for pointing it
out Stack.

I'm curious: at what percentile are these latencies reported? Does the
non-throttled user see significant latency improvements in the 95th and 99th
percentiles when the competing, scanning users are throttled? Are MB/s and
req/s managed at the region level? Region server level? Aggregate?

These throttle points are by user? Is there a way for us to say all MR
jobs are lower priority than online queries?

Thanks,
Nick

On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net wrote:

 .. by our Govind.

 See here:
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature

 St.Ack



Re: Questions related to HBase general use

2015-05-13 Thread Nick Dimiduk
+ Swarnim, who's expert on HBase/Hive integration.

Yes, snapshots may be interesting for you. I believe Hive can access HBase
timestamps, exposed as a virtual column. It's assumed to be the same across
the whole row, however, not per cell.

On Sun, May 10, 2015 at 9:14 PM, Jerry He jerry...@gmail.com wrote:

 Hi, Yong

 You have a good understanding of the benefit of HBase already.
 Generally speaking, HBase is suitable for real time read/write to your big
 data set.
 Regarding the HBase performance evaluation tool, the 'read' test uses HBase
 'get'. For 1m rows, the test would issue 1m 'get' calls (and RPCs) to the
 server. The 'scan' test scans the table and transfers the rows to the client
 in batches (e.g. 100 rows at a time), which takes a shorter time to complete
 for the same number of rows.
 The Hive/HBase integration, as you said, needs more consideration.
 1) Performance. Hive accesses HBase via the HBase client API, which involves
 going to the HBase server for all data access. This will slow things down.
 There are a couple of things you can explore, e.g. the Hive/HBase snapshot
 integration, which would provide direct access to HBase HFiles.
 2) In your email, you are interested in HBase's capability of storing
 multiple versions of data. You need to consider whether Hive supports this
 HBase feature, i.e. whether it gives you access to multiple versions. As far
 as I can remember, it does not fully.

 Jerry


 On Thu, May 7, 2015 at 6:18 PM, java8964 java8...@hotmail.com wrote:

  Hi,
  I am kind of new to HBase. Currently our production run IBM BigInsight
 V3,
  comes with Hadoop 2.2 and HBase 0.96.0.
  We are mostly using HDFS and Hive/Pig for our BigData project, it works
  very good for our big datasets. Right now, we have a one dataset needs to
  be loaded from Mysql, about 100G, and will have about Gs change daily.
 This
  is a very important slow change dimension data, we like to sync between
  Mysql and BigData platform.
  I am thinking of using HBase to store it, instead of refreshing the whole
  dataset in HDFS, due to:
  1) HBase makes merging the changes very easy.
  2) HBase could store all the changes in the history, as a function out of
  the box. We will replicate all the changes from the binlog level from Mysql,
  and we could keep all changes in HBase (or a long history), which can then
  give us some insight that cannot be done easily in HDFS.
  3) HBase could give us the benefit of fast access to the data by key, for
  some cases.
  4) HBase is available out of the box.
  What I am not sure is the Hive/HBase integration. Hive is the top tool in
  our environment. If one dataset stored in Hbase (even only about 100G as
  now), the join between it with the other Big datasets in HDFS worries
 me. I
  read quite some information about Hive/HBase integration, and feel that
 it
  is not really mature, as not too many usage cases I can find online,
  especially on performance. There are quite some JIRAs related to make
 Hive
  utilize the HBase for performance in MR job are still pending.
  I want to know other people experience to use HBase in this way. I
  understand HBase is not designed as a storage system for Data Warehouse
  component or analytics engine. But the benefits to use HBase in this case
  still attractive me. If my use cases of HBase is mostly read or full scan
  the data, how bad it is compared to HDFS in the same cluster? 3x? 5x?
  To help me understand the read throughput of HBase, I used the HBase
  performance evaluation tool, but the output is quite confusing. I have 2
  clusters: one with 5 nodes (3 slaves), all running on VMs (each with 24G +
  4 cores, so the cluster has 12 mappers + 6 reducers), and another real
  cluster with 5 nodes (3 slaves) with 64G + 24 cores (48 mapper slots + 24
  reducer slots). Below is the result of running sequentialRead 3 on the
  better cluster:
  15/05/07 17:26:50 INFO mapred.JobClient: Counters: 30
  15/05/07 17:26:50 INFO mapred.JobClient:   File System Counters
  15/05/07 17:26:50 INFO mapred.JobClient: FILE: BYTES_READ=546
  15/05/07 17:26:50 INFO mapred.JobClient: FILE: BYTES_WRITTEN=7425074
  15/05/07 17:26:50 INFO mapred.JobClient: HDFS: BYTES_READ=2700
  15/05/07 17:26:50 INFO mapred.JobClient: HDFS: BYTES_WRITTEN=405
  15/05/07 17:26:50 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.JobCounter
  15/05/07 17:26:50 INFO mapred.JobClient: TOTAL_LAUNCHED_MAPS=30
  15/05/07 17:26:50 INFO mapred.JobClient: TOTAL_LAUNCHED_REDUCES=1
  15/05/07 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=2905167
  15/05/07 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=11340
  15/05/07 17:26:50 INFO mapred.JobClient: FALLOW_SLOTS_MILLIS_MAPS=0
  15/05/07 17:26:50 INFO mapred.JobClient: FALLOW_SLOTS_MILLIS_REDUCES=0
  15/05/07 17:26:50 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.TaskCounter
  15/05/07 17:26:50 INFO mapred.JobClient: MAP_INPUT_RECORDS=30
  15/05/07 17:26:50 INFO mapred.JobClient: 

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Ok, thanks!

2015-05-13 18:14 GMT+03:00 Shahab Yunus shahab.yu...@gmail.com:

 Until you move to HBase 1.*, you should use HTableInterface. And the
 autoFlush methods and semantics, as far as I understand are, same so you
 should not have problem.

 Regards,
 Shahab

 On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak serega.shey...@gmail.com
 
 wrote:

  But HTable is deprecated in 0.98 ...?
 
  2015-05-13 17:35 GMT+03:00 Solomon Duskis sdus...@gmail.com:
 
   The docs you referenced are for 1.0.  Table and BufferedMutator were
   introduced in 1.0.  In 0.98, you should continue using HTable and
   autoflush.
  
   On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak 
  serega.shey...@gmail.com
   wrote:
  
We are using CDH 5.4, it's on .0.98 version
   
2015-05-13 16:49 GMT+03:00 Solomon Duskis sdus...@gmail.com:
   
 BufferedMutator is the preferred alternative for autoflush starting
  in
 HBase 1.0.  Get a connection via ConnectionFactory, then
 connection.getBufferedMutator(tableName).  It's the same
  functionality
   as
 autoflush under the covers.

 On Wed, May 13, 2015 at 9:41 AM, Ted Yu yuzhih...@gmail.com
 wrote:

  Please take a look at
https://issues.apache.org/jira/browse/HBASE-12728
 
  Cheers
 
  On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak 
 serega.shey...@gmail.com
  wrote:
 
   Hi, in 0.94 we could use autoFlush method for HTable.
   Now HTable shouldn't be used, we refactoring code for Table
  
   Here is a note:
   http://hbase.apache.org/book.html#perf.hbase.client.autoflush
   When performing a lot of Puts, make sure that setAutoFlush is
  set
   to
  false
   on your Table
   
 

   
  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
   
instance
  
   What is the right way to set autoFlush for Table instance?
 Can't
   find
   method/example to do this?
  
 

   
  
 



Re: Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Elliott Clark
On Wed, May 13, 2015 at 12:59 AM, David chen c77...@163.com wrote:

 -XX:MaxGCPauseMillis=6000


With this line you're basically telling java to never garbage collect. Can
you try lowering that to something closer to the jvm default and see if you
have better stability?


MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Hi,

I have recently started running MR on HBase snapshots, but when the MR is
running there is pretty high CPU usage on the datanodes, and I start seeing IO
wait messages in the datanode logs. As soon as I kill the MR on the snapshot,
everything comes back to normal.

What could be causing this ?

I am running cdh5.2.0 distribution.

Thanks,
Rahul
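
For context, a hedged sketch of how such a snapshot job is typically wired up
(the snapshot name, mapper, and restore dir below are placeholders). The
mappers read the snapshot's HFiles directly from HDFS rather than going
through the region servers, so the read I/O hits the datanode disks directly
(locally, when short-circuit reads are enabled):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {
  public static class NoOpMapper
      extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context) {
      // a real job would emit or aggregate something here
    }
  }

  public static Job createJob(Configuration conf) throws Exception {
    Job job = Job.getInstance(conf, "snapshot-scan");
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "my_snapshot",                       // snapshot name (placeholder)
        new Scan(),                          // scan applied over the snapshot files
        NoOpMapper.class,
        ImmutableBytesWritable.class,        // mapper output key class
        Result.class,                        // mapper output value class
        job,
        true,                                // ship HBase dependency jars
        new Path("/tmp/snapshot_restore"));  // temp dir for restored snapshot refs
    job.setOutputFormatClass(NullOutputFormat.class);
    return job;
  }
}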


Re: Problems with Phoenix and HBase

2015-05-13 Thread Ted Yu
Putting dev@ to bcc

The question w.r.t. sqlline should be posted to the Phoenix user mailing list.

w.r.t. the question on hbase shell, can you give us more information?
which release of hbase you use
was there an exception from the hbase shell when no tables were returned?
a master log snippet from when the above happened

On Wed, May 13, 2015 at 3:30 AM, Asfare aman...@hotmail.com wrote:

 Hi!

 I'm very new in HBase and newer in Phoenix.
 My mainly problem currently is that:

 When I try to launch sqlline, the execution crashes and the client can't
 connect to the server. Moreover, I think HBase has another problem, because
 when I try to use the list command in the shell it sometimes recognizes my
 tables and sometimes not. Can you give me some advice or tips?



 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/Problems-with-Phoenix-and-HBase-tp4071362.html
 Sent from the HBase Developer mailing list archive at Nabble.com.



Re: Troubles with HBase 1.1.0 RC2

2015-05-13 Thread Enis Söztutar
Yeah, for coprocessors, what Andrew said. You have to make minor changes.

From your repo, I was able to build:

HW10676:hbase-deps-test$ ./build.sh

:compileJava

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase/1.1.0/hbase-1.1.0.pom

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-server/1.1.0/hbase-server-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-common/1.1.0/hbase-common-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-common/1.1.0/hbase-common-1.1.0-tests.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-protocol/1.1.0/hbase-protocol-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-procedure/1.1.0/hbase-procedure-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-client/1.1.0/hbase-client-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-prefix-tree/1.1.0/hbase-prefix-tree-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-hadoop-compat/1.1.0/hbase-hadoop-compat-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-hadoop2-compat/1.1.0/hbase-hadoop2-compat-1.1.0.jar

Download
https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase-annotations/1.1.0/hbase-annotations-1.1.0.jar

:processResources UP-TO-DATE

:classes

:jar

:assemble

:compileTestJava UP-TO-DATE

:processTestResources UP-TO-DATE

:testClasses UP-TO-DATE

:test UP-TO-DATE

:check UP-TO-DATE

:build


BUILD SUCCESSFUL


Total time: 1 mins 8.182 secs


Also you should not need to pass -Dcompat.module=hbase-hadoop2-compat.

Enis

On Wed, May 13, 2015 at 3:21 PM, Andrew Purtell apurt...@apache.org wrote:

  So, it looks like RegionCoprocessorEnvironment.getRegion() has been
 removed?

 No, the signature has changed, basically s/HRegion/Region/. HRegion is an
 internal, low level implementation type. Has always been. We have replaced
 it with Region, an interface that contains a subset of HRegion we feel we
 can support for coprocessor source and binary compatibility longer term.
 This work was done on HBASE-12972 if you're curious to know more about it.

  I guess I deploy a new coproc that uses whatever the new method is, and
 then in my client, detect at runtime which HBase version I'm talking to and
 use that to determine which coprocessor to hit?

 Coprocessors are server side extensions. These API changes will require you
 to modify the code you plan to deploy on the server. I don't think any
 client side changes are needed. Unless your coprocessor implements an
 Endpoint and _you_ are changing your RPC message formats, a 1.0.x client
 shouldn't care whether it is talking to a 1.0.x server or a 1.1.x server,
 running your coprocessor or not.



 On Wed, May 13, 2015 at 3:00 PM, James Estes james.es...@gmail.com
 wrote:

  I saw the vote thread for RC2, so tried to build my project against it.
 
  My build fails when I depend on 1.1.0. I created a bare bones project
  to show the issue I'm running into:
  https://github.com/housejester/hbase-deps-test
 
  To be clear, it works in 1.0.0 (and I did add the repository).
 
  Further, we have a coprocessor and when I stand up a 1.1.0 HBase and
  call my endpoint, I get:
 
  ! Caused by: java.lang.NoSuchMethodError:
 
 
 org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment.getRegion()Lorg/apache/hadoop/hbase/regionserver/HRegion;
 
  The same coprocessor works under 1.0.0.
 
  So, it looks like RegionCoprocessorEnvironment.getRegion() has been
  removed?
 
  The Audience annotation is:
  @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC)
  @InterfaceStability.Evolving
 
  Since it is Evolving it is allowed to change in a breaking way. I'm
  trying to think about how I migrate. I guess I deploy a new coproc
  that uses whatever the new method is, and then in my client, detect at
  runtime which HBase version I'm talking to and use that to determine
  which coprocessor to hit?
 
  Thanks,
  James
 



 --
 Best regards,

- Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)



Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Govind Kamat
  This is a great demonstration of these new features, thanks for pointing it
  out Stack.
  
  I'm curious: what percentile latencies are this reported? Does the
  non-throttled user see significant latency improvements in the 95, 99pct
  when the competing, scanning users are throttled? MB/s and req/s are
  managed at the region level? Region server level? Aggregate?

The latencies reported in the post are average latencies.  

Yes, the non-throttled user sees an across-the-board improvement in
the 95th and 99th percentiles, in addition to the improvement in
average latency.  The extent of improvement is significant as well but
varies with the throttle pressure, just as in the case of the average
latencies.

The total throughput numbers (req/s) are aggregate numbers reported by
the YCSB client.

  These throttle points are by user? Is there a way for us to say all MR
  jobs are lower priority than online queries?
  

The throttle was set by user in these tests.  You cannot directly
throttle a specific job, but do have the option to set the throttle
for a table or a namespace.  That might be sufficient for you to
achieve your objective (unless those jobs are run by one user and
access the same table.)
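
For reference, a rough sketch of setting such a throttle programmatically,
assuming the 1.1 quota API (QuotaSettingsFactory / Admin.setQuota); the user and
namespace names are made up, and hbase.quota.enabled must be true on the cluster:

import java.util.concurrent.TimeUnit;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.quotas.QuotaSettingsFactory;
import org.apache.hadoop.hbase.quotas.ThrottleType;

public class ThrottleSetup {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      // Limit a (hypothetical) batch user to 100 requests/sec cluster-wide.
      admin.setQuota(QuotaSettingsFactory.throttleUser(
          "mr_batch_user", ThrottleType.REQUEST_NUMBER, 100, TimeUnit.SECONDS));
      // Or cap the request volume on a namespace to 10 MB/sec.
      admin.setQuota(QuotaSettingsFactory.throttleNamespace(
          "analytics", ThrottleType.REQUEST_SIZE, 10L * 1024 * 1024, TimeUnit.SECONDS));
    }
  }
}

The same throttles can be set from the HBase shell with the set_quota command
described in the blog post.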

Govind


  Thanks,
  Nick
  
  On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net wrote:
  
   .. by our Govind.
  
   See here:
   https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
  
   St.Ack
  


Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Matteo Bertozzi
@nick what would you like to have? a match on a Job ID or something like
that?
currently only user/table/namespace are supported,
but group support can be easily added.
not sure about a job-id or job-name since we don't have that info on the
scan.

On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 Sorry. Yeah, sure, I can ask over there.

 The throttle was set by user in these tests.  You cannot directly
  throttle a specific job, but do have the option to set the throttle
  for a table or a namespace.  That might be sufficient for you to
  achieve your objective (unless those jobs are run by one user and
  access the same table.)


 Maybe running as different users is the key, but this seems like a very
 important use-case to support -- folks doing aggregate analysis
 concurrently on an online table.

 On Wed, May 13, 2015 at 5:53 PM, Stack st...@duboce.net wrote:

  Should we add in your comments on the blog Govind: i.e. the answers to
  Nicks' questions?
  St.Ack
 
  On Wed, May 13, 2015 at 5:48 PM, Govind Kamat gka...@cloudera.com
 wrote:
 
 This is a great demonstration of these new features, thanks for
   pointing it
 out Stack.

 I'm curious: what percentile latencies are this reported? Does the
 non-throttled user see significant latency improvements in the 95,
  99pct
 when the competing, scanning users are throttled? MB/s and req/s are
 managed at the region level? Region server level? Aggregate?
  
   The latencies reported in the post are average latencies.
  
   Yes, the non-throttled user sees an across-the-board improvement in
   the 95th and 99th percentiles, in addition to the improvement in
   average latency.  The extent of improvement is significant as well but
   varies with the throttle pressure, just as in the case of the average
   latencies.
  
   The total throughput numbers (req/s) are aggregate numbers reported by
   the YCSB client.
  
 These throttle points are by user? Is there a way for us to say all
  MR
 jobs are lower priority than online queries?

  
   The throttle was set by user in these tests.  You cannot directly
   throttle a specific job, but do have the option to set the throttle
   for a table or a namespace.  That might be sufficient for you to
   achieve your objective (unless those jobs are run by one user and
   access the same table.)
  
   Govind
  
  
 Thanks,
 Nick

 On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net wrote:

  .. by our Govind.
 
  See here:
 
  
 
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
 
  St.Ack
 
  
 



Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Matteo Bertozzi
@nick we already have something like that, which is HBASE-10993;
it basically reorders requests based on how many scan.next calls you have done
(see the picture):
http://blog.cloudera.com/wp-content/uploads/2014/11/hbase-multi-f2.png
The problem is that we can't evict requests that are already executing, and we
are not aggressive enough about removing requests from the queue and sending a
retry to the client when something with higher priority comes in.

Matteo


On Wed, May 13, 2015 at 6:38 PM, Nick Dimiduk ndimi...@gmail.com wrote:

 I guess what I'm thinking of is more about scheduling than
 quota/throttling. I don't want my online requests to sit in a queue behind
 MR requests while the MR work build up to it's quota amount. I want a
 scheduler to do time-slicing of operations, with preferential treatment
 given to online work over long scan (analytical) work. For example, all
 scan RPC's known to cover lots of Cells get de-prioritized vs gets and
 short scans. Maybe this is synthesized with an RPC annotation marking it as
 long vs short -- MR scans are marked long. I'm not sure, and I need
 to look more closely at recent scan improvements. IIRC, there's a heartbeat
 now, which maybe is a general mechanism allowing for long operations to not
 stomp over short operations. Heartbeat implies the long-running scan is
 coming up for air from time to time, allowing itself to be interrupted and
 defer to higher priority work. This isn't preemption, but does allow for an
 upper bound on how long the next queued task waits.

 On Wed, May 13, 2015 at 6:11 PM, Matteo Bertozzi theo.berto...@gmail.com
 wrote:

  @nick what would you like to have? a match on a Job ID or something like
  that?
  currently only user/table/namespace are supported,
  but group support can be easily added.
  not sure about a job-id or job-name since we don't have that info on the
  scan.
 
  On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk ndimi...@gmail.com
 wrote:
 
   Sorry. Yeah, sure, I can ask over there.
  
   The throttle was set by user in these tests.  You cannot directly
throttle a specific job, but do have the option to set the throttle
for a table or a namespace.  That might be sufficient for you to
achieve your objective (unless those jobs are run by one user and
access the same table.)
  
  
   Maybe running as different users is the key, but this seems like a very
   important use-case to support -- folks doing aggregate analysis
   concurrently on an online table.
  
   On Wed, May 13, 2015 at 5:53 PM, Stack st...@duboce.net wrote:
  
Should we add in your comments on the blog Govind: i.e. the answers
 to
Nicks' questions?
St.Ack
   
On Wed, May 13, 2015 at 5:48 PM, Govind Kamat gka...@cloudera.com
   wrote:
   
   This is a great demonstration of these new features, thanks for
 pointing it
   out Stack.
  
   I'm curious: what percentile latencies are this reported? Does
 the
   non-throttled user see significant latency improvements in the
 95,
99pct
   when the competing, scanning users are throttled? MB/s and req/s
  are
   managed at the region level? Region server level? Aggregate?

 The latencies reported in the post are average latencies.

 Yes, the non-throttled user sees an across-the-board improvement in
 the 95th and 99th percentiles, in addition to the improvement in
 average latency.  The extent of improvement is significant as well
  but
 varies with the throttle pressure, just as in the case of the
 average
 latencies.

 The total throughput numbers (req/s) are aggregate numbers reported
  by
 the YCSB client.

   These throttle points are by user? Is there a way for us to say
  all
MR
   jobs are lower priority than online queries?
  

 The throttle was set by user in these tests.  You cannot directly
 throttle a specific job, but do have the option to set the throttle
 for a table or a namespace.  That might be sufficient for you to
 achieve your objective (unless those jobs are run by one user and
 access the same table.)

 Govind


   Thanks,
   Nick
  
   On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net
 wrote:
  
.. by our Govind.
   
See here:
   

   
  
 
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
   
St.Ack
   

   
  
 



Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
Sorry. Yeah, sure, I can ask over there.

The throttle was set by user in these tests.  You cannot directly
 throttle a specific job, but do have the option to set the throttle
 for a table or a namespace.  That might be sufficient for you to
 achieve your objective (unless those jobs are run by one user and
 access the same table.)


Maybe running as different users is the key, but this seems like a very
important use-case to support -- folks doing aggregate analysis
concurrently on an online table.

On Wed, May 13, 2015 at 5:53 PM, Stack st...@duboce.net wrote:

 Should we add in your comments on the blog Govind: i.e. the answers to
 Nicks' questions?
 St.Ack

 On Wed, May 13, 2015 at 5:48 PM, Govind Kamat gka...@cloudera.com wrote:

This is a great demonstration of these new features, thanks for
  pointing it
out Stack.
   
I'm curious: what percentile latencies are this reported? Does the
non-throttled user see significant latency improvements in the 95,
 99pct
when the competing, scanning users are throttled? MB/s and req/s are
managed at the region level? Region server level? Aggregate?
 
  The latencies reported in the post are average latencies.
 
  Yes, the non-throttled user sees an across-the-board improvement in
  the 95th and 99th percentiles, in addition to the improvement in
  average latency.  The extent of improvement is significant as well but
  varies with the throttle pressure, just as in the case of the average
  latencies.
 
  The total throughput numbers (req/s) are aggregate numbers reported by
  the YCSB client.
 
These throttle points are by user? Is there a way for us to say all
 MR
jobs are lower priority than online queries?
   
 
  The throttle was set by user in these tests.  You cannot directly
  throttle a specific job, but do have the option to set the throttle
  for a table or a namespace.  That might be sufficient for you to
  achieve your objective (unless those jobs are run by one user and
  access the same table.)
 
  Govind
 
 
Thanks,
Nick
   
On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net wrote:
   
 .. by our Govind.

 See here:

 
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature

 St.Ack

 



Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
I guess what I'm thinking of is more about scheduling than
quota/throttling. I don't want my online requests to sit in a queue behind
MR requests while the MR work builds up to its quota amount. I want a
scheduler to do time-slicing of operations, with preferential treatment
given to online work over long scan (analytical) work. For example, all
scan RPC's known to cover lots of Cells get de-prioritized vs gets and
short scans. Maybe this is synthesized with an RPC annotation marking it as
long vs short -- MR scans are marked long. I'm not sure, and I need
to look more closely at recent scan improvements. IIRC, there's a heartbeat
now, which maybe is a general mechanism allowing for long operations to not
stomp over short operations. Heartbeat implies the long-running scan is
coming up for air from time to time, allowing itself to be interrupted and
defer to higher priority work. This isn't preemption, but does allow for an
upper bound on how long the next queued task waits.

On Wed, May 13, 2015 at 6:11 PM, Matteo Bertozzi theo.berto...@gmail.com
wrote:

 @nick what would you like to have? a match on a Job ID or something like
 that?
 currently only user/table/namespace are supported,
 but group support can be easily added.
 not sure about a job-id or job-name since we don't have that info on the
 scan.

 On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk ndimi...@gmail.com wrote:

  Sorry. Yeah, sure, I can ask over there.
 
  The throttle was set by user in these tests.  You cannot directly
   throttle a specific job, but do have the option to set the throttle
   for a table or a namespace.  That might be sufficient for you to
   achieve your objective (unless those jobs are run by one user and
   access the same table.)
 
 
  Maybe running as different users is the key, but this seems like a very
  important use-case to support -- folks doing aggregate analysis
  concurrently on an online table.
 
  On Wed, May 13, 2015 at 5:53 PM, Stack st...@duboce.net wrote:
 
   Should we add in your comments on the blog Govind: i.e. the answers to
   Nicks' questions?
   St.Ack
  
   On Wed, May 13, 2015 at 5:48 PM, Govind Kamat gka...@cloudera.com
  wrote:
  
  This is a great demonstration of these new features, thanks for
pointing it
  out Stack.
 
  I'm curious: what percentile latencies are this reported? Does the
  non-throttled user see significant latency improvements in the 95,
   99pct
  when the competing, scanning users are throttled? MB/s and req/s
 are
  managed at the region level? Region server level? Aggregate?
   
The latencies reported in the post are average latencies.
   
Yes, the non-throttled user sees an across-the-board improvement in
the 95th and 99th percentiles, in addition to the improvement in
average latency.  The extent of improvement is significant as well
 but
varies with the throttle pressure, just as in the case of the average
latencies.
   
The total throughput numbers (req/s) are aggregate numbers reported
 by
the YCSB client.
   
  These throttle points are by user? Is there a way for us to say
 all
   MR
  jobs are lower priority than online queries?
 
   
The throttle was set by user in these tests.  You cannot directly
throttle a specific job, but do have the option to set the throttle
for a table or a namespace.  That might be sufficient for you to
achieve your objective (unless those jobs are run by one user and
access the same table.)
   
Govind
   
   
  Thanks,
  Nick
 
  On Tue, May 12, 2015 at 1:58 PM, Stack st...@duboce.net wrote:
 
   .. by our Govind.
  
   See here:
  
   
  
 
 https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
  
   St.Ack
  
   
  
 



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Yes.

On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:

 Have you enabled short circuit read ?

 Cheers

 On Wed, May 13, 2015 at 9:37 AM, rahul malviya malviyarahul2...@gmail.com
 
 wrote:

  Hi,
 
  I have recently started running MR on hbase snapshots but when the MR is
  running there is pretty high CPU usage on datanodes and I start seeing IO
  wait message in datanode logs and as soon I kill the MR on Snapshot
  everything come back to normal.
 
  What could be causing this ?
 
  I am running cdh5.2.0 distribution.
 
  Thanks,
  Rahul
 



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Michael Segel
Without knowing your exact configuration… 

The high CPU may be WAIT IOs, which would mean that your CPU is waiting for 
reads from the local disks. 

What’s the ratio of cores (physical) to disks? 
What type of disks are you using? 

That’s going to be the most likely culprit. 
 On May 13, 2015, at 11:41 AM, rahul malviya malviyarahul2...@gmail.com 
 wrote:
 
 Yes.
 
 On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Have you enabled short circuit read ?
 
 Cheers
 
 On Wed, May 13, 2015 at 9:37 AM, rahul malviya malviyarahul2...@gmail.com
 
 wrote:
 
 Hi,
 
 I have recently started running MR on hbase snapshots but when the MR is
 running there is pretty high CPU usage on datanodes and I start seeing IO
 wait message in datanode logs and as soon I kill the MR on Snapshot
 everything come back to normal.
 
 What could be causing this ?
 
 I am running cdh5.2.0 distribution.
 
 Thanks,
 Rahul
 
 



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Anil Gupta
How many mapper/reducers are running per node for this job?
Also how many mappers are running as data local mappers?
Is your load/data equally distributed?

Your disk-to-CPU ratio looks OK. 

Sent from my iPhone

 On May 13, 2015, at 10:12 AM, rahul malviya malviyarahul2...@gmail.com 
 wrote:
 
 *The High CPU may be WAIT IOs,  which would mean that you’re cpu is waiting
 for reads from the local disks.*
 
 Yes I think thats what is going on but I am trying to understand why it
 happens only in case of snapshot MR but if I run the same job without using
 snapshot everything is normal. What is the difference in snapshot version
 which can cause such a spike ? I looking through the code for snapshot
 version if I can find something.
 
 cores / disks == 24 / 12 or 40 / 12.
 
 We are using 10K sata drives on our datanodes.
 
 Rahul
 
 On Wed, May 13, 2015 at 10:00 AM, Michael Segel michael_se...@hotmail.com
 wrote:
 
 Without knowing your exact configuration…
 
 The High CPU may be WAIT IOs,  which would mean that you’re cpu is waiting
 for reads from the local disks.
 
 What’s the ratio of cores (physical) to disks?
 What type of disks are you using?
 
 That’s going to be the most likely culprit.
 On May 13, 2015, at 11:41 AM, rahul malviya malviyarahul2...@gmail.com
 wrote:
 
 Yes.
 
 On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Have you enabled short circuit read ?
 
 Cheers
 
 On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
 malviyarahul2...@gmail.com
 wrote:
 
 Hi,
 
 I have recently started running MR on hbase snapshots but when the MR
 is
 running there is pretty high CPU usage on datanodes and I start seeing
 IO
 wait message in datanode logs and as soon I kill the MR on Snapshot
 everything come back to normal.
 
 What could be causing this ?
 
 I am running cdh5.2.0 distribution.
 
 Thanks,
 Rahul
 
 


Re: readAtOffset error when reading from HFiles

2015-05-13 Thread donmai
Ahhh, thanks for that. Yep, all flushes will be (or should be) going to S3.
I'm working through it and it seems that it's defaulting to the positional
read instead of seek+read - is this accurate?
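
For reference, to dump an HFile's key/values and metadata (rather than seek to a
row, which is what -w does per the reply quoted below), an invocation roughly
like the following is the usual starting point; the flags come from the tool's
own usage text and the path is the one from this thread:

hbase org.apache.hadoop.hbase.io.hfile.HFile -p -m -f \
  s3://hbase/data/default/usertable/5cce51bd0bcc8f7507c7e594b73d2d15/family/ec3d1516dba0447e875d489f3ad8bdc0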

On Tue, May 12, 2015 at 12:00 PM, Ted Yu yuzhih...@gmail.com wrote:

 -w is shorthand for --seekToRow (so it is not offset):

  -w,--seekToRow arg Seek to this row and print all the kvs for this
   row only

 Do you store all your data on s3 ?

 Cheers

 On Tue, May 12, 2015 at 8:50 AM, donmai dood...@gmail.com wrote:

  Actually, looking deeper into it, things don't seem to be making sense.
 
  The error message is this: Caused by: java.io.IOException: Positional
 read
  of 65723 bytes failed at offset 394218 (returned 16384)
 
  As such, I try to do a read for 65723 bytes using the tool to see if it
  fails at that offset:
 
  hbase org.apache.hadoop.hbase.io.hfile.HFile -w 65723 -f
 
 
 s3://hbase/data/default/usertable/5cce51bd0bcc8f7507c7e594b73d2d15/family/ec3d1516dba0447e875d489f3ad8bdc0
 
  This results in no output other than:
 
  INFO  [main] s3n.S3NativeFileSystem: Stream for key
 
 
 'bleh2/data/default/usertable/5cce51bd0bcc8f7507c7e594b73d2d15/family/ec3d1516dba0447e875d489f3ad8bdc0'
  seeking to position '1693329'
 
  Am I using the HFIle command correctly?
 
  On Tue, May 12, 2015 at 11:09 AM, donmai dood...@gmail.com wrote:
 
   Thanks Ted, this actually helped me out alot! I'm running 0.98.6 and
 was
   able to determine that the HFiles are perfectly okay and can be scanned
   through without issue - it looks like there's something else going on,
   since after a compaction everything works...
  
   On Tue, May 12, 2015 at 9:55 AM, Ted Yu yuzhih...@gmail.com wrote:
  
   What release of hbase are you using ?
  
   Please read http://hbase.apache.org/book.html#hfile where you can
 find
   description about HFile tool. This tool would allow you to investigate
   given HFile.
  
   Cheers
  
   On Tue, May 12, 2015 at 6:02 AM, donmai dood...@gmail.com wrote:
  
Hi,
   
I'm getting this error when trying to read from HFiles:
   
http://pastebin.com/SJci7uQM
   
Any idea what's going on here?
   
Thanks!
   
  
  
  
 



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Can short-circuit reads cause the CPU spikes and IO wait issues?

Rahul

On Wed, May 13, 2015 at 9:41 AM, rahul malviya malviyarahul2...@gmail.com
wrote:

 Yes.

 On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:

 Have you enabled short circuit read ?

 Cheers

 On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
 malviyarahul2...@gmail.com
 wrote:

  Hi,
 
  I have recently started running MR on hbase snapshots but when the MR is
  running there is pretty high CPU usage on datanodes and I start seeing
 IO
  wait message in datanode logs and as soon I kill the MR on Snapshot
  everything come back to normal.
 
  What could be causing this ?
 
  I am running cdh5.2.0 distribution.
 
  Thanks,
  Rahul
 





Re: Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Ted Yu
For #2, a partial row would be returned.

Please take a look at the following method in RSRpcServices around line
2393 :

  public ScanResponse scan(final RpcController controller, final
ScanRequest request)

Cheers

On Wed, May 13, 2015 at 12:59 AM, David chen c77...@163.com wrote:

 Thanks for you reply.
 Yes, it indeed appeared in the RegionServer command as follows:
 jps -v|grep Region
 HRegionServer -Dproc_regionserver -XX:OnOutOfMemoryError=kill -9 %p
 -Xmx1000m -Djava.net.preferIPv4Stack=true -Xms16106127360 -Xmx16106127360
 -XX:+UseG1GC -XX:MaxGCPauseMillis=6000
 -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh


 After read HBASE-11544, i have some doubts:
 1. Assume scan has set caching to 1 and batch to 1, for a row with 2
 cells, the first RPC should only return a cell of the row, it is also the
 partial of a row. Unless the cell is too large size, otherwise, will not
 need HBASE-11544. right?
 2. Assume scan has set caching to 1 and maxResultSize to 1, for a row
 which per cell size is more than 1, will the first RPC return the whole or
 partial row? I think the whole row, right?










 At 2015-05-13 11:04:04, Ted Yu yuzhih...@gmail.com wrote:
 Does the following appear in the command which launched region server ?
 -XX:OnOutOfMemoryError=kill -9 %p
 
 There could be multiple reasons for region server process to encounter
 OOME.
 Please take a look at HBASE-11544 which fixes a common cause. The fix is
 in
 the upcoming 1.1.0 release.
 
 Cheers
 
 On Tue, May 12, 2015 at 7:41 PM, David chen c77...@163.com wrote:
 
  A RegionServer was killed because OutOfMemory(OOM), although  the
 process
  killed can be seen in the Linux message log, but i still have two
 following
  problems:
  1. How to inspect the root reason to cause OOM?
  2  When RegionServer encounters OOM, why can't it free some memories
  occupied? if so, whether or not killer will not need.
  Any ideas can be appreciated!



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
*The High CPU may be WAIT IOs,  which would mean that you’re cpu is waiting
for reads from the local disks.*

Yes, I think that's what is going on, but I am trying to understand why it
happens only in the case of snapshot MR; if I run the same job without using a
snapshot, everything is normal. What is the difference in the snapshot version
that can cause such a spike? I am looking through the code for the snapshot
version to see if I can find something.

cores / disks == 24 / 12 or 40 / 12.

We are using 10K sata drives on our datanodes.

Rahul
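
One relevant difference, as far as I understand it: a snapshot-based job uses
TableSnapshotInputFormat, so the map tasks open the snapshot's HFiles directly
from HDFS instead of issuing scans through the region servers, which puts all of
the read IO (with no help from the region server block cache) onto the datanode
disks. A minimal sketch of how such a job is typically wired up, with an
illustrative snapshot name, restore dir and mapper:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotScanJob {
  // The mapper sees rows read straight from the snapshot's HFiles, not via RPC.
  static class CountingMapper extends TableMapper<ImmutableBytesWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
        throws IOException, InterruptedException {
      ctx.getCounter("snapshot", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(HBaseConfiguration.create(), "scan-snapshot");
    job.setJarByClass(SnapshotScanJob.class);
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "usertable_snapshot",               // snapshot name (illustrative)
        new Scan(),                         // scan over the whole snapshot
        CountingMapper.class,
        ImmutableBytesWritable.class, NullWritable.class,
        job, true,                          // true = add HBase jars to the job
        new Path("/tmp/snapshot-restore")); // temp dir for restored snapshot refs
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}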

On Wed, May 13, 2015 at 10:00 AM, Michael Segel michael_se...@hotmail.com
wrote:

 Without knowing your exact configuration…

 The High CPU may be WAIT IOs,  which would mean that you’re cpu is waiting
 for reads from the local disks.

 What’s the ratio of cores (physical) to disks?
 What type of disks are you using?

 That’s going to be the most likely culprit.
  On May 13, 2015, at 11:41 AM, rahul malviya malviyarahul2...@gmail.com
 wrote:
 
  Yes.
 
  On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  Have you enabled short circuit read ?
 
  Cheers
 
  On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
 malviyarahul2...@gmail.com
 
  wrote:
 
  Hi,
 
  I have recently started running MR on hbase snapshots but when the MR
 is
  running there is pretty high CPU usage on datanodes and I start seeing
 IO
  wait message in datanode logs and as soon I kill the MR on Snapshot
  everything come back to normal.
 
  What could be causing this ?
 
  I am running cdh5.2.0 distribution.
 
  Thanks,
  Rahul
 
 




Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
*How many mapper/reducers are running per node for this job?*
I am running 7-8 mappers per node. The spike is seen in the mapper phase, so no
reducers were running at that point in time.

*Also how many mappers are running as data local mappers?*
How do I determine this?


* You load/data equally distributed?*
Yes as we use presplit hash keys in our hbase cluster and data is pretty
evenly distributed.

Thanks,
Rahul


On Wed, May 13, 2015 at 10:25 AM, Anil Gupta anilgupt...@gmail.com wrote:

 How many mapper/reducers are running per node for this job?
 Also how many mappers are running as data local mappers?
 You load/data equally distributed?

 Your disk, cpu ratio looks ok.

 Sent from my iPhone

  On May 13, 2015, at 10:12 AM, rahul malviya malviyarahul2...@gmail.com
 wrote:
 
  *The High CPU may be WAIT IOs,  which would mean that you’re cpu is
 waiting
  for reads from the local disks.*
 
  Yes I think thats what is going on but I am trying to understand why it
  happens only in case of snapshot MR but if I run the same job without
 using
  snapshot everything is normal. What is the difference in snapshot version
  which can cause such a spike ? I looking through the code for snapshot
  version if I can find something.
 
  cores / disks == 24 / 12 or 40 / 12.
 
  We are using 10K sata drives on our datanodes.
 
  Rahul
 
  On Wed, May 13, 2015 at 10:00 AM, Michael Segel 
 michael_se...@hotmail.com
  wrote:
 
  Without knowing your exact configuration…
 
  The High CPU may be WAIT IOs,  which would mean that you’re cpu is
 waiting
  for reads from the local disks.
 
  What’s the ratio of cores (physical) to disks?
  What type of disks are you using?
 
  That’s going to be the most likely culprit.
  On May 13, 2015, at 11:41 AM, rahul malviya 
 malviyarahul2...@gmail.com
  wrote:
 
  Yes.
 
  On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  Have you enabled short circuit read ?
 
  Cheers
 
  On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
  malviyarahul2...@gmail.com
  wrote:
 
  Hi,
 
  I have recently started running MR on hbase snapshots but when the MR
  is
  running there is pretty high CPU usage on datanodes and I start
 seeing
  IO
  wait message in datanode logs and as soon I kill the MR on Snapshot
  everything come back to normal.
 
  What could be causing this ?
 
  I am running cdh5.2.0 distribution.
 
  Thanks,
  Rahul
 
 



Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Stack
On Tue, May 12, 2015 at 7:41 PM, David chen c77...@163.com wrote:

 A RegionServer was killed because OutOfMemory(OOM), although  the process
 killed can be seen in the Linux message log, but i still have two following
 problems:
 1. How to inspect the root reason to cause OOM?


Start the regionserver with -XX:+HeapDumpOnOutOfMemoryError, using
-XX:HeapDumpPath to specify a location for the heap to be dumped to on OOME (see
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html).
Remove the -XX:OnOutOfMemoryError option because it will conflict with
HeapDumpOnOutOfMemoryError. Then open the heap dump in Java Mission Control,
JProfiler, etc., to see how the retained objects are associated.
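
A minimal sketch of what that might look like in hbase-env.sh; the dump path is
made up, and note that it is the '+' form that enables the dump:

# Remove any existing -XX:OnOutOfMemoryError=... from the opts, per the above.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/hbase/rs-oome.hprof"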


 2  When RegionServer encounters OOM, why can't it free some memories
 occupied? if so, whether or not killer will not need.


We require a certain amount of memory to process a particular workload. If the
allocation is insufficient, we OOME. Once an application has OOME'd, its
state goes indeterminate. We opt to kill the process rather than hang
around in a damaged state.

Enable GC logging to figure why in particular you OOME'd (There are
different categories of OOME [1]). We may have a sufficient memory
allocation but an incorrectly tuned GC or a badly specified set of heap
args may bring on OOME.

St.Ack
1.
http://www.javacodegeeks.com/2013/08/understanding-the-outofmemoryerror.html


 Any ideas can be appreciated!


Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Ted Yu
Have you enabled short circuit read ?

Cheers

On Wed, May 13, 2015 at 9:37 AM, rahul malviya malviyarahul2...@gmail.com
wrote:

 Hi,

 I have recently started running MR on hbase snapshots but when the MR is
 running there is pretty high CPU usage on datanodes and I start seeing IO
 wait message in datanode logs and as soon I kill the MR on Snapshot
 everything come back to normal.

 What could be causing this ?

 I am running cdh5.2.0 distribution.

 Thanks,
 Rahul



Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Michael Segel
So … 

First, you’re wasting money on 10K drives. But that could be your company’s 
standard. 

Yes, you’re going to see red. 
 
24 / 12 , so is that 12 physical cores  or 24 physical cores? 

I suspect those are dual-chip machines with 6 physical cores per chip. 
That’s 12 cores to 12 disks, which is ok. 

The 40 or 20 cores to 12 drives… that’s going to cause you trouble. 

Note: Seeing high levels of CPU may not be a bad thing. 

7-8 mappers per node?  Not a lot of work for the number of cores… 



 On May 13, 2015, at 12:31 PM, rahul malviya malviyarahul2...@gmail.com 
 wrote:
 
 *How many mapper/reducers are running per node for this job?*
 I am running 7-8 mappers per node. The spike is seen in mapper phase so no
 reducers where running at that point of time.
 
 *Also how many mappers are running as data local mappers?*
 How to determine this ?
 
 
 * You load/data equally distributed?*
 Yes as we use presplit hash keys in our hbase cluster and data is pretty
 evenly distributed.
 
 Thanks,
 Rahul
 
 
 On Wed, May 13, 2015 at 10:25 AM, Anil Gupta anilgupt...@gmail.com wrote:
 
 How many mapper/reducers are running per node for this job?
 Also how many mappers are running as data local mappers?
 You load/data equally distributed?
 
 Your disk, cpu ratio looks ok.
 
 Sent from my iPhone
 
 On May 13, 2015, at 10:12 AM, rahul malviya malviyarahul2...@gmail.com
 wrote:
 
 *The High CPU may be WAIT IOs,  which would mean that you’re cpu is
 waiting
 for reads from the local disks.*
 
 Yes I think thats what is going on but I am trying to understand why it
 happens only in case of snapshot MR but if I run the same job without
 using
 snapshot everything is normal. What is the difference in snapshot version
 which can cause such a spike ? I looking through the code for snapshot
 version if I can find something.
 
 cores / disks == 24 / 12 or 40 / 12.
 
 We are using 10K sata drives on our datanodes.
 
 Rahul
 
 On Wed, May 13, 2015 at 10:00 AM, Michael Segel 
 michael_se...@hotmail.com
 wrote:
 
 Without knowing your exact configuration…
 
 The High CPU may be WAIT IOs,  which would mean that you’re cpu is
 waiting
 for reads from the local disks.
 
 What’s the ratio of cores (physical) to disks?
 What type of disks are you using?
 
 That’s going to be the most likely culprit.
 On May 13, 2015, at 11:41 AM, rahul malviya 
 malviyarahul2...@gmail.com
 wrote:
 
 Yes.
 
 On Wed, May 13, 2015 at 9:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 Have you enabled short circuit read ?
 
 Cheers
 
 On Wed, May 13, 2015 at 9:37 AM, rahul malviya 
 malviyarahul2...@gmail.com
 wrote:
 
 Hi,
 
 I have recently started running MR on hbase snapshots but when the MR
 is
 running there is pretty high CPU usage on datanodes and I start
 seeing
 IO
 wait message in datanode logs and as soon I kill the MR on Snapshot
 everything come back to normal.
 
 What could be causing this ?
 
 I am running cdh5.2.0 distribution.
 
 Thanks,
 Rahul
 
 
 



Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Bryan Beaudreault
After moving to the G1GC we were plagued with random OOMs from time to
time.  We always thought it was due to people requesting a big row or group
of rows, but upon investigation noticed that the heap dumps were many GBs
less than the max heap at time of OOM.  If you have this symptom, you may
be running into humongous allocation issues.

I think HBase is especially prone to humongous allocations if you are
batching Puts on the client side, or have large cells.  Googling for
humongous allocations will return a lot of useful results.  I found
http://www.infoq.com/articles/tuning-tips-G1-GC to be especially helpful.

The bottom line is this:

- If an allocation is larger than 50% of the G1 region size, it is a
humongous allocation which is more expensive to clean up.  We want to avoid
this.
- The default region size is only a few mb, so any big batch puts or scans
can easily be considered humongous.  If you don't set Xms, it will be even
smaller.
- Make sure you are setting Xms to the same value as Xmx.  This is used by
the G1 to calculate default region sizes.
- Enable -XX:+PrintAdaptiveSizePolicy, which will print out information you
can use for debugging humongous allocations.  Any time an allocation is
considered humongous, it will print the size of the allocation.  For us,
enabling this setting made it immediately obvious there was an issue.
- Using the output of the above, determine your optimal region size.
Region sizes must be a power of 2, and you should generally target around
2000 regions.  So a compromise is sometimes needed, as you don't want to be
*too* far below this number.
- Use -XX:G1HeapRegionSize=xM to set the region size.  Like I said, use a
power of 2.

For us, we were getting a lot of allocations around 3-5mb.  The largest
percentage was around 3 to less than 4mb.  On our 25GB regionservers, we
set the region size to 8MB, so that the vast majority of allocations
fell under 50% of 8mb.  The remaining humongous allocations were low enough
volume to work fine.  On our 32GB regionservers, we set this to 16mb and
completely eliminated humongous allocations.
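
As a rough sketch of the above recipe in hbase-env.sh (the heap and region sizes
here are just the examples from this mail, not general recommendations):

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -Xms32g -Xmx32g \
  -XX:+UseG1GC \
  -XX:G1HeapRegionSize=16m \
  -XX:+PrintAdaptiveSizePolicy"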

Since the above tuning, G1GC has worked great for us and we have not had
any OOMs in a couple months.

Hope this helps.

On Wed, May 13, 2015 at 10:37 AM, Stack st...@duboce.net wrote:

 On Tue, May 12, 2015 at 7:41 PM, David chen c77...@163.com wrote:

  A RegionServer was killed because OutOfMemory(OOM), although  the process
  killed can be seen in the Linux message log, but i still have two
 following
  problems:
  1. How to inspect the root reason to cause OOM?
 

 Start the regionserver with -XX:-HeapDumpOnOutOfMemoryError specifying a
 location for the heap to be dumped to on OOME (See

 http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
 ).
 Remove the XX:OnOutOfMemoryError because now it will conflict with
 HeapDumpOnOutOfMemoryError
  Then open the heap dump in the java mission control, jprofiler, etc., to
 see how the retained objects are associated.


  2  When RegionServer encounters OOM, why can't it free some memories
  occupied? if so, whether or not killer will not need.
 

 We require a certain amount of memory to process a particular work load. If
 an insufficient allocation, we OOME. Once an application has OOME'd, its
 state goes indeterminate. We opt to kill the process rather than hang
 around in a damaged state.

 Enable GC logging to figure why in particular you OOME'd (There are
 different categories of OOME [1]). We may have a sufficient memory
 allocation but an incorrectly tuned GC or a badly specified set of heap
 args may bring on OOME.

 St.Ack
 1.

 http://www.javacodegeeks.com/2013/08/understanding-the-outofmemoryerror.html


  Any ideas can be appreciated!



Re: HBase Block locality always 0

2015-05-13 Thread 娄帅
Any idea?

2015-05-12 17:59 GMT+08:00 娄帅 louis.hust...@gmail.com:

 Hi, all,

 I am maintaining an hbase 0.96.0 cluster, but from the web ui of HBase
 regionserver,
 i saw Block locality is 0 for all regionserver.

 Datanode on l-hbase[26-31].data.cn8 and regionserver on
 l-hbase[25-31].data.cn8,

 Any idea?



Re: HBase Block locality always 0

2015-05-13 Thread Dima Spivak
Have you seen Esteban's suggestion? Another possibility is that a number of
old JIRAs covered the fact that regions were assigned in a silly way when a
table was disabled and then enabled. Could this be the case for you?

-Dima

On Wed, May 13, 2015 at 8:36 PM, 娄帅 louis.hust...@gmail.com wrote:

 anyidea?

 2015-05-12 17:59 GMT+08:00 娄帅 louis.hust...@gmail.com:

  Hi, all,
 
  I am maintaining an hbase 0.96.0 cluster, but from the web ui of HBase
  regionserver,
  i saw Block locality is 0 for all regionserver.
 
  Datanode on l-hbase[26-31].data.cn8 and regionserver on
  l-hbase[25-31].data.cn8,
 
  Any idea?
 



Re: Questions related to HBase general use

2015-05-13 Thread Krishna Kalyan
I know that BigInsights comes with BigSQL, which interacts with HBase as
well; have you considered that option?
We have a similar use case using BigInsights 2.1.2.


On Thu, May 14, 2015 at 4:56 AM, Nick Dimiduk ndimi...@gmail.com wrote:

 + Swarnim, who's expert on HBase/Hive integration.

 Yes, snapshots may be interesting for you. I believe Hive can access HBase
 timestamps, exposed as a virtual column. It's assumed across the whole
 row however, not per cell.

 On Sun, May 10, 2015 at 9:14 PM, Jerry He jerry...@gmail.com wrote:

  Hi, Yong
 
  You have a good understanding of the benefit of HBase already.
  Generally speaking, HBase is suitable for real time read/write to your
 big
  data set.
  Regarding the HBase performance evaluation tool, the 'read' test use
 HBase
  'get'. For 1m rows, the test would issue 1m 'get' (and RPC) to the
 server.
  The 'scan' test scans the table and transfers the rows to the client in
  batches (e.g. 100 rows at a time), which will take shorter time for the
  whole test to complete for the same number of rows.
  The hive/hbase integration, as you said, needs more consideration.
  1) The performance.  Hive access HBase via HBase client API, which
 involves
  going to the HBase server for all the data access. This will slow things
  down.
  There are a couple of things you can explore. e.g. Hive/HBase
 snapshot
  integration. This would provide direct access to HBase hfiles.
  2) In your email, you are interested in HBase's capability of storing
  multiple versions of data.  You need to consider if Hive supports this
  HBase feature. i.e provide you access to multi versions. As I can
 remember,
  it is not fully.
 
  Jerry
 
 
  On Thu, May 7, 2015 at 6:18 PM, java8964 java8...@hotmail.com wrote:
 
   Hi,
   I am kind of new to HBase. Currently our production run IBM BigInsight
  V3,
   comes with Hadoop 2.2 and HBase 0.96.0.
   We are mostly using HDFS and Hive/Pig for our BigData project, it works
   very good for our big datasets. Right now, we have a one dataset needs
 to
   be loaded from Mysql, about 100G, and will have about Gs change daily.
  This
   is a very important slow change dimension data, we like to sync between
   Mysql and BigData platform.
   I am thinking of using HBase to store it, instead of refreshing the
 whole
   dataset in HDFS, due to:
    1) HBase makes the merge the change very easy.
    2) HBase could store all the changes in the history, as a function out of
    box. We will replicate all the changes from the binlog level from Mysql,
    and we could keep all changes in HBase (or long history), then it can give
    us some insight that cannot be done easily in HDFS.
    3) HBase could give us the benefit to access the data by key fast, for
    some cases.
    4) HBase is available out of box.
   What I am not sure is the Hive/HBase integration. Hive is the top tool
 in
   our environment. If one dataset stored in Hbase (even only about 100G
 as
   now), the join between it with the other Big datasets in HDFS worries
  me. I
   read quite some information about Hive/HBase integration, and feel that
  it
   is not really mature, as not too many usage cases I can find online,
   especially on performance. There are quite some JIRAs related to make
  Hive
   utilize the HBase for performance in MR job are still pending.
   I want to know other people experience to use HBase in this way. I
   understand HBase is not designed as a storage system for Data Warehouse
   component or analytics engine. But the benefits to use HBase in this
 case
   still attractive me. If my use cases of HBase is mostly read or full
 scan
   the data, how bad it is compared to HDFS in the same cluster? 3x? 5x?
   To help me understand the read throughput of HBase, I use the HBase
   performance evaluation tool, but the output is quite confusing. I have
 2
   clusters, one is with 5 nodes with 3 slaves all running on VM (Each
 with
   24G + 4 cores, so cluster has 12 mappers + 6 reducers), another is real
   cluster with 5 nodes with 3 slaves with 64G + 24 cores and with (48
  mapper
    slots + 24 reducer slots). Below is the result I run the sequentialRead
  3
   on the better cluster:
    15/05/07 17:26:50 INFO mapred.JobClient: Counters: 30
    15/05/07 17:26:50 INFO mapred.JobClient:   File System Counters
    15/05/07 17:26:50 INFO mapred.JobClient: FILE: BYTES_READ=546
    15/05/07 17:26:50 INFO mapred.JobClient: FILE: BYTES_WRITTEN=7425074
    15/05/07 17:26:50 INFO mapred.JobClient: HDFS: BYTES_READ=2700
    15/05/07 17:26:50 INFO mapred.JobClient: HDFS: BYTES_WRITTEN=405
    15/05/07 17:26:50 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.JobCounter
    15/05/07 17:26:50 INFO mapred.JobClient: TOTAL_LAUNCHED_MAPS=30
    15/05/07 17:26:50 INFO mapred.JobClient: TOTAL_LAUNCHED_REDUCES=1
    15/05/07 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=2905167
    15/05/07 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=11340
    15/05/07 17:26:50 

Re:Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread David chen
Thanks for your reply.
Yes, it indeed appeared in the RegionServer command as follows:
jps -v|grep Region
HRegionServer -Dproc_regionserver -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m 
-Djava.net.preferIPv4Stack=true -Xms16106127360 -Xmx16106127360 -XX:+UseG1GC 
-XX:MaxGCPauseMillis=6000 
-XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh


After reading HBASE-11544, I have some doubts:
1. Assume a scan has set caching to 1 and batch to 1. For a row with 2 cells, the 
first RPC should only return one cell of the row, which is also a partial row. 
Unless the cell is very large, HBASE-11544 will not be needed, right?
2. Assume a scan has set caching to 1 and maxResultSize to 1. For a row whose 
per-cell size is more than 1 byte, will the first RPC return the whole row or a 
partial row? I think the whole row, right?
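
For reference, the settings being discussed correspond roughly to the following
client-side scan configuration (the values are just the ones from the questions
above):

import org.apache.hadoop.hbase.client.Scan;

public class ScanSettings {
  public static Scan buildScan() {
    Scan scan = new Scan();
    scan.setCaching(1);        // number of rows the client asks for per RPC
    scan.setBatch(1);          // max cells per Result; splits wide rows across Results
    scan.setMaxResultSize(1);  // size limit (in bytes) on the data returned per RPC
    return scan;
  }
}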










At 2015-05-13 11:04:04, Ted Yu yuzhih...@gmail.com wrote:
Does the following appear in the command which launched region server ?
-XX:OnOutOfMemoryError=kill -9 %p

There could be multiple reasons for region server process to encounter OOME.
Please take a look at HBASE-11544 which fixes a common cause. The fix is in
the upcoming 1.1.0 release.

Cheers

On Tue, May 12, 2015 at 7:41 PM, David chen c77...@163.com wrote:

 A RegionServer was killed because OutOfMemory(OOM), although  the process
 killed can be seen in the Linux message log, but i still have two following
 problems:
 1. How to inspect the root reason to cause OOM?
 2  When RegionServer encounters OOM, why can't it free some memories
 occupied? if so, whether or not killer will not need.
 Any ideas can be appreciated!


Re: HBase MapReduce in Kerberized cluster

2015-05-13 Thread Ted Yu
bq. it has been moved to be a part of the hbase-server package

I searched (current) 0.98 and branch-1 where I found:
./hbase-client/src/main/java/org/apache/hadoop/hbase/security/token/TokenUtil.java

FYI
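
For a plain MapReduce job, the usual way to pick up an HBase delegation token on
a secure cluster is TableMapReduceUtil.initCredentials(job), which (as far as I
recall) goes through TokenUtil under the covers; a minimal sketch follows, with
an illustrative job name, and how this plugs into a Crunch pipeline is a
separate question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class SecureJobSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "read-from-hbase");
    // Obtains an HBase delegation token for the submitting user and adds it to
    // the job credentials, so tasks can authenticate without a Kerberos TGT.
    TableMapReduceUtil.initCredentials(job);
    // ... then configure TableInputFormat / mappers as usual and submit.
  }
}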

On Wed, May 13, 2015 at 11:45 AM, Edward C. Skoviak 
edward.skov...@gmail.com wrote:

 I'm attempting to write a Crunch pipeline to read various rows from a table
 in HBase and then do processing on these results. I am doing this from a
 cluster deployed using CDH 5.3.2 running Kerberos and YARN.

 I was hoping to get an answer on what is considered the best approach to
 authenticate to HBase within MapReduce task execution context? I've perused
 various posts/documentation and it seems that TokenUtil was, at least at
 one point, the right approach, however I notice now it has been moved to be
 a part of the hbase-server package (instead of hbase-client). Is there a
 better way to retrieve and pass an HBase delegation token to the MR job
 launched by my pipeline?

 Thanks,
 Ed Skoviak



Re: RowKey hashing in HBase 1.0

2015-05-13 Thread jeremy p
Thank you for your response.  However, I'm still having a hard time
understanding you.  Apologies for this.

So, this is where I think I'm getting confused :

Let's talk about the original rowkey, before anything has been prepended to
it.  Let's call this original_rowkey.

Let's say your first original_rowkey is 1000, and your second
original_rowkey is 1001.  Let's say you have a hashing function called f().
Let's say you have 20 regions.

Does a monotonically increasing original_rowkey guarantee a monotonically
increasing return value from f()?  I did not think that was the case. To my
knowledge, f(1001) % 20 is not guaranteed to be larger than f(1000) % 20.

Now, let's talk about the rowkey that I'm going to use when I insert the
row into HBase.  This will be the original_rowkey with f(x) % 20 prepended
to it.  Let's call this ultimate_rowkey.

Since ultimate_rowkey is just original_rowkey with f(x) % 20 prepended to
it, and f(x) % 20 does not increase monotonically, why would I be seeing
the behavior that you describe?

--Jeremy
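
For concreteness, a small sketch of the bucketing scheme being discussed; the
hash function, the zero padding and the bucket count of 20 are illustrative:

import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKey {
  // Prepend hash(original_rowkey) % numBuckets so that sequential original
  // keys are spread across numBuckets key ranges instead of always landing
  // at the tail of the last region.
  static byte[] salt(String originalRowkey, int numBuckets) {
    int bucket = (originalRowkey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    // Zero-pad so the prefix sorts lexicographically by bucket number.
    return Bytes.toBytes(String.format("%02d_%s", bucket, originalRowkey));
  }

  public static void main(String[] args) {
    // Consecutive original keys generally land in different buckets.
    System.out.println(Bytes.toString(salt("1000", 20)));
    System.out.println(Bytes.toString(salt("1001", 20)));
  }
}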


On Wed, May 6, 2015 at 10:03 PM, Michael Segel michael_se...@hotmail.com
wrote:

 Jeremy,

 I think you have to be careful in how you say things.
 While over time, you’re going to get an even distribution, the hash isn’t
 random. Its consistent so that hash(x) = y  and will always be the same.
 You’re taking the modulus to create 1 to n buckets.

 In each bucket, your new key is n_rowkey  where rowkey is the original row
 key.

 Remember that the rowkey is growing sequentially.  rowkey(n)  rowkey(n+1)
 …   rowkey(n+k)

 So if you hash and take its modulus and prepend it, you will still have
 X_rowkey(n) , X_rowkey(n+k) , …


 All you have is N sequential lists. And again with a sequential list,
 you’re adding to the right so when you split, the top section is never
 going to get new rows.

 I think you need to create a list  and try this with 3 or 4 buckets and
 you’ll start to see what happens.

 The last region fills, but after it splits, the top half is static. The
 new rows are added to the bottom half only.

 This is a problem with sequential keys that you have to learn to live with.

 Its not a killer issue, but something you need to be  aware…

  On May 6, 2015, at 4:00 PM, jeremy p athomewithagroove...@gmail.com
 wrote:
 
  Thank you for the explanation, but I'm a little confused.  The key will
 be
  monotonically increasing, but the hash of that key will not be.
 
  So, even though your original keys may look like : 1_foobar, 2_foobar,
  3_foobar
  After the hashing, they'd look more like : 349000_1_foobar,
  99_2_foobar, 01_3_foobar
 
  With five regions, the original key ranges for your regions would look
  something like : 00-19, 20-39, 40-59,
  60-79, 80-9
 
  So let's say you add another row.  It causes a split.  Now your regions
  look like :  00-19, 20-39, 40-59, 60-79,
  80-89, 90-99
 
  Since the value that you are prepending to your keys is essentially
 random,
  I don't see why your regions would only fill halfway.  A new, hashed key
  would be just as likely to fall within 80-89 as it would be to
 fall
  within 90-99.
 
  Are we working from different assumptions?
 
  On Tue, May 5, 2015 at 4:46 PM, Michael Segel michael_se...@hotmail.com
 
  wrote:
 
  Yes, what you described  mod(hash(rowkey),n) where n is the number of
  regions will remove the hotspotting issue.
 
  However, if your key is sequential you will only have regions half full
  post region split.
 
  Look at it this way…
 
  If I have a key that is a sequential count 1,2,3,4,5 … I am always
 adding
  a new row to the last region and its always being added to the right.
  (reading left from right.) Always at the end of the line…
 
  So if I have 10,000 rows and I split the region… region 1 has 0 to 4,999
  and region 2 has 5000 to 1.
 
  Now my next row is 10001, the following is 10002 … so they will be added
  at the tail end of region 2 until it splits.  (And so on, and so on…)
 
  If you take a modulus of the hash, you create n buckets. Again for each
  bucket… I will still be adding a new larger number so it will be added
 to
  the right hand side or tail of the list.
 
  Once a region is split… that’s it.
 
  Bucketing will solve the hot spotting issue by creating n lists of rows,
  but you’re still always adding to the end of the list.
 
  Does that make sense?
 
 
  On May 5, 2015, at 10:04 AM, jeremy p athomewithagroove...@gmail.com
  wrote:
 
  Thank you for your response!
 
  So I guess 'salt' is a bit of a misnomer.  What I used to do is this :
 
  1) Say that my key value is something like '1234foobar'
  2) I obtain the hash of '1234foobar'.  Let's say that's '54824923'
  3) I mod the hash by my number of regions.  Let's say I have 2000
  regions.
  54824923 % 2000 = 923
  4) I prepend that value to my original key value, so my new key is
  '923_1234foobar'
 
  Is