Exploring Flash and Compression Acceleration

2018-05-04 Thread Thad Omura
Hello HBase Community,

We have been doing a lot of exploratory work into different modes of Flash 
acceleration for HBase, specifically adding Flash to HDD-based systems in both 
ONE_SSD and BucketCache modes.  In addition, we are exploring the impact of 
hardware acceleration for GZIP compression.  We have been able to demonstrate 
some nice gains on YCSB synthetic workloads (complete system benchmark info 
and results are summarized at 
http://www.scaleflux.com/downloads/ScaleFlux_HBase_Solution_Brief.pdf), but 
we'd like to reach out to the community to see if there are any HBase users 
who'd like to try this in a real-world workload environment and report back 
their findings.  We'd provide our PCIe cards (Flash storage / GZIP 
acceleration) for your cluster and publish the results for the community.
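For readers who want to experiment with comparable setups on stock HBase/HDFS, the two modes mentioned above map to standard configuration knobs. A sketch (the `/hbase` path, SSD mount point, and cache size are illustrative, not ScaleFlux-specific):

```shell
# ONE_SSD: keep one replica of HBase's HDFS data on SSD, the rest on disk.
hdfs storagepolicies -setStoragePolicy -path /hbase -policy ONE_SSD

# BucketCache: a file-backed L2 block cache on the SSD, configured in
# hbase-site.xml (size is in MB in HBase 1.x), e.g.:
#   <property>
#     <name>hbase.bucketcache.ioengine</name>
#     <value>file:/ssd/bucketcache</value>
#   </property>
#   <property>
#     <name>hbase.bucketcache.size</name>
#     <value>32768</value>
#   </property>
```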

Thanks for your consideration - you can either request a POC at the bottom of 
http://www.scaleflux.com/hbase.html or email me at 
thad.om...@scaleflux.com.

Thanks,
Thad

Thad Omura | EVP Marketing | ScaleFlux | 
thad.om...@scaleflux.com | Mobile 408-750-6236 
| 97 E. Brokaw Rd., Ste 260, San Jose, CA  95112
www.scaleflux.com | 
LinkedIn | 
Facebook | #compute2data



Re: [HELP] on AWS EC2, responseTooSlow and rollback when dropping a table with a lot of regions

2018-05-04 Thread Ted Yu
For #2, the rollback was caused by the exception, not by the
responseTooSlow warning:

state=ROLLEDBACK exec-time=1mins,59.108sec
exception=org.apache.hadoop.hbase.TableNotFoundException:
mytable

For #3, 179108 is the processing time in milliseconds (it matches the
duration shown above). Since it exceeded the 10,000 ms threshold, the
responseTooSlow warning was logged; the warning only reports a slow call,
it does not abort it.
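The numbers in the warning line are internally consistent, which you can check from the log's own timestamps (a quick sanity check, not HBase code; values copied from the log above):

```python
# Sanity-check the responseTooSlow warning using values from the log.
start_ms = 1524832798483   # "starttimems" in the warning JSON (12:39:58,483)
processing_ms = 179108     # "processingtimems" in the warning JSON

# The warning line is stamped 12:42:57,592; elapsed wall-clock time since
# the request arrived at 12:39:58,483:
warn_ms = (12 * 3600 + 42 * 60 + 57) * 1000 + 592
recv_ms = (12 * 3600 + 39 * 60 + 58) * 1000 + 483
elapsed_ms = warn_ms - recv_ms
print(elapsed_ms)                 # 179109 -- within 1 ms of processing_ms

# hbase.ipc.warn.response.time (default 10000 ms) is only a reporting
# threshold: any call slower than it gets this warning.
print(processing_ms > 10000)      # True
```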

I think you only posted part of the master log, right?
From 12:39:58 to 12:42:57, did the master log contain more lines?

Please turn DEBUG logging on if you haven't already.
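For HBase 1.x that typically means editing conf/log4j.properties on the master (a minimal sketch, assuming the stock log4j setup; the level can also be changed live via the /logLevel servlet on the master web UI):

```
# conf/log4j.properties: raise HBase's own loggers to DEBUG
log4j.logger.org.apache.hadoop.hbase=DEBUG
```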

On Fri, May 4, 2018 at 6:34 AM, 李响  wrote:

> Dear community,
>
> We would really appreciate it if someone could provide some guidance on
> how to debug this.
>
> We are using HBase 1.1.9 (+ rsgroup patch) on AWS EC2 instances, and I
> found that dropping a table with a lot of regions (300+) is quite slow.
>
> The Master's log shows:
>
> 2018-04-27 12:39:58,483 INFO
> [B.defaultRpcServer.handler=11,queue=2,port=16000] master.HMaster:
> Client=ec2-user/null delete mytable
> 2018-04-27 12:39:58,737 INFO  [ProcedureExecutorThread-0]
> hbase.MetaTableAccessor: Deleted [{ENCODED =>
> 0b3a3c3e991b38dd0921905f7200f094, NAME =>
> 'mytable,,1524805200089.0b3a3c3e991b38dd0921905f7200f094.', STARTKEY =>
> '',
> ENDKEY => '01-000-##'}, {ENCODED =>
> 5fba610c04629121e53fec53af869fe3, NAME =>
> 'mytable,01-000-##,1524805200089.
> 5fba610c04629121e53fec53af869fe3.',
> STARTKEY => '01-000-##', ENDKEY =>
> '02-000-##'}, {ENCODED =>
> 653dbf4a1aba1fedb56fea9522d7c96a, NAME =>
> 'mytable,02-000-##,1524805200089.
> 653dbf4a1aba1fedb56fea9522d7c96a.',
> STARTKEY => '02-000-##', ENDKEY =>
> '03-000-##'}, {ENCODED =>
> 7ba6bce2e7cc53a9bcc234244c835ac8, NAME =>
> 'mytable,03-000-##,1524805200089.
> 7ba6bce2e7cc53a9bcc234244c835ac8.',
> STARTKEY => '03-000-##', ENDKEY =>
> '04-000-##'}, {ENCODED =>
> 5106e0ca76d2c026d2d8ff5ce59316cc,...
> 2018-04-27 12:40:58,687 INFO
> [B.defaultRpcServer.handler=20,queue=2,port=16000] master.HMaster:
> Client=ec2-user/null delete mytable
> 2018-04-27 12:42:57,592 WARN
> [B.defaultRpcServer.handler=11,queue=2,port=16000] ipc.RpcServer:
> (responseTooSlow):
> {"call":"DeleteTable(org.apache.hadoop.hbase.protobuf.
> generated.MasterProtos$DeleteTableRequest)","starttimems":1524832798483,"
> responsesize":4,"method":"DeleteTable","param":"TODO:
> class
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$
> DeleteTableRequest","processingtimems":179108,"client":"
> 10.22.22.236:45640","queuetimems":0,"class":"HMaster"}
> 2018-04-27 12:42:57,924 INFO [ProcedureExecutorThread-3]
> procedure2.ProcedureExecutor: Rolledback procedure DeleteTableProcedure
> (table=mytable) id=16515 owner=ec2-user state=ROLLEDBACK exec-time=1mins,
> 59.108sec exception=org.apache.hadoop.hbase.TableNotFoundException:
> mytable
>
>
> The stack trace looks like:
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.
> TableNotFoundException):
> mytable
> at
> org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.
> java:2216)
> at
> org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.
> prepareDelete(DeleteTableProcedure.java:255)
> at
> org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.
> executeFromState(DeleteTableProcedure.java:101)
> ...
>
>
> A few things seem odd to me:
> 1. Why is the following message logged twice (with different handlers, 20
> and 11)?  Does it mean MasterProcedureUtil.NonceProcedureRunnable#run()
> is called twice and handled by different handlers?
>    INFO [B.defaultRpcServer.handler=20(or 11),queue=2,port=16000]
> master.HMaster: Client=ec2-user/null delete mytable
>
> 2. Is the rollback triggered by the "responseTooSlow" warning?
>
> 3. I left hbase.ipc.warn.response.time at its default value (10,000 ms);
> why is the responseTooSlow warning reported with
> "processingtimems":179108?
>
> 4. How can I debug this further?
> --
>
>李响 Xiang Li
>
> e-mail: wate...@gmail.com
>


[HELP] on AWS EC2, responseTooSlow and rollback when dropping a table with a lot of regions

2018-05-04 Thread 李响
Dear community,

We would really appreciate it if someone could provide some guidance on how
to debug this.

We are using HBase 1.1.9 (+ rsgroup patch) on AWS EC2 instances, and I
found that dropping a table with a lot of regions (300+) is quite slow.

The Master's log shows:

2018-04-27 12:39:58,483 INFO
[B.defaultRpcServer.handler=11,queue=2,port=16000] master.HMaster:
Client=ec2-user/null delete mytable
2018-04-27 12:39:58,737 INFO  [ProcedureExecutorThread-0]
hbase.MetaTableAccessor: Deleted [{ENCODED =>
0b3a3c3e991b38dd0921905f7200f094, NAME =>
'mytable,,1524805200089.0b3a3c3e991b38dd0921905f7200f094.', STARTKEY => '',
ENDKEY => '01-000-##'}, {ENCODED =>
5fba610c04629121e53fec53af869fe3, NAME =>
'mytable,01-000-##,1524805200089.5fba610c04629121e53fec53af869fe3.',
STARTKEY => '01-000-##', ENDKEY =>
'02-000-##'}, {ENCODED =>
653dbf4a1aba1fedb56fea9522d7c96a, NAME =>
'mytable,02-000-##,1524805200089.653dbf4a1aba1fedb56fea9522d7c96a.',
STARTKEY => '02-000-##', ENDKEY =>
'03-000-##'}, {ENCODED =>
7ba6bce2e7cc53a9bcc234244c835ac8, NAME =>
'mytable,03-000-##,1524805200089.7ba6bce2e7cc53a9bcc234244c835ac8.',
STARTKEY => '03-000-##', ENDKEY =>
'04-000-##'}, {ENCODED =>
5106e0ca76d2c026d2d8ff5ce59316cc,...
2018-04-27 12:40:58,687 INFO
[B.defaultRpcServer.handler=20,queue=2,port=16000] master.HMaster:
Client=ec2-user/null delete mytable
2018-04-27 12:42:57,592 WARN
[B.defaultRpcServer.handler=11,queue=2,port=16000] ipc.RpcServer:
(responseTooSlow):
{"call":"DeleteTable(org.apache.hadoop.hbase.protobuf.generated.MasterProtos$DeleteTableRequest)","starttimems":1524832798483,"responsesize":4,"method":"DeleteTable","param":"TODO:
class
org.apache.hadoop.hbase.protobuf.generated.MasterProtos$DeleteTableRequest","processingtimems":179108,"client":"
10.22.22.236:45640","queuetimems":0,"class":"HMaster"}
2018-04-27 12:42:57,924 INFO [ProcedureExecutorThread-3]
procedure2.ProcedureExecutor: Rolledback procedure DeleteTableProcedure
(table=mytable) id=16515 owner=ec2-user state=ROLLEDBACK exec-time=1mins,
59.108sec exception=org.apache.hadoop.hbase.TableNotFoundException: mytable


The stack trace looks like:
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableNotFoundException):
mytable
at
org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:2216)
at
org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.prepareDelete(DeleteTableProcedure.java:255)
at
org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.executeFromState(DeleteTableProcedure.java:101)
...


A few things seem odd to me:
1. Why is the following message logged twice (with different handlers, 20
and 11)?  Does it mean MasterProcedureUtil.NonceProcedureRunnable#run()
is called twice and handled by different handlers?
   INFO [B.defaultRpcServer.handler=20(or 11),queue=2,port=16000]
master.HMaster: Client=ec2-user/null delete mytable

2. Is the rollback triggered by the "responseTooSlow" warning?

3. I left hbase.ipc.warn.response.time at its default value (10,000 ms);
why is the responseTooSlow warning reported with
"processingtimems":179108?

4. How can I debug this further?
-- 

   李响 Xiang Li

e-mail: wate...@gmail.com