[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629369#comment-15629369 ]

Ted Yu commented on HBASE-14417:
--------------------------------

I observed this in the TestHRegionServerBulkLoad output for the versions (v11 
and earlier) where the bulk load marker is written directly to the hbase:backup 
table in the postAppend hook:
{code}
2016-09-13 23:10:14,072 DEBUG [B.defaultRpcServer.handler=4,queue=0,port=35667] ipc.CallRunner(112): B.defaultRpcServer.handler=4,queue=0,port=35667: callId: 10646 service: ClientService methodName: Scan size: 264 connection: 172.18.128.12:59780
org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 60000 ms. regionName=atomicBulkLoad,,1473808150804.6b6c67612b01bce3348c144b959b7f0e., server=cn012.l42scl.hortonworks.com,35667,1473808145352
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7744)
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7725)
  at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:7634)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2588)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2582)
  at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2569)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33516)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2229)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:136)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:111)
  at java.lang.Thread.run(Thread.java:745)
{code}
Here was the state of the BulkLoadHandler thread (stuck):
{code}
"RS:0;cn012:36301.append-pool9-t1" #453 prio=5 os_prio=0 tid=0x00007fc3945bb000 
nid=0x18ec in Object.wait() [0x00007fc30dada000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at 
org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1727)
  - locked <0x0000000794750580> (a java.util.concurrent.atomic.AtomicLong)
  at 
org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1756)
  at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:241)
  at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
  - locked <0x0000000794750048> (a 
org.apache.hadoop.hbase.client.BufferedMutatorImpl)
  at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
  at 
org.apache.hadoop.hbase.backup.impl.BackupSystemTable.writeBulkLoadDesc(BackupSystemTable.java:227)
  at 
org.apache.hadoop.hbase.backup.impl.BulkLoadHandler.postAppend(BulkLoadHandler.java:83)
  at 
org.apache.hadoop.hbase.regionserver.wal.FSHLog.postAppend(FSHLog.java:1448)
{code}
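Reading the two traces together, the hang appears to be self-inflicted: the 
synchronous write to hbase:backup is issued from the WAL append-pool thread 
inside FSHLog.postAppend, and that write cannot finish flushing (it sits in 
BufferedMutatorImpl.flush / AsyncProcess) while the bulk load still holds the 
region lock, so scanners time out with RegionTooBusyException after 60000 ms. 
Below is a minimal sketch of the v11-style hook; BulkLoadHandler lives in the 
patch rather than trunk, so the method shape, row layout and class name here 
are illustrative assumptions, not the patch's actual code:
{code}
// Hedged sketch of the v11-era approach: the bulk load marker is written
// synchronously to hbase:backup from the WAL append-pool thread.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadMarkerSketch {
  private final Connection connection;

  public BulkLoadMarkerSketch(Configuration conf) throws Exception {
    this.connection = ConnectionFactory.createConnection(conf);
  }

  /** Invoked from FSHLog.postAppend when the appended edit carries a bulk load marker. */
  public void postAppend(byte[] tableName, byte[] hfilePath) throws Exception {
    try (Table backup = connection.getTable(TableName.valueOf("hbase:backup"))) {
      Put put = new Put(Bytes.add(tableName, Bytes.toBytes(":"), hfilePath));
      put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("path"), hfilePath);
      // Synchronous put: in the traces above this is where the append-pool
      // thread blocks (BufferedMutatorImpl.flush never returns), while the
      // bulk load keeps the region lock on the data table.
      backup.put(put);
    }
  }
}
{code}
If that reading is right, the stuck thread is not an RPC handler, so adding 
handlers would not be expected to break the cycle, which matches the 
experiment below.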
Even increasing the handler count didn't help:
{code}
diff --git a/hbase-server/src/test/resources/hbase-site.xml b/hbase-server/src/test/resources/hbase-site.xml
index bca90a3..829fcc9 100644
--- a/hbase-server/src/test/resources/hbase-site.xml
+++ b/hbase-server/src/test/resources/hbase-site.xml
@@ -30,6 +30,10 @@
     </description>
   </property>
   <property>
+    <name>hbase.backup.enable</name>
+    <value>true</value>
+  </property>
+  <property>
     <name>hbase.defaults.for.version.skip</name>
     <value>true</value>
   </property>
@@ -48,11 +52,11 @@
   </property>
   <property>
     <name>hbase.regionserver.handler.count</name>
-    <value>5</value>
+    <value>50</value>
   </property>
   <property>
{code}
Post v11, the data stored in zookeeper is temporary: once an incremental backup 
is run for the table receiving bulk loads, the data is stored under that backup 
Id and removed from zookeeper.
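For illustration, here is a minimal sketch of that flow against the plain 
ZooKeeper client. The znode layout, the names and the BackupSink interface are 
assumptions made up for the example, not the layout used by the patch; the 
point is only that recording a bulk load becomes a cheap znode write on the 
region server side, while the write into the backup system table happens later, 
when the incremental backup runs:
{code}
// Hedged sketch of the post-v11 flow: bulk load markers go to zookeeper first
// and are drained when an incremental backup runs for the table.
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class BulkLoadZkSketch {
  // Assumed base path; the parent chain is expected to exist (e.g. created at startup).
  private static final String BASE = "/hbase/backup/bulkload";
  private final ZooKeeper zk;

  public BulkLoadZkSketch(ZooKeeper zk) {
    this.zk = zk;
  }

  /** Record one bulk-loaded HFile for a table (no write to the backup table yet). */
  public void recordBulkLoad(String table, byte[] hfilePathBytes) throws Exception {
    String parent = BASE + "/" + table;
    if (zk.exists(parent, false) == null) {
      zk.create(parent, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
    // One sequential child per bulk load event.
    zk.create(parent + "/event-", hfilePathBytes,
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
  }

  /** At incremental backup time: read the events, persist them under the backup Id,
   *  then remove them from zookeeper, so the markers are only temporary. */
  public void drainTo(String table, String backupId, BackupSink sink) throws Exception {
    String parent = BASE + "/" + table;
    List<String> children = zk.getChildren(parent, false);
    for (String child : children) {
      String path = parent + "/" + child;
      byte[] hfilePath = zk.getData(path, false, null);
      sink.store(backupId, table, hfilePath);   // e.g. a put into the backup system table
      zk.delete(path, -1);                      // gone from zookeeper after the backup
    }
  }

  /** Stand-in for the write into the backup system table. */
  public interface BackupSink {
    void store(String backupId, String table, byte[] hfilePath) throws Exception;
  }
}
{code}
Since the znode writes happen outside the WAL append path, the region server 
side no longer blocks the way the v11 hook did.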

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, 
> 14417.v2.txt, 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 
> 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create a new full backup of the 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE
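For context on why WAL-based incremental backup misses this data: a bulk load 
moves ready-made HFiles straight into the table's store directories, e.g. via 
LoadIncrementalHFiles, so no edits ever pass through the WAL. A rough sketch of 
the client side follows; the exact API varies across versions, and this assumes 
the 1.x-era org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles:
{code}
// Hedged sketch of a typical bulk load. The HFiles under hfileDir were produced
// elsewhere (e.g. by HFileOutputFormat2) and are adopted by the regions directly,
// so nothing is written to the WAL and a WAL-based incremental backup never sees
// this data.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Path hfileDir = new Path(args[0]);              // directory of prepared HFiles
    TableName tableName = TableName.valueOf(args[1]);
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(tableName);
         RegionLocator locator = conn.getRegionLocator(tableName);
         Admin admin = conn.getAdmin()) {
      new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, admin, table, locator);
    }
  }
}
{code}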


