[jira] [Updated] (HBASE-15411) Rewrite backup with Procedure V2

2016-03-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-15411:
---
Attachment: 15411-v7.txt

Patch v7 renames the HBaseAdmin method to backupTables and adds BackupType enum 
to this method so that the same method can be used to request both types of 
backups.
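
For reference, a hedged sketch of the API shape described; the interface name, parameter list, and return type below are assumptions for illustration, not the patch's actual signature:
{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.TableName;

/** Backup flavors selectable through the single entry point. */
enum BackupType { FULL, INCREMENTAL }

interface BackupClient {
  /**
   * Request a backup of the given tables; the BackupType argument selects
   * a full or an incremental backup through the same method.
   */
  String backupTables(BackupType type, List<TableName> tables, String targetRootDir)
      throws IOException;
}
{code}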

> Rewrite backup with Procedure V2
> 
>
> Key: HBASE-15411
> URL: https://issues.apache.org/jira/browse/HBASE-15411
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 15411-v1.txt, 15411-v3.txt, 15411-v5.txt, 15411-v6.txt, 
> 15411-v7.txt, FullTableBackupProcedure.java
>
>
> Currently full / incremental backup is driven by BackupHandler (see call() 
> method for flow).
> This issue is to rewrite the flow using Procedure V2.
> States (enum) for full / incremental backup would be introduced in 
> Backup.proto which correspond to the steps performed in BackupHandler#call().
> executeFromState() would pace the backup based on the current state.
> serializeStateData() / deserializeStateData() would be used to persist state 
> into procedure WAL.
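
For orientation, a minimal sketch of the StateMachineProcedure shape the description outlines; the state names, transitions, and empty serialization bodies are illustrative assumptions, not code from the attached patch:
{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.hbase.master.procedure.MasterProcedureEnv;
import org.apache.hadoop.hbase.procedure2.StateMachineProcedure;

public class FullTableBackupProcedure
    extends StateMachineProcedure<MasterProcedureEnv, FullTableBackupProcedure.State> {

  /** Illustrative states; the real enum would be generated from Backup.proto. */
  public enum State { PREPARE, SNAPSHOT, SNAPSHOT_COPY }

  @Override
  protected Flow executeFromState(MasterProcedureEnv env, State state) {
    // Pace the backup: do the work for the current state, then advance.
    switch (state) {
      case PREPARE:
        setNextState(State.SNAPSHOT);
        return Flow.HAS_MORE_STATE;
      case SNAPSHOT:
        setNextState(State.SNAPSHOT_COPY);
        return Flow.HAS_MORE_STATE;
      case SNAPSHOT_COPY:
        return Flow.NO_MORE_STATE; // backup complete
      default:
        throw new UnsupportedOperationException("unhandled state=" + state);
    }
  }

  @Override
  protected void rollbackState(MasterProcedureEnv env, State state) {
    // Undo the work performed in the given state (illustrative no-op).
  }

  @Override protected State getState(int stateId) { return State.values()[stateId]; }
  @Override protected int getStateId(State state) { return state.ordinal(); }
  @Override protected State getInitialState() { return State.PREPARE; }

  @Override
  protected void serializeStateData(OutputStream stream) throws IOException {
    // Persist enough state to the procedure WAL to resume after a restart.
  }

  @Override
  protected void deserializeStateData(InputStream stream) throws IOException {
    // Restore the state written by serializeStateData().
  }
}
{code}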



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15130) Backport 0.98 Scan different TimeRange for each column family

2016-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191577#comment-15191577
 ] 

Hadoop QA commented on HBASE-15130:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 10m 27s 
{color} | {color:red} Docker failed to build yetus/hbase:date2016-03-11. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12792592/HBASE-15130-0.98.v4.patch
 |
| JIRA Issue | HBASE-15130 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/942/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Backport 0.98 Scan different TimeRange for each column family 
> --
>
> Key: HBASE-15130
> URL: https://issues.apache.org/jira/browse/HBASE-15130
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Affects Versions: 0.98.17
>Reporter: churro morales
>Assignee: churro morales
> Fix For: 0.98.19
>
> Attachments: HBASE-15130-0.98.patch, HBASE-15130-0.98.v1.patch, 
> HBASE-15130-0.98.v1.patch, HBASE-15130-0.98.v2.patch, 
> HBASE-15130-0.98.v3.patch, HBASE-15130-0.98.v4.patch
>
>
> branch 98 version backport for HBASE-14355



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck

2016-03-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191553#comment-15191553
 ] 

Sangjin Lee commented on HBASE-15436:
-

Thanks [~anoop.hbase] for your comments. To answer your question,

bq. So after seeing this log how long u wait?

I believe the user tried to shut it down about 30 minutes after this failure:
{noformat}
Fri Feb 26 00:39:03 IST 2016, null, java.net.SocketTimeoutException: 
callTimeout=6, callDuration=68065: row 
'timelineservice.entity,naga!yarn_cluster!flow_1456425026132_1!���!M�����!YARN_CONTAINER!container_1456425026132_0001_01_01,99'
 on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
hostname=localhost,16201,1456365764939, seqNum=0
...
2016-02-26 01:09:19,799 ERROR 
org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: 
SIGTERM
{noformat}

Also, I'm not too sure it is the case that flush is still going through the 
mutations. This is the stack trace of the thread that was in the {{flush()}} 
call (taken *after* this exception was seen):

{noformat}
"pool-14-thread-1" prio=10 tid=0x7f4215268000 nid=0x46e6 waiting on 
condition [0x7f41fe75d000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xeeb5a010> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at 
java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
at 
org.apache.hadoop.hbase.util.BoundedCompletionService.take(BoundedCompletionService.java:75)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:190)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at 
org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:211)
at 
org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:185)
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1200)
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1109)
at 
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369)
at 
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
- locked <0xc246f268> (a 
org.apache.hadoop.hbase.client.BufferedMutatorImpl)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.common.BufferedMutatorDelegator.flush(BufferedMutatorDelegator.java:66)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.flush(HBaseTimelineWriterImpl.java:457)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager$WriterFlushTask.run(TimelineCollectorManager.java:230)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}

The stack trace strongly indicates that it is waiting for more tasks to be 
completed and *is idle*. I wasn't the one who observed this, and don't have any 
more thread dumps around that time.

> BufferedMutatorImpl.flush() appears to get stuck
> 
>
> Key: HBASE-15436
> URL: https://issues.apache.org/jira/browse/HBASE-15436
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.0.2
>Reporter: Sangjin Lee
> Attachments: hbaseException.log, threaddump.log
>
>
> We noticed an instance where the thread that was executing a flush 
> ({{BufferedMutatorImpl.flush()}}) appeared to get stuck.

[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large

2016-03-11 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-15430:

Attachment: HBASE-15430-v4.patch

[~miloveme] take a look at v4 and see if you are ok with it. It looks like, 
with a couple of changes in SnapshotTestingUtil, we can avoid all the manual 
protobuf coding and HRegion instantiation in the test.

> Failed taking snapshot - Manifest proto-message too large
> -
>
> Key: HBASE-15430
> URL: https://issues.apache.org/jira/browse/HBASE-15430
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 0.98.11
>Reporter: JunHo Cho
>Assignee: JunHo Cho
>Priority: Critical
> Fix For: 0.98.18
>
> Attachments: HBASE-15430-v4.patch, hbase-15430-v1.patch, 
> hbase-15430-v2.patch, hbase-15430-v3.branch.0.98.patch, hbase-15430.patch
>
>
> The default size limit for a protobuf message is 64MB, but the size of the 
> snapshot meta is over 64MB.
> Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed 
> taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to 
> exception:Protocol message was too large.  May be malicious.  Use 
> CodedInputStream.setSizeLimit() to increase the size 
> limit.:com.google.protobuf.InvalidProtocolBufferException: Protocol message 
> was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to 
> increase the size limit.
> at 
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:307)
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:341)
> ... 10 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol 
> message was too large.  May be malicious.  Use 
> CodedInputStream.setSizeLimit() to increase the size limit.
> at 
> com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at 
> com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at 
> com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
> at 
> com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.(HBaseProtos.java:3767)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.(HBaseProtos.java:3699)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3815)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3810)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1152)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1094)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3858)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3792)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889)
> at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.parseFrom(SnapshotProtos.java:4094)
> at 
> org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:433)
> at 
> org.apache.hadoop.hbase.snapshot.SnapshotManifest.load(SnapshotMani
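
For reference, a minimal sketch (not the patch itself) of the remedy the exception text names: raising the CodedInputStream size limit before parsing the manifest. The helper name and the 128MB figure are assumptions.
{code}
import java.io.IOException;
import java.io.InputStream;

import com.google.protobuf.CodedInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos;

static SnapshotProtos.SnapshotDataManifest readDataManifest(FileSystem fs, Path manifestFile)
    throws IOException {
  try (InputStream in = fs.open(manifestFile)) {
    CodedInputStream cis = CodedInputStream.newInstance(in);
    cis.setSizeLimit(128 * 1024 * 1024); // default limit is 64MB
    return SnapshotProtos.SnapshotDataManifest.parseFrom(cis);
  }
}
{code}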

[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-03-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191477#comment-15191477
 ] 

stack commented on HBASE-15406:
---

To be more clear, -1 on a patch that has master doing a rollback of a state set 
by external administrator's tool. HBCK already leaves the cluster in a state of 
disequilibrium when killed mid-processing... Usual way this is addressed is 
HBCK gets rerun... not master does cleanup.

On the patch, the admin additions look useful. The serialization of data to zk 
needs to be a pb w/ the PBUF magic as preface. See other places where we do zk 
serialization.
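
A minimal sketch of that convention, assuming a {{ZooKeeperWatcher}} and a target znode; the serialized protobuf bytes themselves are taken as given:
{code}
import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
import org.apache.hadoop.hbase.zookeeper.ZKUtil;
import org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher;
import org.apache.zookeeper.KeeperException;

static void writeSwitchState(ZooKeeperWatcher zkw, String znode, byte[] pbBytes)
    throws KeeperException {
  // Preface the serialized protobuf with the PBUF magic, as is done for
  // other data HBase stores in zk.
  byte[] data = ProtobufUtil.prependPBMagic(pbBytes);
  ZKUtil.setData(zkw, znode, data);
}
{code}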



> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, wip.patch
>
>
> This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster
> Terminate hbck early
> Enter hbase shell where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to default 
> value after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15448) HBase Backup Phase 3: Restore optimization 2

2016-03-11 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15448:
--
Summary: HBase Backup Phase 3: Restore optimization 2  (was: HBase Backup 
Phase 2: Restore optimization 2)

> HBase Backup Phase 3: Restore optimization 2
> 
>
> Key: HBASE-15448
> URL: https://issues.apache.org/jira/browse/HBASE-15448
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> JIRA opened to continue work on restore optimization.
> This will focus on the following
> # During incremental backup image restore - restoring full backup into region 
> boundaries of the most recent incremental  backup image.
> # Combining multiple tables into single M/R job 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191426#comment-15191426
 ] 

Andrew Purtell edited comment on HBASE-15431 at 3/11/16 7:29 PM:
-

bq.  After that I no longer see any "hot method too big" messages. But 
performance actually seemed to be slower. 

This is why you normally shouldn't try to tune inlining heuristics. :-)

bq. Note that almost all interesting methods using during scanning are not 
inlined.

That's because interesting == nontrivial == big, right? 

bq. But as I said, it's not a trivial quick fix, it's need to rethinking of the 
structure of this method. 

I will venture the opinion that JVM compiler heuristics are pretty good. 
However it does make sense to look at methods right on the edge that show up as 
hot. By all means tinker and see if we can get an improvement after a refactor. 
It's a bit like prospecting though, you'll have to drill in a bunch of places 
before striking oil.

Edit: And the payoff you see will depend on and vary between major JVM compiler 
drops. 7u versus 8u. 8u60 versus earlier versions, etc etc.


was (Author: apurtell):
bq.  After that I no longer see any "hot method too big" messages. But 
performance actually seemed to be slower. 

This is why you normally shouldn't try to tune inlining heuristics. :-)

bq. Note that almost all interesting methods using during scanning are not 
inlined.

That's because interesting == nontrivial == big, right? 

bq. But as I said, it's not a trivial quick fix, it's need to rethinking of the 
structure of this method. 

I will venture the opinion that JVM compiler heuristics are pretty good. 
However it does make sense to look at methods right on the edge that show up as 
hot. By all means tinker and see if we can get an improvement after a refactor. 
It's a bit like prospecting though, you'll have to drill in a bunch of places 
before striking oil.

> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot, but actually didn't see any improvement).
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15130) Backport 0.98 Scan different TimeRange for each column family

2016-03-11 Thread churro morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

churro morales updated HBASE-15130:
---
Status: Patch Available  (was: Reopened)

> Backport 0.98 Scan different TimeRange for each column family 
> --
>
> Key: HBASE-15130
> URL: https://issues.apache.org/jira/browse/HBASE-15130
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Affects Versions: 0.98.17
>Reporter: churro morales
>Assignee: churro morales
> Fix For: 0.98.19
>
> Attachments: HBASE-15130-0.98.patch, HBASE-15130-0.98.v1.patch, 
> HBASE-15130-0.98.v1.patch, HBASE-15130-0.98.v2.patch, 
> HBASE-15130-0.98.v3.patch, HBASE-15130-0.98.v4.patch
>
>
> branch 98 version backport for HBASE-14355



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191426#comment-15191426
 ] 

Andrew Purtell commented on HBASE-15431:


bq.  After that I no longer see any "hot method too big" messages. But 
performance actually seemed to be slower. 

This is why you normally shouldn't try to tune inlining heuristics. :-)

bq. Note that almost all interesting methods using during scanning are not 
inlined.

That's because interesting == nontrivial == big, right? 

bq. But as I said, it's not a trivial quick fix, it's need to rethinking of the 
structure of this method. 

I will venture the opinion that JVM compiler heuristics are pretty good. 
However it does make sense to look at methods right on the edge that show up as 
hot. By all means tinker and see if we can get an improvement after a refactor. 
It's a bit like prospecting though, you'll have to drill in a bunch of places 
before striking oil.

> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot, but actually didn't see any improvement).
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15448) HBase Backup Phase 2: Restore optimization 2

2016-03-11 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15448:
--
Description: 
JIRA opened to continue work on restore optimization.

This will focus on the following

# During incremental backup image restore - restoring full backup into region 
boundaries of the most recent incremental  backup image.
# Combining multiple tables into single M/R job 

  was:JIRA opened to continue work on restore optimization


> HBase Backup Phase 2: Restore optimization 2
> 
>
> Key: HBASE-15448
> URL: https://issues.apache.org/jira/browse/HBASE-15448
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> JIRA opened to continue work on restore optimization.
> This will focus on the following
> # During incremental backup image restore - restoring full backup into region 
> boundaries of the most recent incremental  backup image.
> # Combining multiple tables into single M/R job 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15331) HBase Backup/Restore Phase 2: Optimized Restore operation

2016-03-11 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-15331.
---
Resolution: Fixed

The work continues in HBASE-15448

> HBase Backup/Restore Phase 2: Optimized Restore operation
> -
>
> Key: HBASE-15331
> URL: https://issues.apache.org/jira/browse/HBASE-15331
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> The current implementation for restore uses a WALReplay M/R job. This has 
> performance and stability problems, since it uses the HBase client API to 
> insert data. We have to migrate to a bulk load approach: generate hfiles 
> directly from snapshot and incremental images. We currently run a separate M/R 
> job for every backup image between the last FULL backup and the incremental 
> backup we restore to, and for every table in the list (image). If we have 10 
> tables and 30 days of incremental backup images, this results in 30x10 = 300 
> M/R jobs. MUST be optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15449) HBase Backup Phase 2: Support physical table layout change

2016-03-11 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15449:
--
Summary: HBase Backup Phase 2: Support physical table layout change   (was: 
HBase Backup Phase 2: support physical table layout change )

> HBase Backup Phase 2: Support physical table layout change 
> ---
>
> Key: HBASE-15449
> URL: https://issues.apache.org/jira/browse/HBASE-15449
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> Table operations such as add column family, delete column family, truncate, or 
> delete table may result in a subsequent backup restore failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15449) HBase Backup Phase 2: support physical table layout change

2016-03-11 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-15449:
-

 Summary: HBase Backup Phase 2: support physical table layout 
change 
 Key: HBASE-15449
 URL: https://issues.apache.org/jira/browse/HBASE-15449
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Table operations such as add column family, delete column family, truncate, or 
delete table may result in a subsequent backup restore failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck

2016-03-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191409#comment-15191409
 ] 

Ted Yu commented on HBASE-15406:


Heng:
Can you describe how you tested the latest patch?

Thanks

> Split / merge switch left disabled after early termination of hbck
> --
>
> Key: HBASE-15406
> URL: https://issues.apache.org/jira/browse/HBASE-15406
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15406.patch, HBASE-15406.v1.patch, wip.patch
>
>
> This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday:
> Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster
> Terminate hbck early
> Enter hbase shell where I observed:
> {code}
> hbase(main):001:0> splitormerge_enabled 'SPLIT'
> false
> 0 row(s) in 0.3280 seconds
> hbase(main):002:0> splitormerge_enabled 'MERGE'
> false
> 0 row(s) in 0.0070 seconds
> {code}
> Expectation is that the split / merge switches should be restored to default 
> value after hbck exits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15448) HBase Backup Phase 2: Restore optimization 2

2016-03-11 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-15448:
-

 Summary: HBase Backup Phase 2: Restore optimization 2
 Key: HBASE-15448
 URL: https://issues.apache.org/jira/browse/HBASE-15448
 Project: HBase
  Issue Type: Task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


JIRA opened to continue work on restore optimization



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15331) HBase Backup/Restore Phase 2: Optimized Restore operation

2016-03-11 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191388#comment-15191388
 ] 

Vladimir Rodionov commented on HBASE-15331:
---

Partially resolved (in HBASE-14123-v11 patch). 

Now we restore all intermediate incremental images in a single M/R job instead 
of multiple (one per image), and we use bulk load, not streaming puts. 
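
A hedged sketch of that bulk-load shape (not the patch code; the job's input format and mapper for reading the backup images are elided):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

static void restoreViaBulkLoad(Configuration conf, TableName tableName, Path hfileDir)
    throws Exception {
  try (Connection conn = ConnectionFactory.createConnection(conf);
      Table table = conn.getTable(tableName);
      RegionLocator locator = conn.getRegionLocator(tableName)) {
    Job job = Job.getInstance(conf, "restore-bulkload-" + tableName);
    // One job writes HFiles for all intermediate images of this table...
    HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
    FileOutputFormat.setOutputPath(job, hfileDir);
    // (input format / mapper that reads the backup images elided)
    if (job.waitForCompletion(true)) {
      // ...which are then loaded directly, instead of streamed as Puts.
      new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, conn.getAdmin(), table, locator);
    }
  }
}
{code}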





> HBase Backup/Restore Phase 2: Optimized Restore operation
> -
>
> Key: HBASE-15331
> URL: https://issues.apache.org/jira/browse/HBASE-15331
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> The current implementation for restore uses a WALReplay M/R job. This has 
> performance and stability problems, since it uses the HBase client API to 
> insert data. We have to migrate to a bulk load approach: generate hfiles 
> directly from snapshot and incremental images. We currently run a separate M/R 
> job for every backup image between the last FULL backup and the incremental 
> backup we restore to, and for every table in the list (image). If we have 10 
> tables and 30 days of incremental backup images, this results in 30x10 = 300 
> M/R jobs. MUST be optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15411) Rewrite backup with Procedure V2

2016-03-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191364#comment-15191364
 ] 

Ted Yu commented on HBASE-15411:


In a test run of TestFullBackup#testFullBackupSingle, even though the snapshot 
succeeded:
{code}
2016-03-10 20:33:13,000 DEBUG [B.defaultRpcServer.handler=1,queue=0,port=56481] 
snapshot.SnapshotManager(359): Snapshot '{ 
ss=snapshot_1457670792650_default_test-1457670784996   table=test-1457670784996 
type=FLUSH }' has completed, notifying client.
{code}
there was no occurrence of the following log at the beginning of state 3 (the 
test finished soon after taking the snapshot):
{code}
case SNAPSHOT_COPY:
  // do snapshot copy
  LOG.debug("snapshot copy");
{code}
Tried moving snapshot action into FullTableBackupProcedure - result was the 
same.

> Rewrite backup with Procedure V2
> 
>
> Key: HBASE-15411
> URL: https://issues.apache.org/jira/browse/HBASE-15411
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 15411-v1.txt, 15411-v3.txt, 15411-v5.txt, 15411-v6.txt, 
> FullTableBackupProcedure.java
>
>
> Currently full / incremental backup is driven by BackupHandler (see call() 
> method for flow).
> This issue is to rewrite the flow using Procedure V2.
> States (enum) for full / incremental backup would be introduced in 
> Backup.proto which correspond to the steps performed in BackupHandler#call().
> executeFromState() would pace the backup based on the current state.
> serializeStateData() / deserializeStateData() would be used to persist state 
> into procedure WAL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-15447) Improve javadocs description for Delete methods

2016-03-11 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil reassigned HBASE-15447:


Assignee: Wellington Chevreuil

> Improve javadocs description for Delete methods
> ---
>
> Key: HBASE-15447
> URL: https://issues.apache.org/jira/browse/HBASE-15447
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
>
> The current javadoc for the Delete operation is a bit confusing. Even though 
> the initial section does describe the proper behaviour (calling delete without 
> specifying a timestamp only deletes records from now into the past), this is 
> not as clear in the method-specific description:
> {noformat}
> public Delete(byte[] row)
> Create a Delete operation for the specified row.
> If no further operations are done, this will delete everything associated 
> with the specified row (all versions of all columns in all families).
> Parameters:
> row - row key
> {noformat}
> The description can lead to the conclusion that all versions will be deleted. 
> But that's not true if a row has a future timestamp. Although this is a very 
> unusual scenario (having a future timestamp), it should still be clarified.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15447) Improve javadocs description for Delete methods

2016-03-11 Thread Wellington Chevreuil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15447:
-
Component/s: documentation

> Improve javadocs description for Delete methods
> ---
>
> Key: HBASE-15447
> URL: https://issues.apache.org/jira/browse/HBASE-15447
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Wellington Chevreuil
>Priority: Minor
>
> The current javadoc for the Delete operation is a bit confusing. Even though 
> the initial section does describe the proper behaviour (calling delete without 
> specifying a timestamp only deletes records from now into the past), this is 
> not as clear in the method-specific description:
> {noformat}
> public Delete(byte[] row)
> Create a Delete operation for the specified row.
> If no further operations are done, this will delete everything associated 
> with the specified row (all versions of all columns in all families).
> Parameters:
> row - row key
> {noformat}
> The description can lead to the conclusion that all versions will be deleted. 
> But that's not true if a row has a future timestamp. Although this is a very 
> unusual scenario (having a future timestamp), it should still be clarified.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15447) Improve javadocs description for Delete methods

2016-03-11 Thread Wellington Chevreuil (JIRA)
Wellington Chevreuil created HBASE-15447:


 Summary: Improve javadocs description for Delete methods
 Key: HBASE-15447
 URL: https://issues.apache.org/jira/browse/HBASE-15447
 Project: HBase
  Issue Type: Improvement
Reporter: Wellington Chevreuil
Priority: Minor


The current javadoc for the Delete operation is a bit confusing. Even though the 
initial section does describe the proper behaviour (calling delete without 
specifying a timestamp only deletes records from now into the past), this is not 
as clear in the method-specific description:

{noformat}
public Delete(byte[] row)
Create a Delete operation for the specified row.
If no further operations are done, this will delete everything associated with 
the specified row (all versions of all columns in all families).

Parameters:
row - row key
{noformat}

The description can lead to the conclusion that all versions will be deleted. 
But that's not true if a row has a future timestamp. Although this is a very 
unusual scenario (having a future timestamp), it should still be clarified.
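
For illustration, a small hedged example of the behaviour described; the connection handling is elided and table/column names are made up:
{code}
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

static void futureTimestampSurvivesDelete(Connection conn) throws IOException {
  try (Table table = conn.getTable(TableName.valueOf("t"))) {
    long future = System.currentTimeMillis() + 3600 * 1000L; // one hour ahead
    Put put = new Put(Bytes.toBytes("row1"));
    put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), future, Bytes.toBytes("v"));
    table.put(put);

    // Timestamp defaults to "now", so only versions up to now are masked.
    table.delete(new Delete(Bytes.toBytes("row1")));

    Result r = table.get(new Get(Bytes.toBytes("row1")));
    // r still contains the future-dated cell despite the row-level delete.
  }
}
{code}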



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13963) avoid leaking jdk.tools

2016-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191301#comment-15191301
 ] 

Hudson commented on HBASE-13963:


FAILURE: Integrated in HBase-1.0 #1149 (See 
[https://builds.apache.org/job/HBase-1.0/1149/])
HBASE-13963 Do not leak jdk.tools dependency from hbase-annotations (busbey: 
rev 7947de62853794354dc423c72d1496081af0ef16)
* hbase-examples/pom.xml
* hbase-hadoop2-compat/pom.xml
* hbase-testing-util/pom.xml
* hbase-rest/pom.xml
* hbase-client/pom.xml
* hbase-common/pom.xml
* hbase-protocol/pom.xml


> avoid leaking jdk.tools
> ---
>
> Key: HBASE-13963
> URL: https://issues.apache.org/jira/browse/HBASE-13963
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, documentation
>Reporter: Sean Busbey
>Assignee: Gabor Liptak
>Priority: Critical
> Fix For: 2.0.0, 0.98.14, 1.2.0, 1.3.0, 1.1.4, 1.0.4
>
> Attachments: HBASE-13963.1.patch, HBASE-13963.2.patch
>
>
> Right now hbase-annotations uses jdk7 jdk.tools and exposes that to 
> downstream via hbase-client. We need it for building and using our custom 
> doclet, but can improve a couple of things: 
> -1) We should be using a jdk.tools version based on our java version (use jdk 
> activated profiles to set it)-
> 2) We should not be including any jdk.tools version in our hbase-client 
> transitive dependencies (or other downstream-facing artifacts). 
> Unfortunately, system dependencies are included in transitive resolution, so 
> we'll need to exclude it.
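
For downstream users, a hedged illustration of the exclusion point 2 calls for; the coordinates follow the usual jdk.tools system-dependency convention:
{code}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>${hbase.version}</version>
  <exclusions>
    <exclusion>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}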



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-11 Thread Daniel Pol (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191285#comment-15191285
 ] 

Daniel Pol commented on HBASE-15392:


I'm getting things mixed up now and just want to clear them up a little. We have 
2 separate issues:
1. Single Cell get reads the next block also. This happens every time. I believe 
we are in agreement that this is wrong and can be fixed; you're maybe still 
discussing the best way to fix it.
2. Single row scan (it's a 'get' in hbase shell) reads the next block also. I 
need to test a few more cases, but I believe this happens every time you're 
scanning the last cell in a block. If I understand correctly, the issue is that 
the index entry for the next block actually holds an imaginary key based on the 
current block's row. So unless there's additional info in the indexes or bloom 
filters, there's no way to know for sure there's not another valid cell in the 
next block.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreF

[jira] [Updated] (HBASE-14123) HBase Backup/Restore Phase 2

2016-03-11 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14123:
--
Attachment: HBASE-14123-v11.patch

> HBase Backup/Restore Phase 2
> 
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Attachments: HBASE-14123-v1.patch, HBASE-14123-v10.patch, 
> HBASE-14123-v11.patch, HBASE-14123-v2.patch, HBASE-14123-v3.patch, 
> HBASE-14123-v4.patch, HBASE-14123-v5.patch, HBASE-14123-v6.patch, 
> HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
>
> Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15187) Integrate CSRF prevention filter to REST gateway

2016-03-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191222#comment-15191222
 ] 

Ted Yu commented on HBASE-15187:


Looks like the two linked JIRAs are both resolved.

> Integrate CSRF prevention filter to REST gateway
> 
>
> Key: HBASE-15187
> URL: https://issues.apache.org/jira/browse/HBASE-15187
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HBASE-15187-branch-1.v13.patch, HBASE-15187.v1.patch, 
> HBASE-15187.v10.patch, HBASE-15187.v11.patch, HBASE-15187.v12.patch, 
> HBASE-15187.v13.patch, HBASE-15187.v2.patch, HBASE-15187.v3.patch, 
> HBASE-15187.v4.patch, HBASE-15187.v5.patch, HBASE-15187.v6.patch, 
> HBASE-15187.v7.patch, HBASE-15187.v8.patch, HBASE-15187.v9.patch
>
>
> HADOOP-12691 introduced a filter in Hadoop Common to help REST APIs guard 
> against cross-site request forgery attacks.
> This issue tracks the integration of that filter into HBase REST gateway.
> From REST section of refguide:
> To delete a table, use a DELETE request with the /schema endpoint:
> http://example.com:8000/schema
> Suppose an attacker hosts a malicious web form on a domain under his control. 
> The form uses the DELETE action targeting a REST URL. Through social 
> engineering, the attacker tricks an authenticated user into accessing the 
> form and submitting it.
> The browser sends the HTTP DELETE request to the REST gateway.
> At the REST gateway, the call is executed and the user's table is dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191185#comment-15191185
 ] 

Anoop Sam John commented on HBASE-15325:


bq. if user only accept complete row, when region moved we will 
clearPartialResults() and set startRow to Bytes.add(lastResult.getRow(), new 
byte[1]), it will be ok.
So the user does not accept partial rows, but the RS may send partial results. 
Say for a row with 10 cells, we got a partial result of 5 cells and then a 
region move happens. Now the new scan is created with start row = current row + 
[0]. So what about the remaining cells in the current row? And if that partial 
result was cleared, what about those cells?

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v2.txt, 
> HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, 
> HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, 
> HBASE-15325-v6.5.txt, HBASE-15325-v6.txt
>
>
> HBASE-11544 allows a scan RPC to return part of a row, to reduce the memory 
> usage of a single RPC request, and the client can setAllowPartialResults or 
> setBatch to get several cells of a row instead of the whole row.
> However, the state of the scanner is saved on the server, and we need it to 
> fetch the next part when the previous result was partial. If the region moves 
> to another RS, the client will get a NotServingRegionException and open a new 
> scanner to the new RS, which will be regarded as a new scan starting from the 
> end of this row. So the remaining cells of the row from the last result will 
> be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-11 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190916#comment-15190916
 ] 

Phil Yang commented on HBASE-15325:
---

I am still working on this. Will submit a new patch this weekend or next week.

This issue appears only when the user sets setBatch or 
setAllowPartialResults(true), because if the user only accepts complete rows, 
then when the region moves we will clearPartialResults() and set startRow to 
Bytes.add(lastResult.getRow(), new byte[1]), and it will be ok.
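
For illustration (not patch code), that restart key computation: appending a single zero byte yields the smallest possible row key strictly greater than the completed row, so the new scanner skips it entirely.
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

static Scan resumeAfterRow(byte[] lastRow, byte[] stopRow) {
  byte[] restartRow = Bytes.add(lastRow, new byte[1]); // lastRow + 0x00
  return new Scan(restartRow, stopRow); // new scanner starts at the next row
}
{code}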

{quote}
So this is another issue related to Filter when batched read in place.
{quote}
Yes, there is another issue about the filter. I found this:
{code}
// If the size limit was reached it means a partial Result is being returned.
// Returning a partial Result means that we should not reset the filters;
// filters should only be reset in between rows
if (!scannerContext.partialResultFormed()) resetFilters();
{code}
So we need to check BATCH_LIMIT_REACHED too, right?

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v2.txt, 
> HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, 
> HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, 
> HBASE-15325-v6.5.txt, HBASE-15325-v6.txt
>
>
> HBASE-11544 allows a scan RPC to return part of a row, to reduce the memory 
> usage of a single RPC request, and the client can setAllowPartialResults or 
> setBatch to get several cells of a row instead of the whole row.
> However, the state of the scanner is saved on the server, and we need it to 
> fetch the next part when the previous result was partial. If the region moves 
> to another RS, the client will get a NotServingRegionException and open a new 
> scanner to the new RS, which will be regarded as a new scan starting from the 
> end of this row. So the remaining cells of the row from the last result will 
> be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception

2016-03-11 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190915#comment-15190915
 ] 

Ashish Singhi commented on HBASE-15433:
---

{quote}
If the snapshot contains less regions than the current table's, 
'checkAndUpdateNamespaceRegionQuota' will update the region count for the 
table, we need to reset the region count in 'catch' block if 'restoreSnapshot' 
throw exception?
{quote}
Sorry, I am not getting your point. 
Correct me if I am wrong. The issue, as I see it, is that if 
{{checkAndUpdateNamespaceRegionQuota}} throws an exception we were still 
updating the table information in the namespace quota, which was not correct. So 
in the catch clause let us first catch QEE and then IOE. If QEE is caught then 
we will not update the quota information.
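
A hedged sketch of that ordering against the snippet quoted below (QEE = QuotaExceededException, a DoNotRetryIOException subclass, so it can be caught before IOException):
{code}
try {
  checkAndUpdateNamespaceRegionQuota(manifest, tableName);
  restoreSnapshot(snapshot, snapshotTableDesc);
} catch (QuotaExceededException qee) {
  // Quota check failed: skip the namespace-quota cleanup so the counts
  // are not decremented for a table that was never changed.
  throw qee;
} catch (IOException e) {
  this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
  LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
      + " as table " + tableName.getNameAsString(), e);
  throw e;
}
{code}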

> SnapshotManager#restoreSnapshot not update table and region count quota 
> correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.0.0
>Reporter: Jianwei Cui
> Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, 
> HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be 
> checked and updated as:
> {code}
>   try {
> // Table already exist. Check and update the region quota for this 
> table namespace
> checkAndUpdateNamespaceRegionQuota(manifest, tableName);
> restoreSnapshot(snapshot, snapshotTableDesc);
>   } catch (IOException e) {
> 
> this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
> LOG.error("Exception occurred while restoring the snapshot " + 
> snapshot.getName()
> + " as table " + tableName.getNameAsString(), e);
> throw e;
>   }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' call will fail if the regions in the 
> snapshot would exceed the region count quota; the table will then be removed 
> in the 'catch' block. This makes the current table count and region count 
> decrease, so a following table creation or region split will succeed even if 
> the actual quota is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception

2016-03-11 Thread Jianwei Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190901#comment-15190901
 ] 

Jianwei Cui commented on HBASE-15433:
-

I get your point. Yes, it will be more concise to remove the 'else' keyword :)

> SnapshotManager#restoreSnapshot not update table and region count quota 
> correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.0.0
>Reporter: Jianwei Cui
> Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, 
> HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be 
> checked and updated as:
> {code}
>   try {
> // Table already exist. Check and update the region quota for this 
> table namespace
> checkAndUpdateNamespaceRegionQuota(manifest, tableName);
> restoreSnapshot(snapshot, snapshotTableDesc);
>   } catch (IOException e) {
> 
> this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
> LOG.error("Exception occurred while restoring the snapshot " + 
> snapshot.getName()
> + " as table " + tableName.getNameAsString(), e);
> throw e;
>   }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' call will fail if the regions in the 
> snapshot would exceed the region count quota; the table will then be removed 
> in the 'catch' block. This makes the current table count and region count 
> decrease, so a following table creation or region split will succeed even if 
> the actual quota is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception

2016-03-11 Thread Jianwei Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190895#comment-15190895
 ] 

Jianwei Cui commented on HBASE-15433:
-

Thanks for your comment; I will make a new unit test and update the patch.
{quote}
I was thinking of a simple fix like, just catch the QuoteExceededException and 
don't remove the table from namespace quota.
{quote}
If the snapshot contains fewer regions than the current table, 
'checkAndUpdateNamespaceRegionQuota' will update the region count for the 
table; don't we need to reset the region count in the 'catch' block if 
'restoreSnapshot' throws an exception?


> SnapshotManager#restoreSnapshot not update table and region count quota 
> correctly when encountering exception
> -
>
> Key: HBASE-15433
> URL: https://issues.apache.org/jira/browse/HBASE-15433
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.0.0
>Reporter: Jianwei Cui
> Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, 
> HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be 
> checked and updated as:
> {code}
>   try {
> // Table already exist. Check and update the region quota for this 
> table namespace
> checkAndUpdateNamespaceRegionQuota(manifest, tableName);
> restoreSnapshot(snapshot, snapshotTableDesc);
>   } catch (IOException e) {
> 
> this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
> LOG.error("Exception occurred while restoring the snapshot " + 
> snapshot.getName()
> + " as table " + tableName.getNameAsString(), e);
> throw e;
>   }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' call will fail if the regions in the 
> snapshot would exceed the region count quota; the table will then be removed 
> in the 'catch' block. This makes the current table count and region count 
> decrease, so a following table creation or region split will succeed even if 
> the actual quota is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190892#comment-15190892
 ] 

Hadoop QA commented on HBASE-15392:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
6s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
7s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 4m 5s 
{color} | {color:red} hbase-server: patch generated 10 new + 175 unchanged - 9 
fixed = 185 total (was 184) {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 55s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 26s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 110m 21s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 224m 35s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Timed out junit tests | 
org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestModifyTableProcedure |
|   | org.apache.hadoop.hbase.snapshot.TestSnapshotClientRetries |
|   | org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
|   | org.apache.hadoop.hbase.master.TestMasterFailover |
|   | org.apache.hadoop.hbase.coprocessor.TestRegionObserverScannerOpenHook |
| 

[jira] [Commented] (HBASE-14030) HBase Backup/Restore Phase 1

2016-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190857#comment-15190857
 ] 

Hadoop QA commented on HBASE-14030:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 0m 12s 
{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
31s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 28s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 31m 
2s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 
56s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red} 0m 8s 
{color} | {color:red} hbase-protocol in the patch failed. {color} |
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red} 0m 9s 
{color} | {color:red} hbase-protocol in the patch failed. {color} |
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red} 0m 12s 
{color} | {color:red} root in the patch failed. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped branch modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 
20s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 50s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 42s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 29m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 
6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 
5s {color} | {color:green} There were no new shellcheck issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s 
{color} | {color:red} The patch has 134 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 5s 
{color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} ha

[jira] [Assigned] (HBASE-15439) Mob compaction is not triggered after extended period of time

2016-03-11 Thread Jingcheng Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du reassigned HBASE-15439:


Assignee: Jingcheng Du

> Mob compaction is not triggered after extended period of time
> -
>
> Key: HBASE-15439
> URL: https://issues.apache.org/jira/browse/HBASE-15439
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Jingcheng Du
>
> I was running the IntegrationTestIngestWithMOB test.
> I lowered the mob compaction chore interval to this value:
> {code}
> <property>
>   <name>hbase.mob.compaction.chore.period</name>
>   <value>6000</value>
> </property>
> {code}
> After a whole night, there was no indication in the master log that mob
> compaction ran.
> All I found was:
> {code}
> 2016-03-09 04:18:52,194 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 05:58:52,516 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 07:38:52,847 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 09:18:52,848 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 10:58:52,932 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 12:38:52,932 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 14:18:52,933 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 15:58:52,957 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 17:38:52,960 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15265) Implement an asynchronous FSHLog

2016-03-11 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190830#comment-15190830
 ] 

Duo Zhang commented on HBASE-15265:
---

{quote}
Now asyncfs also coming in. So how to setup a WAL system with new way of async 
and also have N wals per RS?
{quote}
Currently there is no way...
See the comments above: I think multiwal should not act as a 'WALProvider'
in the config; instead, it should be an orthogonal option.
What do you think? [~anoopsamjohn]
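To sketch the 'option, not provider' idea in hbase-site.xml terms (the
second key below is purely illustrative and not an existing config; the
provider value is the one being discussed on this issue):
{code}
<!-- Existing key: selects the WAL implementation. -->
<property>
  <name>hbase.wal.provider</name>
  <value>asyncfs</value>
</property>
<!-- Illustrative only: an orthogonal switch for N WALs per RS. -->
<property>
  <name>hbase.wal.multiwal.enabled</name>
  <value>true</value>
</property>
{code}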

> Implement an asynchronous FSHLog
> 
>
> Key: HBASE-15265
> URL: https://issues.apache.org/jira/browse/HBASE-15265
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-15265-v1.patch, HBASE-15265-v2.patch, 
> HBASE-15265-v3.patch, HBASE-15265-v4.patch, HBASE-15265-v5.patch, 
> HBASE-15265-v6.patch, HBASE-15265.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15441) Fix WAL splitting when region has moved multiple times

2016-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190820#comment-15190820
 ] 

Hadoop QA commented on HBASE-15441:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
48s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
38s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
20s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 5m 21s 
{color} | {color:red} hbase-server: patch generated 1 new + 289 unchanged - 0 
fixed = 290 total (was 289) {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
37m 25s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 130m 40s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_74. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 140m 22s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
31s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 336m 20s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | 
hadoop.hbase.security.access.TestAccessController |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hbase.security.access.TestAccessController |
|   | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.9.1 Server=1.9.1 Image:yetus/hbase:d

[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190792#comment-15190792
 ] 

Ashish Singhi commented on HBASE-15446:
---

Need to think more about how to make this operation atomic. I will comment
back then.

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190774#comment-15190774
 ] 

Ashish Singhi commented on HBASE-15446:
---

Sorry for the confusion.
When we fail to append the event marker in the WAL, we return false, and
LoadIncrementalHFiles will then internally retry loading these hfiles a
configurable number of times; on a retry we can assume it is a second
attempt and ignore the FNFE.
If it was the first attempt and the file was not found, we may end up
writing an entry for this file in the WAL. Currently I think these WAL
entries are mainly used for replication of bulk loaded data, where we
ignore hfiles that are found neither in the source data directory nor in
the archive directory.
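Roughly, the retry handling described above could look like this (a sketch;
loadHFile() and the attempt bookkeeping are hypothetical stand-ins, not the
LoadIncrementalHFiles API):
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.Path;

void loadWithRetries(Path hfile, int maxAttempts) throws IOException {
  for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      loadHFile(hfile);  // hypothetical per-hfile load call
      return;
    } catch (FileNotFoundException fnfe) {
      if (attempt > 1) {
        // On a retry, assume an earlier attempt already moved the hfile
        // into place but failed after the WAL append; treat it as loaded.
        return;
      }
      throw fnfe;        // first attempt: the input is genuinely missing
    } catch (IOException ioe) {
      if (attempt == maxAttempts) {
        throw ioe;
      }
      // e.g. the WAL append failed on the server side; retry the load.
    }
  }
}
{code}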

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190758#comment-15190758
 ] 

Anoop Sam John commented on HBASE-15446:


bq. Yes, but I don't see any issue if it's loaded again; the data is the
same, but then we will have another hfile with a different seq id. I don't
see any major impact with it.
It's not about the duplicated file. What I am saying is that the op status
we report to the user is plain incorrect. The file is actually loaded,
right? Even a scan will see the cells in it (say the user has not retried
the failed attempt).

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190751#comment-15190751
 ] 

Anoop Sam John commented on HBASE-15446:


Sorry, still confusing. So you are saying that when the 1st attempt fails
to write to the WAL (but actually loads the file), we fail the op to the
user app.
And then:
bq. so on next retry we will not find this file to load and will get a
FileNotFoundException, which I think we can catch and ignore.
Do we mean HBase code here? How can we know it is the second attempt? The
1st op failed from the user's point of view, so the 2nd call also comes
from the user.

If we tell the user that the bulk load op failed, we should be sure the
load did not actually happen in any of the clusters. Otherwise it will be
very confusing for the user later.

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190745#comment-15190745
 ] 

Anoop Sam John commented on HBASE-15325:


bq. ResultScanner allowing partial result will miss the rest of the row if
the region is moved between two rpc requests
We need to change the subject and description.
The issue is not tied to allow-partial or batch being set on the Scan; it
comes up whenever a region moves while we are fetching chunked results to
the client. Mind changing?
As we discussed, allowing partial results has nothing to do with the
server side.


> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v2.txt, 
> HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, 
> HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, 
> HBASE-15325-v6.5.txt, HBASE-15325-v6.txt
>
>
> HBASE-11544 allow scan rpc return partial of a row to reduce memory usage for 
> one rpc request. And client can setAllowPartial or setBatch to get several 
> cells in a row instead of the whole row.
> However, the status of the scanner is saved on server and we need this to get 
> the next part if there is a partial result before. If we move the region to 
> another RS, client will get a NotServingRegionException and open a new 
> scanner to the new RS which will be regarded as a new scan from the end of 
> this row. So the rest cells of the row of last result will be missing.
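For reference, the client-side usage from HBASE-11544 looks roughly like
this (standard 1.1+ client API; 'connection' is an open Connection, 't1'
and process() are hypothetical, and the row-stitching logic is elided):
{code}
Scan scan = new Scan();
scan.setAllowPartialResults(true); // server may return part of a row
// or: scan.setBatch(100);         // at most 100 cells per Result

try (Table table = connection.getTable(TableName.valueOf("t1"));
     ResultScanner scanner = table.getScanner(scan)) {
  for (Result result : scanner) {
    // A row may arrive as several Results; isPartial() says more cells
    // of the same row are still to come from the server.
    boolean more = result.isPartial();
    process(result); // hypothetical consumer
  }
}
{code}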



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190741#comment-15190741
 ] 

Anoop Sam John commented on HBASE-15325:


[~yangzhe1991] Will you provide a new patch with the client-side-only
change?

As noted in RB:
{quote}
I fear we cannot do this. Checking all the places we use this on the
server side, it seems we have to keep this change.
e.g. Filter#reset() should be called between rows, not between batches, so
the batch-limit check addition seems correct there.
{quote}

So this is another issue, related to Filter behaviour when batched reads
are in place. Shall we track it as a separate issue? Will you raise it?

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v2.txt, 
> HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, 
> HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, 
> HBASE-15325-v6.5.txt, HBASE-15325-v6.txt
>
>
> HBASE-11544 allow scan rpc return partial of a row to reduce memory usage for 
> one rpc request. And client can setAllowPartial or setBatch to get several 
> cells in a row instead of the whole row.
> However, the status of the scanner is saved on server and we need this to get 
> the next part if there is a partial result before. If we move the region to 
> another RS, client will get a NotServingRegionException and open a new 
> scanner to the new RS which will be regarded as a new scan from the end of 
> this row. So the rest cells of the row of last result will be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190736#comment-15190736
 ] 

Ashish Singhi commented on HBASE-15446:
---

bq. When you say this, do you mean failing the op to the user app that
tried to bulk load the file? But the file is actually loaded in this
cluster.
Yes, but I don't see any issue if it's loaded again: the data is the same,
and we would just have another hfile with a different seq id, which has no
major impact.
Also, this will happen only when the input data is not on the same FS.
When it is on the same FS (the most common use case, I believe) the hfiles
are actually moved, so on the next retry we will not find the file to load
and will get a FileNotFoundException, which I think we can catch and
ignore.

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-11 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-15180:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for the reviews and the great suggestions.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.
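A minimal sketch of such a Cell-aware stream (illustrative only; the
committed class name and API may differ): it exposes the backing array and
read offset so the decoder can create Cells directly over the RPC buffer.
{code}
// ByteArrayInputStream keeps 'buf' and 'pos' protected, so a subclass
// can hand them out without any per-Cell copying.
public class CellBackedByteArrayInputStream extends java.io.ByteArrayInputStream {
  public CellBackedByteArrayInputStream(byte[] buf) {
    super(buf);
  }
  public byte[] getBuffer() {
    return this.buf;   // the shared request bytes, not a copy
  }
  public int getPosition() {
    return this.pos;   // current read offset into the buffer
  }
  public void skipBytes(int len) {
    this.pos += len;   // advance past a Cell created over the buffer
  }
}
{code}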



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190725#comment-15190725
 ] 

Anoop Sam John commented on HBASE-15446:


bq. throw back the exception to the client and let the client retry
When you say this, do you mean failing the op to the user app that tried
to bulk load the file? But the file is actually loaded in this cluster.

> Do not abort RS when WAL append for bulk load event marker is failed
> 
>
> Key: HBASE-15446
> URL: https://issues.apache.org/jira/browse/HBASE-15446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>
> During bulk load process when the RS fails to append the bulk load event 
> marker in WAL we abort that RS which is actually not really required. Instead 
> just throw back the exception to the client and let the client retry. This 
> will be helpful in case of replication of bulk loaded data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck

2016-03-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190717#comment-15190717
 ] 

Anoop Sam John commented on HBASE-15436:


So you are saying that after you see the log about the failure (after some
30+ minutes, in fact 36 minutes I guess, since the socket timeout seems to
be 1 minute and there were 36 attempts), the flush still does not return.
After seeing this log, how long did you wait?
This is an asynchronous way of writing to the table: when the size of the
accumulated puts reaches a configured threshold, we do a flush; until
then, puts are accumulated on the client side.
I believe I see the issue. This is not a deadlock or anything like that.
To this flush we pass all the Rows to write to the RS (by Rows I mean
Mutations). It tries to group the mutations per server and contacts each
server with the list of mutations destined for it.
To do this grouping it looks up the region location for each row; that
scan goes to META (as shown in the logs) and fails. For the 1st Mutation
in the list alone it took 36 minutes, because the scan to META has retries
and each attempt fails only after the SocketTimeout.

See in AsyncProcess#submit
{code}
do {
  ...
  int posInList = -1;
  Iterator<? extends Row> it = rows.iterator();
  while (it.hasNext()) {
    Row r = it.next();
    HRegionLocation loc;
    try {
      if (r == null) throw new IllegalArgumentException("#" + id + ", row cannot be null");
      // Make sure we get 0-s replica.
      RegionLocations locs = connection.locateRegion(
          tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);
      ...
    } catch (IOException ex) {
      locationErrors = new ArrayList<Exception>();
      locationErrorRows = new ArrayList<Integer>();
      LOG.error("Failed to get region location ", ex);
      // This action failed before creating ars. Retain it, but do not add to submit list.
      // We will then add it to ars in an already-failed state.
      retainedActions.add(new Action<Row>(r, ++posInList));
      locationErrors.add(ex);
      locationErrorRows.add(posInList);
      it.remove();
      break; // Backward compat: we stop considering actions on location error.
    }
    ...
  }
} while (retainedActions.isEmpty() && atLeastOne && (locationErrors == null));
{code}
The List 'rows' is the same List that BufferedMutatorImpl holds (i.e.
writeAsyncBuffer). So for the 1st Mutation the region location lookup
failed, and that Mutation was also removed from this List, as you can see.
It will eventually be marked as a failed op, and the flow comes back to
BufferedMutatorImpl#backgroundFlushCommits.
Here we can see
{code}
if (synchronous || ap.hasError()) {
  while (!writeAsyncBuffer.isEmpty()) {
    ap.submit(tableName, writeAsyncBuffer, true, null, false);
  }
{code}
The loop continues until writeAsyncBuffer is empty, and in those 36 minutes
we managed to remove only one item from the list. It then goes on and
removes the 2nd, and so on. So if there are 100 Mutations in the list when
flush() is called, it would finish only after 36 * 100 minutes!

I don't know the design considerations of this AsyncProcess well. Maybe we
should narrow the lock on close() down from the method level, set something
like a 'closing' state to true, and have the retries within these flows
check that state and bail out early with a fat WARN log saying we will lose
some of the mutations applied so far. (?)
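To make that concrete, a rough sketch of the 'closing' state idea
(hypothetical field and shape, reusing the names quoted above; not an
actual patch):
{code}
// In BufferedMutatorImpl (sketch; imports and existing fields elided):
private final AtomicBoolean closing = new AtomicBoolean(false);

@Override
public void close() throws IOException {
  closing.set(true); // signal in-flight flush retries to bail out early
  // ... existing close logic, under a narrower lock than the whole method
}

private void backgroundFlushCommits(boolean synchronous) throws IOException {
  // ...
  if (synchronous || ap.hasError()) {
    while (!writeAsyncBuffer.isEmpty() && !closing.get()) {
      ap.submit(tableName, writeAsyncBuffer, true, null, false);
    }
    if (closing.get() && !writeAsyncBuffer.isEmpty()) {
      LOG.warn("Mutator closing; dropping " + writeAsyncBuffer.size()
          + " buffered mutation(s) that were not flushed");
    }
  }
}
{code}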

> BufferedMutatorImpl.flush() appears to get stuck
> 
>
> Key: HBASE-15436
> URL: https://issues.apache.org/jira/browse/HBASE-15436
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.0.2
>Reporter: Sangjin Lee
> Attachments: hbaseException.log, threaddump.log
>
>
> We noticed an instance where the thread that was executing a flush 
> ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster 
> shut down and was unable to get out of that stuck state.
> The setup is a single node HBase cluster, and apparently the cluster went 
> away when the client was executing flush. The flush eventually logged a 
> failure after 30+ minutes of retrying. That is understandable.
> What is unexpected is that thread is stuck in this state (i.e. in the 
> {{flush()}} call). I would have expected the {{flush()}} call to return after 
> the complete failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored

2016-03-11 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190692#comment-15190692
 ] 

Ashish Singhi commented on HBASE-15425:
---

Raised HBASE-15446 and linked to this issue.

> Failing to write bulk load event marker in the WAL is ignored
> -
>
> Key: HBASE-15425
> URL: https://issues.apache.org/jira/browse/HBASE-15425
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-15425.patch, HBASE-15425.v1.patch
>
>
> During LoadIncrementalHFiles process if we fail to write the bulk load event 
> marker in the WAL, it is ignored. So this will lead to data mismatch issue in 
> source and peer cluster in case of bulk loaded data replication scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15446) Do not abort RS when WAL append for bulk load event marker is failed

2016-03-11 Thread Ashish Singhi (JIRA)
Ashish Singhi created HBASE-15446:
-

 Summary: Do not abort RS when WAL append for bulk load event 
marker is failed
 Key: HBASE-15446
 URL: https://issues.apache.org/jira/browse/HBASE-15446
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Ashish Singhi
Assignee: Ashish Singhi


During bulk load process when the RS fails to append the bulk load event marker 
in WAL we abort that RS which is actually not really required. Instead just 
throw back the exception to the client and let the client retry. This will be 
helpful in case of replication of bulk loaded data.
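A sketch of the proposed change in the region server's bulk-load path
(method names here are hypothetical stand-ins, not the actual HBase code):
{code}
try {
  appendBulkLoadEventMarker(wal, bulkLoadDescriptor); // hypothetical helper
} catch (IOException ioe) {
  // Today (roughly): rsServices.abort("Failed appending bulk load marker", ioe);
  // Proposed: surface the failure to the client and let it retry,
  // instead of taking the whole region server down.
  throw ioe;
}
{code}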



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15439) Mob compaction is not triggered after extended period of time

2016-03-11 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190685#comment-15190685
 ] 

Jingcheng Du commented on HBASE-15439:
--

Working on this. Will post a patch later.

> Mob compaction is not triggered after extended period of time
> -
>
> Key: HBASE-15439
> URL: https://issues.apache.org/jira/browse/HBASE-15439
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> I was running the IntegrationTestIngestWithMOB test.
> I lowered the mob compaction chore interval to this value:
> {code}
> <property>
>   <name>hbase.mob.compaction.chore.period</name>
>   <value>6000</value>
> </property>
> {code}
> After a whole night, there was no indication in the master log that mob
> compaction ran.
> All I found was:
> {code}
> 2016-03-09 04:18:52,194 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 05:58:52,516 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 07:38:52,847 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 09:18:52,848 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 10:58:52,932 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 12:38:52,932 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 14:18:52,933 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 15:58:52,957 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> 2016-03-09 17:38:52,960 INFO  
> [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] 
> hbase.ScheduledChore: Chore: 
> tyu-hbase-rhel-re-2.novalocal,2,1457491115327-  MobCompactionChore missed 
> its start time
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-11 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15392:
--
Attachment: 15392v5.patch

v5. Docs are not done yet because I am not yet sure about the index
compare. I also need to add an extra test, one that has a stopRow on a
Scan.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> org.apa

[jira] [Commented] (HBASE-15322) Operations using Unsafe path broken for platforms not having sun.misc.Unsafe

2016-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190615#comment-15190615
 ] 

Hudson commented on HBASE-15322:


SUCCESS: Integrated in HBase-1.2-IT #461 (See 
[https://builds.apache.org/job/HBase-1.2-IT/461/])
HBASE-15322 Operations using Unsafe path broken for platforms not having 
(anoopsamjohn: rev 149dc79d855e520a23f554ed177163eaaa113e44)
* 
hbase-common/src/main/java/org/apache/hadoop/hbase/util/UnsafeAvailChecker.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/UnsafeAccess.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FuzzyRowFilter.java
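For context, an availability check of this kind is usually a reflective
probe, roughly as below (a generic sketch, not the committed
UnsafeAvailChecker code):
{code}
public final class UnsafeProbe {
  private static final boolean AVAIL;
  static {
    boolean ok;
    try {
      // Probe by name instead of linking directly, so the check itself
      // cannot fail with NoClassDefFoundError on platforms without Unsafe.
      Class.forName("sun.misc.Unsafe");
      ok = true;
    } catch (Throwable t) {
      ok = false;
    }
    AVAIL = ok;
  }
  public static boolean unsafeAvailable() {
    return AVAIL;
  }
}
{code}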


> Operations using Unsafe path broken for platforms not having sun.misc.Unsafe
> 
>
> Key: HBASE-15322
> URL: https://issues.apache.org/jira/browse/HBASE-15322
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.0.0, 2.0.0, 0.98.7, 0.94.24
> Environment: OS: Ubuntu 14.04/Ubuntu 15.10  
> JDK: OpenJDK8/OpenJDK9
>Reporter: Anant Sharma
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.18, 1.4.0
>
> Attachments: BASE-15322.patch, HBASE-15322-0.98.patch, 
> HBASE-15322-branch-1.2.patch, HBASE-15322-branch-1.patch, 
> HBASE-15322_branch-1.1.patch
>
>
> HBase crashes in standalone mode with the following log:
> __
> 2016-02-24 22:38:37,578 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMaster
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2341)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:233)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2355)
> Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer
> at org.apache.hadoop.hbase.util.Bytes.putInt(Bytes.java:899)
> at 
> org.apache.hadoop.hbase.KeyValue.createByteArray(KeyValue.java:1082)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:652)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:580)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:483)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:370)
> at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:267)
> at org.apache.hadoop.hbase.HConstants.<clinit>(HConstants.java:978)
> at 
> org.apache.hadoop.hbase.HTableDescriptor.<clinit>(HTableDescriptor.java:1488)
> at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.<init>(FSTableDescriptors.java:124)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:570)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:365)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2336)
> __
> The class is in hbase-common.jar, and it is on the classpath, as can be
> seen from the log:
> _
> 2016-02-24 22:38:32,538 INFO  [main] util.ServerCommandLine: 
> env:CLASSPATH=/home/hduser/hbase/hbase-1.1.3:/home/hduser/hbase/hbase-1.1.3/lib/activation-1.1.jar:/home/hduser/hbase/hbase-1.1.3/lib/aopalliance-1.0.jar:/home/hduser/hbase/hbase-1.1.3/lib/apacheds-i18n-2.0.0-M15.jar:/home/hduser/hbase/hbase-1.1.3/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hduser/hbase/hbase-1.1.3/lib/api-asn1-api-1.0.0-M20.jar:/home/hduser/hbase/hbase-1.1.3/lib/api-util-1.0.0-M20.jar:/home/hduser/hbase/hbase-1.1.3/lib/asm-3.1.jar:/home/hduser/hbase/hbase-1.1.3/lib/avro-1.7.4.jar:/home/hduser/hbase/hbase-1.1.3/lib/commons-beanutils-1.7.0.jar:/home/hduser/hbase/hbase-1.1.3/lib/commons-beanutils-core-1.8.0.jar:/home/hduser/hbase/hbase-1.1.3/lib/commons-cli-1.2.jar:/home/hduser/hbase/hbase-

[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large

2016-03-11 Thread JunHo Cho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JunHo Cho updated HBASE-15430:
--
Release Note: Failed taking snapshot - Manifest proto-message too large. 
Add the property "snapshot.manifest.size.limit" to change the max size of 
the proto-message.  (was: Failed taking snapshot - Manifest proto-message 
too large. add property ("snapshot.manifest.size.limit")  to change max 
size of photo-message)
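For reference, raising the parse limit with the protobuf-java API looks
roughly like this (a sketch; 'in', 'conf', and the plumbing of the new
property through the Configuration are assumptions):
{code}
// 'in' is the InputStream over the snapshot data manifest (assumption).
CodedInputStream cis = CodedInputStream.newInstance(in);
// The protobuf default limit is 64 MB; raise it to the configured value.
cis.setSizeLimit(conf.getInt("snapshot.manifest.size.limit", 64 * 1024 * 1024));
SnapshotDataManifest manifest = SnapshotDataManifest.parseFrom(cis);
{code}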

> Failed taking snapshot - Manifest proto-message too large
> -
>
> Key: HBASE-15430
> URL: https://issues.apache.org/jira/browse/HBASE-15430
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 0.98.11
>Reporter: JunHo Cho
>Assignee: JunHo Cho
>Priority: Critical
> Fix For: 0.98.18
>
> Attachments: hbase-15430-v1.patch, hbase-15430-v2.patch, 
> hbase-15430-v3.branch.0.98.patch, hbase-15430.patch
>
>
> The default maximum size of a protobuf message is 64MB, but the size of
> the snapshot meta is over 64MB.
> Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed 
> taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to 
> exception:Protocol message was too large.  May be malicious.  Use 
> CodedInputStream.setSizeLimit() to increase the size 
> limit.:com.google.protobuf.InvalidProtocolBufferException: Protocol message 
> was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to 
> increase the size limit.
> at 
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:307)
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:341)
> ... 10 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol 
> message was too large.  May be malicious.  Use 
> CodedInputStream.setSizeLimit() to increase the size limit.
> at 
> com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at 
> com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at 
> com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
> at 
> com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.<init>(HBaseProtos.java:3767)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.<init>(HBaseProtos.java:3699)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3815)
> at 
> org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3810)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1152)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1094)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3858)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3792)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889)
> at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> at 
> org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.parseFrom(SnapshotProtos.java:4094)
> at 
> org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:433)
> at 
> org.apache.hadoop.hbase.snapshot.Snapsh