[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely

2016-07-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362038#comment-15362038
 ] 

Duo Zhang commented on HBASE-16144:
---

So make it configurable? We usually should not introduce new findbugs warnings.

> Replication queue's lock will live forever if RS acquiring the lock has died 
> prematurely
> 
>
> Key: HBASE-16144
> URL: https://issues.apache.org/jira/browse/HBASE-16144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.1, 1.1.5, 0.98.20
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-16144-v1.patch, HBASE-16144-v2.patch, 
> HBASE-16144-v3.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if 
> we set hbase.zookeeper.useMulti=false, we first add a lock, then copy the 
> nodes, and finally clean up the old queue and the lock. 
> However, if the RS holding the lock crashes before claimQueues is done, the 
> lock will stay there forever and no other RS can ever claim the queue.
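A raw-ZooKeeper sketch of the non-multi flow described above (illustrative paths and helper shape, not the actual ReplicationQueuesZKImpl code):

{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

class ClaimQueueSketch {
  void claimQueue(ZooKeeper zk, String deadQueue, String myQueue)
      throws KeeperException, InterruptedException {
    String lock = deadQueue + "/lock";
    // 1. Take the lock. It is PERSISTENT, so it outlives a crashed creator.
    zk.create(lock, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    // 2. Copy each WAL znode (and its replication position) to our own
    //    queue, then clean the old entry.
    for (String wal : zk.getChildren(deadQueue, false)) {
      if (wal.equals("lock")) {
        continue;
      }
      byte[] pos = zk.getData(deadQueue + "/" + wal, false, null);
      zk.create(myQueue + "/" + wal, pos, Ids.OPEN_ACL_UNSAFE,
          CreateMode.PERSISTENT);
      zk.delete(deadQueue + "/" + wal, -1);
    }
    // 3. Release the lock. A crash anywhere before this line leaves the
    //    lock znode behind forever -- the bug this issue describes.
    zk.delete(lock, -1);
  }
}
{code}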



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362033#comment-15362033
 ] 

Anoop Sam John commented on HBASE-16162:


{quote}
Just one thing – the scope of the try-blocks in flushInMemory () seems odd. 
Shouldn't it include the lines which are reverted in the finally-block? 
(namely, setting the atomic boolean and acquiring the lock)
{quote}
Sorry, I am not following your comment. Can you please check in RB and add a 
comment around the line?
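For context, a generic illustration of what the quoted review comment seems to be asking (a hypothetical shape reusing the names from this issue's snippet, not the actual patch): state mutated before the try is not protected by the finally-block that reverts it.

{code}
// Fragile: if blockUpdates() throws after the flag is set, the try is never
// entered, the finally never runs, and the flag is stuck at true.
inMemoryFlushInProgress.set(true);
getRegionServices().blockUpdates();
try {
  pushActiveToPipeline(getActive());
} finally {
  getRegionServices().unblockUpdates();
  inMemoryFlushInProgress.set(false);
}

// Wider try with a guarded finally: each revert is paired with the step
// that actually completed.
boolean flagSet = false;
boolean blocked = false;
try {
  inMemoryFlushInProgress.set(true);
  flagSet = true;
  getRegionServices().blockUpdates();
  blocked = true;
  pushActiveToPipeline(getActive());
} finally {
  if (blocked) {
    getRegionServices().unblockUpdates();
  }
  if (flagSet) {
    inMemoryFlushInProgress.set(false);
  }
}
{code}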

> Compacting Memstore : unnecessary push of active segments to pipeline
> -
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, 
> HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have flow like this
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
> private boolean shouldFlushInMemory() {
> if(getActive().getSize() > inmemoryFlushSize) {
>   // size above flush threshold
>   return (allowCompaction.get() && !inMemoryFlushInProgress.get());
> }
> return false;
>   }
> void flushInMemory() throws IOException {
> // Phase I: Update the pipeline
> getRegionServices().blockUpdates();
> try {
>   MutableSegment active = getActive();
>   pushActiveToPipeline(active);
> } finally {
>   getRegionServices().unblockUpdates();
> }
> // Phase II: Compact the pipeline
> try {
>   if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
>     // setting the inMemoryFlushInProgress flag again for the case this method
>     // is invoked directly (only in tests); in the common path, setting from
>     // true to true is idempotent
>     // Speculative compaction execution, may be interrupted if flush is forced
>     // while compaction is in progress
>     compactor.startCompaction();
>   }
> {code}
> So every cell write triggers the checkActiveSize() check. When we are at the 
> border of an in-memory flush, many threads writing to this memstore can all 
> pass checkActiveSize(). The AtomicBoolean is still false at that point; it is 
> turned ON only some time later, once the new thread has started running and 
> pushed the active segment to the pipeline, etc.
> In the new thread's in-memory flush code, we don't have any size check. It 
> just takes the active segment and pushes it to the pipeline. We don't allow 
> any new writes to the memstore at this time, but before that write lock on 
> the region is taken, other handler threads might already have added entries 
> to the thread pool. When the first flush finishes, it releases the lock on 
> the region, and handler threads waiting to write to the memstore may get the 
> lock and add some data. Now the second in-memory flush thread may get its 
> chance, take the lock, and flush whatever the current active segment is. This 
> will push very small segments to the pipeline.
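A minimal sketch of one way to close the race described above, reusing the names from the snippet (an illustration, not the committed patch): claim the flush cycle with a CAS and re-check the size once updates are blocked, so a second queued runnable becomes a no-op instead of flushing a tiny segment.

{code}
void flushInMemory() throws IOException {
  // A second queued runnable loses the CAS and returns immediately.
  if (!inMemoryFlushInProgress.compareAndSet(false, true)) {
    return;
  }
  try {
    getRegionServices().blockUpdates();
    try {
      MutableSegment active = getActive();
      // Re-check under the update block: the segment we see now may not
      // be the one that tripped checkActiveSize().
      if (active.getSize() > inmemoryFlushSize) {
        pushActiveToPipeline(active);
      }
    } finally {
      getRegionServices().unblockUpdates();
    }
  } finally {
    inMemoryFlushInProgress.set(false);
  }
}
{code}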



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362029#comment-15362029
 ] 

stack commented on HBASE-16074:
---

I tried the revert and at least one loop completed. Will let it run tonight and 
see how far it gets.

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue, or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12083) Deprecate new HBaseAdmin() in favor of Connection.getAdmin()

2016-07-04 Thread li xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361998#comment-15361998
 ] 

li xiang commented on HBASE-12083:
--

Hi Enis, Stack

I am new to HBase. Could you please elaborate on the reason why 
HBaseAdmin was changed to @InterfaceAudience.Private?
Does it have something to do with deprecating new HBaseAdmin() in favor of 
Connection.getAdmin()?
I use some APIs from HBaseAdmin in my program, and so does the example code in 
the Ref Guide (see section 89.3.3). 

Could you please explain why the "admin" APIs are not intended to be 
called externally? Thanks!
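For reference, the replacement pattern the issue title points to looks like this (a minimal sketch; the table name is made up):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class AdminExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Program against the public Admin interface obtained from a Connection,
    // instead of constructing HBaseAdmin directly.
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      System.out.println("t1 exists: " + admin.tableExists(TableName.valueOf("t1")));
    }
  }
}
{code}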

> Deprecate new HBaseAdmin() in favor of Connection.getAdmin()
> 
>
> Key: HBASE-12083
> URL: https://issues.apache.org/jira/browse/HBASE-12083
> Project: HBase
>  Issue Type: Bug
>Reporter: Solomon Duskis
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.99.2
>
> Attachments: hbase-12083_v1.patch, hbase-12083_v2.patch, 
> hbase-12083_v3-branch-1.patch, hbase-12083_v3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely

2016-07-04 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361996#comment-15361996
 ] 

Phil Yang commented on HBASE-16144:
---

The TTL in ReplicationZKLockCleanerChore is changed in the test case, so it 
cannot be set to final.
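A sketch of the "make it configurable" direction suggested above, assuming a hypothetical property name: reading the TTL from the Configuration lets the test override it without mutating a static field, so the field can stay final and the findbugs warning goes away.

{code}
import org.apache.hadoop.conf.Configuration;

public class LockTtlSketch {
  // Hypothetical key for this sketch, not an actual HBase property.
  static final String LOCK_TTL_KEY = "hbase.replication.zk.lock.ttl";

  private final long ttlMs;

  public LockTtlSketch(Configuration conf) {
    // Tests override via conf instead of writing to a static field.
    this.ttlMs = conf.getLong(LOCK_TTL_KEY, 10 * 60 * 1000L);
  }

  /** A lock znode older than the TTL is treated as abandoned. */
  boolean isExpired(long lockCreateTimeMs, long nowMs) {
    return nowMs - lockCreateTimeMs > ttlMs;
  }
}
{code}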

> Replication queue's lock will live forever if RS acquiring the lock has died 
> prematurely
> 
>
> Key: HBASE-16144
> URL: https://issues.apache.org/jira/browse/HBASE-16144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.1, 1.1.5, 0.98.20
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-16144-v1.patch, HBASE-16144-v2.patch, 
> HBASE-16144-v3.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if 
> we set hbase.zookeeper.useMulti=false, we first add a lock, then copy the 
> nodes, and finally clean up the old queue and the lock. 
> However, if the RS holding the lock crashes before claimQueues is done, the 
> lock will stay there forever and no other RS can ever claim the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361997#comment-15361997
 ] 

stack commented on HBASE-15716:
---

Let me upload a new patch w/ your changes included. A few comments on the patch 
below:

I should make getReadPoint private. It is only for use in this class. I should 
remove 'abstract long getMvccReadPoint();'?

On this comment, "   //  Ignore the result; Another thread already did 
for you.", you mean another thread has moved the tail on further than what we 
wanted to set it to?

Should we be upping the tail reference count? We only check if the 
tail.readPoint is less than mvccReadPoint. We don't check whether it is equal. 
Could tail be > mvccReadPoint? It shouldn't ever be, I suppose.. Is that what 
you are depending on here?

This should never happen (mvcc read point should always be > head.readPoint...)?

+long mvccReadPoint = getMvccReadPoint();
+if (head.readPoint >= mvccReadPoint) {
+  return head.readPoint;
+}

Otherwise, patch looks good. Any suggestions on how to test for correctness? (I 
can check perf easy). Thanks [~ikeda]



> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 
> 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 
> 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, 
> Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 
> AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 
> 9.54.08 PM.png, TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).
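A paraphrase of the contended pattern (names approximate, not the exact HRegion code): every scanner open takes the same region-wide monitor just to record its MVCC read point.

{code}
import java.util.concurrent.ConcurrentSkipListMap;

class ReadPointSketch {
  final ConcurrentSkipListMap<Long, Long> scannerReadPoints = new ConcurrentSkipListMap<>();
  long nextScannerId = 0;

  long openScanner(long mvccReadPoint) {
    synchronized (scannerReadPoints) { // the region-wide contention point
      long id = nextScannerId++;
      scannerReadPoints.put(id, mvccReadPoint);
      return id;
    }
  }
}
{code}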



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely

2016-07-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361993#comment-15361993
 ] 

Ted Yu commented on HBASE-16144:


Can you fix the findbugs warning?

> Replication queue's lock will live forever if RS acquiring the lock has died 
> prematurely
> 
>
> Key: HBASE-16144
> URL: https://issues.apache.org/jira/browse/HBASE-16144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.1, 1.1.5, 0.98.20
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-16144-v1.patch, HBASE-16144-v2.patch, 
> HBASE-16144-v3.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if 
> we set hbase.zookeeper.useMulti=false, we first add a lock, then copy the 
> nodes, and finally clean up the old queue and the lock. 
> However, if the RS holding the lock crashes before claimQueues is done, the 
> lock will stay there forever and no other RS can ever claim the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361910#comment-15361910
 ] 

stack commented on HBASE-15716:
---

[~ikeda] Pardon me. That backport was so we had parity with hadoop rpc. It 
struck me as cleanup, encapsulating connection handling inside 
'ConnectionManager'. I can revert. I'm just looking for a sketch of what you 
are thinking; I can take it from there. I have a rig to try stuff on and 
appreciate your insight and suggestions.

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 
> 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 
> 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, 
> Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 
> AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 
> 9.54.08 PM.png, TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361882#comment-15361882
 ] 

stack commented on HBASE-16074:
---

Yes.

I'd say revert, [~mantonov], Mr. RM. Meantime I'll keep looking at this. It is 
a perf optimization. Not sure how this patch is causing the issue; it may be 
that the patch just brings out an issue dormant in the Scan, but hey, things 
are better w/o it. I'll keep plugging away meantime since this 'failure' is odd.


> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue, or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361878#comment-15361878
 ] 

Hiroshi Ikeda commented on HBASE-15716:
---

I found there are some changes on master around this (HBASE-15948). Using a 
blocking queue doesn't make sense. Moreover, unlike HADOOP-9956, calling the 
method {{add}} can just crash the server. Also I don't understand why 
HBASE-15948 intentionally changes the code and leaks sockets that fail to 
initialize. Introducing another thread (Timer) makes the situation chaotic, 
and I give up.

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 
> 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 
> 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, 
> Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 
> AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 
> 9.54.08 PM.png, TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361787#comment-15361787
 ] 

Mikhail Antonov commented on HBASE-16074:
-

You mean the eight unreferenced nodes were with a normal in-Loop run w/ Monkeys 
and the last step was w/o them?

With the last patch, half of the runs failed for me with the same errors (seems 
it happens less often now, though I'm not sure which one I like more - the bug 
which causes data loss always or in half the runs.. yeah). Also tried to revert 
the patch and run w/o it. Saw 4 or 5 successful runs so far, none failed.

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue, or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361788#comment-15361788
 ] 

Mikhail Antonov commented on HBASE-16074:
-

[~stack] ^^

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue, or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361770#comment-15361770
 ] 

stack commented on HBASE-16074:
---

With the patch in place and rerunning, it complained that it had eight 
unreferenced nodes. When I reran Verify, it said all was fine (I'm echoing 
Elliott's experience above). Let me dig in.

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue, or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361766#comment-15361766
 ] 

Hadoop QA commented on HBASE-16157:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
3s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
1s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 24s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 24s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 138m 29s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12816081/HBASE-16157-v4.patch |
| JIRA Issue | HBASE-16157 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 7f44dfd |
| Default Java | 1.7.0_80 |
| Multi-JDK versions |  /home/jenkins/tools/java/jdk1.8.0:1.8.0 
/home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
| findbugs | v3.0.0 |
| unit | 

[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361751#comment-15361751
 ] 

Hadoop QA commented on HBASE-16144:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
8s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
6s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
59s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 29s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 11s 
{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 93m 49s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
31s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 147m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  org.apache.hadoop.hbase.master.cleaner.ReplicationZKLockCleanerChore.TTL 
isn't final but should be  At ReplicationZKLockCleanerChore.java:be  At 
ReplicationZKLockCleanerChore.java:[line 57] |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815988/HBASE-16144-v3.patch |
| JIRA 

[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified

2016-07-04 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-14548:
-
Fix Version/s: (was: 1.2.0)
   2.0.0

> Expand how table coprocessor jar and dependency path can be specified
> -
>
> Key: HBASE-14548
> URL: https://issues.apache.org/jira/browse/HBASE-14548
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: li xiang
> Fix For: 2.0.0
>
> Attachments: HBASE-14548-1.2.0-v0.patch, HBASE-14548-1.2.0-v1.patch, 
> HBASE-14548-master-v1.patch
>
>
> Currently you can specify the location of the coprocessor jar in the table 
> coprocessor attribute.
> The problem is that it only allows you to specify one jar that implements the 
> coprocessor.  You will need to either bundle all the dependencies into this 
> jar, or you will need to copy the dependencies into HBase lib dir.
> The first option may not be ideal sometimes.  The second choice can be 
> troublesome too, particularly when hbase region server nodes and dirs are 
> dynamically added/created.
> There are a couple of things we can expand here.  We can allow the coprocessor 
> attribute to specify a directory location, probably on hdfs.
> We may even allow some wildcards in there.
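For contrast, today's single-jar mechanism looks roughly like this (a sketch; the jar path and class name are made up):

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Coprocessor;
import org.apache.hadoop.hbase.HTableDescriptor;

public class CoprocessorAttrExample {
  static void attach(HTableDescriptor desc) throws java.io.IOException {
    // The coprocessor attribute points at exactly one jar; dependencies must
    // be bundled into it or already present in HBase's lib dir.
    desc.addCoprocessor("com.example.MyRegionObserver",
        new Path("hdfs:///user/coprocessors/my-observer.jar"),
        Coprocessor.PRIORITY_USER, null);
  }
}
{code}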



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361731#comment-15361731
 ] 

Hudson commented on HBASE-16132:


SUCCESS: Integrated in HBase-Trunk_matrix #1168 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1168/])
HBASE-16132 Scan does not return all the result when regionserver is (liyu: rev 
7f44dfd85fc1aacd451cb8514fbce6dafd3443ca)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java


> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case: when the regionserver is busy for a long time, 
> some scanners may return null even though they have not scanned all the data.
> We found that ScannerCallableWithReplicas does not handle one case correctly: 
> when cs.poll times out and does not return any result, it returns a null 
> result, so the scan gets a null result and ends the scan. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}
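One possible shape of a fix, sketched generically (not necessarily the committed patch): treat a poll timeout as an error to surface to the retrying caller, never as a normal null return that ends the scan.

{code}
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class PollSketch {
  static <R> R pollOrThrow(CompletionService<R> cs, long timeoutMs)
      throws IOException, InterruptedException, ExecutionException {
    Future<R> f = cs.poll(timeoutMs, TimeUnit.MILLISECONDS);
    if (f == null) {
      // Previously this path fell through to "return null", which the scan
      // loop misread as end-of-scan.
      throw new InterruptedIOException("timed out after " + timeoutMs + "ms");
    }
    return f.get();
  }
}
{code}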



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361704#comment-15361704
 ] 

Hudson commented on HBASE-16135:


FAILURE: Integrated in HBase-1.2 #663 (See 
[https://builds.apache.org/job/HBase-1.2/663/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev e3e39a693e91f5de77010f6b80b4111f377b03ce)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory was almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, but there 
> were still some replication queue znodes under some rs nodes. This prevents 
> the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified

2016-07-04 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361701#comment-15361701
 ] 

Jerry He commented on HBASE-14548:
--

Patch looks good, [~water]. Thanks for working on it.

bq. +Path pathPattern1 = fs.isDirectory(pathPattern) ?

Does fs.isDirectory work fine on a wildcard path?

The tests are all on the local file system. 
You can enhance them to have the test jars on hdfs. You can enhance 
ClassLoaderTestHelper.buildJar() to build the jar in a working dir on the hdfs 
minicluster.
This can be done in a separate JIRA if you like.
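A sketch of the safer pattern behind the fs.isDirectory question: expand the wildcard with globStatus() first, then test each match (paths made up for the example).

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobCheckExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path pattern = new Path("hdfs:///user/coprocessors/*.jar");
    FileSystem fs = pattern.getFileSystem(conf);
    FileStatus[] matches = fs.globStatus(pattern); // null when nothing matches
    if (matches != null) {
      for (FileStatus status : matches) {
        System.out.println(status.getPath() + " dir=" + status.isDirectory());
      }
    }
  }
}
{code}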

> Expand how table coprocessor jar and dependency path can be specified
> -
>
> Key: HBASE-14548
> URL: https://issues.apache.org/jira/browse/HBASE-14548
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: li xiang
> Fix For: 1.2.0
>
> Attachments: HBASE-14548-1.2.0-v0.patch, HBASE-14548-1.2.0-v1.patch, 
> HBASE-14548-master-v1.patch
>
>
> Currently you can specify the location of the coprocessor jar in the table 
> coprocessor attribute.
> The problem is that it only allows you to specify one jar that implements the 
> coprocessor.  You will need to either bundle all the dependencies into this 
> jar, or you will need to copy the dependencies into HBase lib dir.
> The first option may not be ideal sometimes.  The second choice can be 
> troublesome too, particularly when hbase region server nodes and dirs are 
> dynamically added/created.
> There are a couple of things we can expand here.  We can allow the coprocessor 
> attribute to specify a directory location, probably on hdfs.
> We may even allow some wildcards in there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361686#comment-15361686
 ] 

Hadoop QA commented on HBASE-16074:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
45s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
45s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} branch-1.3 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s 
{color} | {color:red} hbase-server in branch-1.3 has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
15m 46s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 78m 23s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 115m 18s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12816069/HBASE-16074.branch-1.3.003.patch
 |
| JIRA Issue | HBASE-16074 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 

[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361679#comment-15361679
 ] 

Hudson commented on HBASE-16132:


SUCCESS: Integrated in HBase-1.3 #768 (See 
[https://builds.apache.org/job/HBase-1.3/768/])
HBASE-16132 Scan does not return all the result when regionserver is (liyu: rev 
b3834d7f72af4b689bc49f799b9f64671af8be44)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java


> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case: when the regionserver is busy for a long time, 
> some scanners may return null even though they have not scanned all the data.
> We found that ScannerCallableWithReplicas does not handle one case correctly: 
> when cs.poll times out and does not return any result, it returns a null 
> result, so the scan gets a null result and ends the scan. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361668#comment-15361668
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.4 #271 (See 
[https://builds.apache.org/job/HBase-1.4/271/])
HBASE-16135 addendum format comments (zhangduo: rev 
4807836304ab75a2c9bedcc4afb30d2332cebb7b)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory was almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, but there 
> were still some replication queue znodes under some rs nodes. This prevents 
> the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361667#comment-15361667
 ] 

Hudson commented on HBASE-16132:


SUCCESS: Integrated in HBase-1.4 #271 (See 
[https://builds.apache.org/job/HBase-1.4/271/])
HBASE-16132 Scan does not return all the result when regionserver is (liyu: rev 
84dd9cbcb64933a9511c34a433c28b423e5cc266)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java


> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case: when the regionserver is busy for a long time, 
> some scanners may return null even though they have not scanned all the data.
> We found that ScannerCallableWithReplicas does not handle one case correctly: 
> when cs.poll times out and does not return any result, it returns a null 
> result, so the scan gets a null result and ends the scan. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Attachment: HBASE-16157-v4.patch

re-attach

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}
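A minimal sketch of the fix direction the snippet's comment implies (not the exact committed patch): only update the metrics when this call actually removed the block, so a duplicate eviction cannot double-count.

{code}
protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
  LruCachedBlock previous = map.remove(block.getCacheKey());
  if (previous == null) {
    return 0; // already evicted by someone else; nothing to account
  }
  updateSizeMetrics(block, true);
  ...
}
{code}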



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Attachment: (was: HBASE-16157-v4.patch)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Patch Available  (was: Open)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Open  (was: Patch Available)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16071) The VisibilityLabelFilter and AccessControlFilter should not count the "delete cell"

2016-07-04 Thread ChiaPing Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361631#comment-15361631
 ] 

ChiaPing Tsai commented on HBASE-16071:
---

Any comment?

thanks

> The VisibilityLabelFilter and AccessControlFilter should not count the 
> "delete cell"
> 
>
> Key: HBASE-16071
> URL: https://issues.apache.org/jira/browse/HBASE-16071
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16071-v1.patch, HBASE-16071-v2.patch, 
> HBASE-16071-v3.patch
>
>
> The VisibilityLabelFilter will see and count the "delete cell" if the 
> scan.isRaw() returns true, so the (put) cell will be skipped if it has lower 
> version than "delete cell"
> The critical code is shown below:
> {code:title=VisibilityLabelFilter.java|borderStyle=solid}
>   public ReturnCode filterKeyValue(Cell cell) throws IOException {
> if (curFamily.getBytes() == null
> || !(CellUtil.matchingFamily(cell, curFamily.getBytes(), 
> curFamily.getOffset(),
> curFamily.getLength( {
>   curFamily.set(cell.getFamilyArray(), cell.getFamilyOffset(), 
> cell.getFamilyLength());
>   // For this family, all the columns can have max of 
> curFamilyMaxVersions versions. No need to
>   // consider the older versions for visibility label check.
>   // Ideally this should have been done at a lower layer by HBase (?)
>   curFamilyMaxVersions = cfVsMaxVersions.get(curFamily);
>   // Family is changed. Just unset curQualifier.
>   curQualifier.unset();
> }
> if (curQualifier.getBytes() == null
> || !(CellUtil.matchingQualifier(cell, curQualifier.getBytes(), 
> curQualifier.getOffset(),
> curQualifier.getLength( {
>   curQualifier.set(cell.getQualifierArray(), cell.getQualifierOffset(),
>   cell.getQualifierLength());
>   curQualMetVersions = 0;
> }
> curQualMetVersions++;
> if (curQualMetVersions > curFamilyMaxVersions) {
>   return ReturnCode.SKIP;
> }
> return this.expEvaluator.evaluate(cell) ? ReturnCode.INCLUDE : 
> ReturnCode.SKIP;
>   }
> {code}
> [VisibilityLabelFilter.java|https://github.com/apache/hbase/blob/d7a4499dfc8b3936a0eca867589fc2b23b597866/hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityLabelFilter.java]
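
A hypothetical guard (an assumption for illustration, not the attached patch):
exclude delete markers from the version accounting so that a raw scan cannot
let them push a live put past curFamilyMaxVersions. filterKeyValue() would
consult this just before the curQualMetVersions++ line quoted above:

{code:title=VersionCountGuard.java|borderStyle=solid}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;

final class VersionCountGuard {
  // A delete marker surfaced by a raw scan should not consume one of the
  // curFamilyMaxVersions slots; only real puts count toward the limit.
  static boolean countsTowardVersions(Cell cell) {
    return !CellUtil.isDelete(cell);
  }
}
{code}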



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361627#comment-15361627
 ] 

ChiaPing Tsai commented on HBASE-16157:
---

Do I need to delete the older patch before re-attaching?

thanks

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361622#comment-15361622
 ] 

stack commented on HBASE-16074:
---

Trying patch.

Looking in logs, I can't find these 'bad rows', as though we are mis-scanning.

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup 
> issue, but need to figure it out. Opening this to raise awareness and see if 
> someone saw that recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-16074:
--
Summary: ITBLL fails, reports lost big or tiny families  (was: ITBLL fails, 
reports lost big or tine families)

> ITBLL fails, reports lost big or tiny families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup 
> issue, but need to figure it out. Opening this to raise awareness and see if 
> someone saw that recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361620#comment-15361620
 ] 

Ted Yu commented on HBASE-16157:


ChiaPing:
If you re-attach the patch, QA would run the tests.

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tine families

2016-07-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361605#comment-15361605
 ] 

stack commented on HBASE-16074:
---

Second Verify run does this:
{code}
...
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
LOST_FAMILIES=26
REFERENCED=4
UNREFERENCED=26
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=1624
unref
5UUR=1
:\xAA\x0A\xE4=1
UUUP=1
\x05TP\xD3=1
\x15UUT=1
\x1A\xAA\xA4\x06=1
\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xF8=1
\x8A\xAA\xAA\xAA\xAA\xAA\xAA\xA2=1
\x8F\xFF\xF8\xAC=1
\x95UUL=1
\x9A\xAA\x1Cw=1
\x9F\xFF\xFF\xFF\xFF\xFF\xFF\xF6=1
\xB5UUJ=1
\xBA\xAA\x14\xB2=1
\xBDT\xAB\x17=1
\xBF\xFF\xFF\xFF\xFF\xFF\xFF\xF4=1
\xCA\xAA\xAA\xAA\xAA\xAA\xAA\x9E=1
\xD0\x00\xF9\x16=1
\xD5UUH=1
\xDA\xA9\xDA\x0A=1
\xDF\xFF\xFF\xFF\xFF\xFF\xFF\xF2=1
\xEA\xAA\xAA\xAA\xAA\xAA\xAA\x9C=1
\xF5UUF=1
_\xFF\xFF\xFF\xFF\xFF\xFF\xFA=1
eV1\xE3=1
j\xAA\xAA\xAA\xAA\xAA\xAA\xA4=1
16/07/04 09:27:27 ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes 
which lost big or tiny families, count=26
{code}

Third and fourth runs settle at the same 'loss' of 26 families.

> ITBLL fails, reports lost big or tine families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup 
> issue, but need to figure it out. Opening this to raise awareness and see if 
> someone saw that recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361598#comment-15361598
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-0.98-matrix #363 (See 
[https://builds.apache.org/job/HBase-0.98-matrix/363/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev f3002bf2f7e43e2846b50fcc20ac9185850f7075)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.
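
A throwaway checker for this situation (a sketch under two assumptions: the
default znode layout under /hbase/replication, and queue ids that begin with
the peer id; "zkhost:2181" is a placeholder):

{code:title=StaleQueueFinder.java|borderStyle=solid}
import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public class StaleQueueFinder {
  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, null);
    List<String> peers = zk.getChildren("/hbase/replication/peers", false);
    for (String rs : zk.getChildren("/hbase/replication/rs", false)) {
      for (String queue : zk.getChildren("/hbase/replication/rs/" + rs, false)) {
        String peerId = queue.split("-")[0]; // claimed queues carry suffixes
        if (!peers.contains(peerId)) {
          System.out.println("stale queue " + queue + " under rs " + rs);
        }
      }
    }
    zk.close();
  }
}
{code}

Queues flagged this way are what keep the log cleaner from deleting .oldlogs.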



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Patch Available  (was: Open)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-04 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Open  (was: Patch Available)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16074) ITBLL fails, reports lost big or tine families

2016-07-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-16074:
--
Attachment: HBASE-16074.branch-1.3.003.patch

Retry. Test passes locally. Retry.

Ran the ITBLL on my little cluster against tip of 1.3 and it fails with:

{code}

\xB7\xFF\xE6r=1
\xBA\x93\xA0\xE0\xF6\x5C\x8D\xA8>\xB82\x8F01\xC2S\x00=1
\xBC\xE0\x925\xD6H\x09\x0D\x0D\xF4Y\x8BA\x9F\xDA\x84\x00=1
\xBF\xFF\xFF\xFF\xFF\xFF\xFF\xF4=1
\xC8\xDA\xB2\xF9g\x00\xFET\x90@\xE9\xB25\xFD\xA2~\x00=1
\xCA\xAA\xAA\xAA\xAA\xAA\xAA\x9E=1
]\xADn\xE7#\xF3\xDB\xB3k\xAB\xF0k\x7F-\x1AA\x00=1
_\xFF\xFF\xFF\xFF\xFF\xFF\xFA=1
c\x85\xA4\x93HN8\xE7\x90\x8D\xA6\xA5\x8A\x15\xFF]\x00=1
eV1\xE3=1
h\xDB\x94\xEB\xA0\x82\xD4\x17\xF9\x1C\xE6o\xC9/\xE8$\x00=1
j\xAA\xAA\xAA\xAA\xAA\xAA\xA4=1
l\xCB\x83\xEC\x97\x86\xE1\x90\x7F\xA21J\x99\xF7Ji\x00=1
nH\x1C5\xD4\x16\xD9\xAE<\xE1E\xAF\x99\xBC\x1A\x8D\x00=1
uUUN=1
y\x14;9\x9E'\xE6\xB1E\xEE&\xE3\x9C`\x0E\x0D\x00=1
~W\x07t\xD2\x0B\x96\xF4\xD9P%h\xEA(\xBA\xC4\x00=1
16/07/03 22:39:39 ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes 
which lost big or tiny families, count=8669
{code}

> ITBLL fails, reports lost big or tine families
> --
>
> Key: HBASE-16074
> URL: https://issues.apache.org/jira/browse/HBASE-16074
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 1.3.0, 0.98.20
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
> Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup 
> issue, but need to figure it out. Opening this to raise awareness and see if 
> someone saw that recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361510#comment-15361510
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.1-JDK7 #1739 (See 
[https://builds.apache.org/job/HBase-1.1-JDK7/1739/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev 853e7d1dcd0f5caa64e1b83c1c532c5b917f817c)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361506#comment-15361506
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.1-JDK8 #1826 (See 
[https://builds.apache.org/job/HBase-1.1-JDK8/1826/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev 853e7d1dcd0f5caa64e1b83c1c532c5b917f817c)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361501#comment-15361501
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.2-IT #544 (See 
[https://builds.apache.org/job/HBase-1.2-IT/544/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev e3e39a693e91f5de77010f6b80b4111f377b03ce)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15916) incorrect work formating for shell scan

2016-07-04 Thread Alexey Diomin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Diomin updated HBASE-15916:
--
Attachment: 0001-HBASE-15916-incorrect-work-formating-for-shell-scan.patch

I checked the code; everything works correctly with the patch.
We always invoke table._hash_to_scan before sending a custom scanner into 
table._scan_internal.

> incorrect work formating for shell scan 
> 
>
> Key: HBASE-15916
> URL: https://issues.apache.org/jira/browse/HBASE-15916
> Project: HBase
>  Issue Type: Bug
>Reporter: Alexey Diomin
>Priority: Minor
> Attachments: 
> 0001-HBASE-15916-incorrect-work-formating-for-shell-scan.patch
>
>
> This commit changed the behavior of the 'scan' command: 
> https://github.com/apache/hbase/commit/e1e8434340f02907976f20566c3e55d8d627d4c4
> old behavior:
> 1. call table._scan_internal with args
> 2. table._scan_internal
> 2.1. calls @converters.clear()
> 2.2. creates the scan and fills @converters
> 2.3. returns the result
> 3. scan prints the result with formatters
> new behavior (the issue):
> 1. the scan command prepares the Scan and fills @converters
> 2. table._scan_internal
> 2.1. calls @converters.clear()
> 2.2. if we have scan != nil, uses it
> 2.3. returns the result
> 3. scan prints the result WITHOUT formatters
> P.S. this example doesn't work: 
> http://blog.cloudera.com/blog/2016/01/how-to-create-and-use-a-custom-formatter-in-the-apache-hbase-shell/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361455#comment-15361455
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-Trunk_matrix #1167 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1167/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev 6944a17ad4f039d05f76e1f75136bd121776e809)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestTableBasedReplicationSourceManagerImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManagerZkImpl.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361439#comment-15361439
 ] 

Hadoop QA commented on HBASE-14921:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 0s 
{color} | {color:blue} rubocop was not available. {color} |
| {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 0s 
{color} | {color:blue} Ruby-lint was not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
8s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
40s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 7s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 19s 
{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 57s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 2s 
{color} | {color:green} hbase-shell in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
47s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 159m 41s {color} 
| {color:black} {color} |

[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361422#comment-15361422
 ] 

Hudson commented on HBASE-16132:


SUCCESS: Integrated in HBase-1.3-IT #742 (See 
[https://builds.apache.org/job/HBase-1.3-IT/742/])
HBASE-16132 Scan does not return all the result when regionserver is (liyu: rev 
b3834d7f72af4b689bc49f799b9f64671af8be44)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java


> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when the regionserver is busy for a long 
> time: some scanners may return null even though they have not scanned all data.
> We found that ScannerCallableWithReplicas has a case that is not handled 
> correctly: when cs.poll times out and does not return any result, it will 
> return a null result, so the scan gets a null result and ends. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}
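
The hole is that cs.poll(...) returning null (a timeout) falls through to the
same "return null" as a finished scan. One way to make the timeout explicit, as
a sketch only (a variant of the first lines of the quoted block, not the
committed fix):

{code}
Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
if (f == null) {
  // No replica answered in time. Returning null here is indistinguishable
  // from a legitimate end-of-scan, so surface the timeout instead.
  throw new IOException("No scan response after " + timeout + "ms across replicas");
}
{code}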



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361423#comment-15361423
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.3-IT #742 (See 
[https://builds.apache.org/job/HBase-1.3-IT/742/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev a9b7f3f0219556fa8023fe4684aef3724e624597)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361417#comment-15361417
 ] 

Hudson commented on HBASE-16135:


SUCCESS: Integrated in HBase-1.3 #767 (See 
[https://builds.apache.org/job/HBase-1.3/767/])
HBASE-16135 PeerClusterZnode under rs of removed peer may never be (zhangduo: 
rev a9b7f3f0219556fa8023fe4684aef3724e624597)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java


> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16084) Clean up the stale references in javadoc

2016-07-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16084:
---
Description: 
From TestHFileOutputFormat2, e.g.:
{code}
 * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}.
{code}
CellSortReducer doesn't exist.

From TestSerialization.java:
{code}
   * Create a table of name name with {@link COLUMNS} for
{code}
COLUMNS cannot be found.

This issue is to clean up the stale references in javadoc.

  was:
From TestHFileOutputFormat2, e.g.:
{code}
 * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}.
{code}
CellSortReducer doesn't exist.

This issue is to clean up the stale references in javadoc.


> Clean up the stale references in javadoc
> 
>
> Key: HBASE-16084
> URL: https://issues.apache.org/jira/browse/HBASE-16084
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> From TestHFileOutputFormat2, e.g.:
> {code}
>  * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}.
> {code}
> CellSortReducer doesn't exist.
> From TestSerialization.java:
> {code}
>* Create a table of name name with {@link COLUMNS} for
> {code}
> COLUMNS cannot be found.
> This issue is to clean up the stale references in javadoc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361357#comment-15361357
 ] 

Anoop Sam John commented on HBASE-16162:


Pls see https://reviews.apache.org/r/49592
Readability is much better with RB patches

> Compacting Memstore : unnecessary push of active segments to pipeline
> -
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, 
> HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have flow like this
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
> private boolean shouldFlushInMemory() {
> if(getActive().getSize() > inmemoryFlushSize) {
>   // size above flush threshold
>   return (allowCompaction.get() && !inMemoryFlushInProgress.get());
> }
> return false;
>   }
> void flushInMemory() throws IOException {
> // Phase I: Update the pipeline
> getRegionServices().blockUpdates();
> try {
>   MutableSegment active = getActive();
>   pushActiveToPipeline(active);
> } finally {
>   getRegionServices().unblockUpdates();
> }
> // Phase II: Compact the pipeline
> try {
>   if (allowCompaction.get() && 
> inMemoryFlushInProgress.compareAndSet(false, true)) {
> // setting the inMemoryFlushInProgress flag again for the case this 
> method is invoked
> // directly (only in tests) in the common path setting from true to 
> true is idempotent
> // Speculative compaction execution, may be interrupted if flush is 
> forced while
> // compaction is in progress
> compactor.startCompaction();
>   }
> {code}
> So every write of a cell will trigger checkActiveSize(). When we 
> are at the border of an in-memory flush, many threads writing to this memstore 
> can get checkActiveSize() to pass. Yes, the AtomicBoolean is still 
> false; it is only turned ON some time later, once the new thread has started 
> running and pushed the active segment to the pipeline, etc.
> In the new thread's in-memory-flush code, we don't have any size check. It just 
> takes the active segment and pushes it to the pipeline. Yes, we don't allow any 
> new writes to the memstore at this time. But before that write lock on the 
> region is taken, other handler threads might also have added entries to this 
> thread pool. When the 1st one finishes, it releases the lock on the region, and 
> handler threads waiting to write to the memstore might get the lock and add 
> some data. Now this 2nd in-memory flush thread may get its chance, grab the 
> lock, and just take the current active segment and flush it in memory! This 
> will produce very small segments in the pipeline.
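
A sketch of one way to make a queued runnable that lost the race a no-op (an
assumption for illustration, reusing the names from the quoted snippet, not the
attached patch; the real compaction start is asynchronous and is simplified
here):

{code}
void flushInMemory() throws IOException {
  if (!inMemoryFlushInProgress.compareAndSet(false, true)) {
    return; // an earlier runnable is already flushing
  }
  try {
    getRegionServices().blockUpdates();
    try {
      MutableSegment active = getActive();
      // Re-check the size guard now that updates are blocked, so a second
      // queued runnable cannot push a nearly empty active segment.
      if (active.getSize() > inmemoryFlushSize) {
        pushActiveToPipeline(active);
      }
    } finally {
      getRegionServices().unblockUpdates();
    }
    compactor.startCompaction();
  } finally {
    inMemoryFlushInProgress.set(false);
  }
}
{code}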



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression

2016-07-04 Thread Robert James (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361344#comment-15361344
 ] 

Robert James commented on HBASE-5313:
-

This ticket seems to have been abandoned.  Why? The results posted by [~he 
yongqiang] show a lot of performance gain: half the disk usage.  Has it just 
been forgotten, or has a decision been made not to do this? Why?

> Restructure hfiles layout for better compression
> 
>
> Key: HBASE-5313
> URL: https://issues.apache.org/jira/browse/HBASE-5313
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HFile block contains a stream of key-values. Can we organize these kvs 
> on disk in a better way so that we get much greater compression ratios?
> One option (thanks Prakash) is to store all the keys in the beginning of the 
> block (let's call this the key-section) and then store all their 
> corresponding values towards the end of the block. This will allow us to 
> not even decompress the values when we are scanning and skipping over rows in 
> the block.
> Any other ideas? 
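
A purely illustrative encoding of that layout (an assumption sketched for
discussion, not an HBase format): keys are written contiguously up front, so a
key-only scan can stop before it ever reads the value section.

{code:title=KeySectionBlockWriter.java|borderStyle=solid}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

public final class KeySectionBlockWriter {
  public static byte[] encode(List<byte[]> keys, List<byte[]> values) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeInt(keys.size());
    for (byte[] k : keys) {   // key section: similar keys sit together and compress well
      out.writeInt(k.length);
      out.write(k);
    }
    for (byte[] v : values) { // value section: read only when a matching row's value is needed
      out.writeInt(v.length);
      out.write(v);
    }
    out.flush();
    return buf.toByteArray();
  }
}
{code}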



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-16132:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.3.0
   2.0.0
   Status: Resolved  (was: Patch Available)

Pushed into master, branch-1 and branch-1.3, thanks [~aoxiang] for the patch, 
thanks all for review.

[~mantonov] FYI since this goes into branch-1.3

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when the regionserver is busy for a long 
> time: some scanners may return null even though they have not scanned all data.
> We found that ScannerCallableWithReplicas has a case that is not handled 
> correctly: when cs.poll times out and does not return any result, it will 
> return a null result, so the scan gets a null result and ends. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16172) Unify the retry logic in ScannerCallableWithReplicas and RpcRetryingCallerWithReadReplicas

2016-07-04 Thread Yu Li (JIRA)
Yu Li created HBASE-16172:
-

 Summary: Unify the retry logic in ScannerCallableWithReplicas and 
RpcRetryingCallerWithReadReplicas
 Key: HBASE-16172
 URL: https://issues.apache.org/jira/browse/HBASE-16172
 Project: HBase
  Issue Type: Bug
Reporter: Yu Li
Assignee: Yu Li


The issue was pointed out by [~devaraj] in HBASE-16132 (thanks D.D.): in 
{{RpcRetryingCallerWithReadReplicas#call}} we call 
{{ResultBoundedCompletionService#take}} instead of {{poll}} to dead-wait on the 
second replica if the first one timed out, while in 
{{ScannerCallableWithReplicas#call}} we still use 
{{ResultBoundedCompletionService#poll}} with some timeout for the 2nd replica.

This JIRA aims to discuss whether to unify the logic in these two kinds of 
caller with region replicas, and to take action if necessary.
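
For reference, the java.util.concurrent semantics the two callers straddle (a
standalone illustration, not HBase code):

{code:title=PollVsTake.java|borderStyle=solid}
import java.util.concurrent.*;

public class PollVsTake {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CompletionService<String> cs = new ExecutorCompletionService<>(pool);
    cs.submit(() -> { Thread.sleep(200); return "replica-0"; });
    // poll gives up after the timeout and returns null...
    Future<String> f = cs.poll(50, TimeUnit.MILLISECONDS);
    System.out.println("poll after 50ms: " + f); // null, nothing finished yet
    // ...while take() dead-waits until some submitted task completes.
    System.out.println("take: " + cs.take().get());
    pool.shutdown();
  }
}
{code}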



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-04 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361332#comment-15361332
 ] 

Yu Li commented on HBASE-16132:
---

Have opened HBASE-16172 as a follow up

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when the regionserver is busy for a long 
> time: some scanners may return null even though they have not scanned all data.
> We found that ScannerCallableWithReplicas has a case that is not handled 
> correctly: when cs.poll times out and does not return any result, it will 
> return a null result, so the scan gets a null result and ends. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Eshcar Hillel (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361300#comment-15361300
 ] 

Eshcar Hillel commented on HBASE-16162:
---

As far as I can see (it is a bit hard to read the code in the patch) the patch 
is OK.
Just one thing -- the scope of the try-blocks in flushInMemory () seems odd. 
Shouldn't it include the lines which are reverted in the finally-block? 
(namely, setting the atomic boolean and acquiring the lock)

> Compacting Memstore : unnecessary push of active segments to pipeline
> -
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, 
> HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have flow like this
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
> private boolean shouldFlushInMemory() {
> if(getActive().getSize() > inmemoryFlushSize) {
>   // size above flush threshold
>   return (allowCompaction.get() && !inMemoryFlushInProgress.get());
> }
> return false;
>   }
> void flushInMemory() throws IOException {
> // Phase I: Update the pipeline
> getRegionServices().blockUpdates();
> try {
>   MutableSegment active = getActive();
>   pushActiveToPipeline(active);
> } finally {
>   getRegionServices().unblockUpdates();
> }
> // Phase II: Compact the pipeline
> try {
>   if (allowCompaction.get() && 
> inMemoryFlushInProgress.compareAndSet(false, true)) {
> // setting the inMemoryFlushInProgress flag again for the case this 
> method is invoked
> // directly (only in tests) in the common path setting from true to 
> true is idempotent
> // Speculative compaction execution, may be interrupted if flush is 
> forced while
> // compaction is in progress
> compactor.startCompaction();
>   }
> {code}
> So every write of a cell will trigger checkActiveSize(). When we 
> are at the border of an in-memory flush, many threads writing to this memstore 
> can get checkActiveSize() to pass. Yes, the AtomicBoolean is still 
> false; it is only turned ON some time later, once the new thread has started 
> running and pushed the active segment to the pipeline, etc.
> In the new thread's in-memory-flush code, we don't have any size check. It just 
> takes the active segment and pushes it to the pipeline. Yes, we don't allow any 
> new writes to the memstore at this time. But before that write lock on the 
> region is taken, other handler threads might also have added entries to this 
> thread pool. When the 1st one finishes, it releases the lock on the region, and 
> handler threads waiting to write to the memstore might get the lock and add 
> some data. Now this 2nd in-memory flush thread may get its chance, grab the 
> lock, and just take the current active segment and flush it in memory! This 
> will produce very small segments in the pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted

2016-07-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-16135:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to all branches. Thanks all for reviewing.

> PeerClusterZnode under rs of removed peer may never be deleted
> --
>
> Key: HBASE-16135
> URL: https://issues.apache.org/jira/browse/HBASE-16135
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, 
> HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, 
> HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135-v3.patch, 
> HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs 
> directory had almost the same size as the data directory.
> Finally we found the problem: we removed a peer about 3 months ago, 
> but there are still some replication queue znodes under some rs nodes. This 
> prevents the deletion of .oldlogs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-07-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361242#comment-15361242
 ] 

Anastasia Braginsky commented on HBASE-14921:
-

Trying again with new patch:  HBASE-14921-V05-CAO.patch

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14921) Memory optimizations

2016-07-04 Thread Anastasia Braginsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anastasia Braginsky updated HBASE-14921:

Attachment: HBASE-14921-V05-CAO.patch

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-07-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361215#comment-15361215
 ] 

Anastasia Braginsky commented on HBASE-14921:
-

Hey Guys!

Added a new patch: HBASE-14921-V04-CA-V02.patch
It includes bug fixes that you have found and that we have found as well. It 
also reflects fixes for the majority of your code review comments. If you do 
not see your comment addressed, I should have written an answer near your 
review comment.
The patch is available on the same review board as another diff.

[~anoop.hbase], [~stack], [~tedyu], [~ram_krish], please take a look at the 
code.
Your comments are very welcome! You can also raise an issue from a previous 
review once again.

Thanks,
Anastasia 


> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361210#comment-15361210
 ] 

Hadoop QA commented on HBASE-14921:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HBASE-14921 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12816025/HBASE-14921-V04-CA-V02.patch
 |
| JIRA Issue | HBASE-14921 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/2516/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14921) Memory optimizations

2016-07-04 Thread Anastasia Braginsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anastasia Braginsky updated HBASE-14921:

Attachment: HBASE-14921-V04-CA-V02.patch

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15945) Patch for Cell and CellImpl

2016-07-04 Thread Sudeep Sunthankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361173#comment-15361173
 ] 

Sudeep Sunthankar commented on HBASE-15945:
---

Thanks for the feedback, Elliott. We have uploaded a patch based on your inputs.

> Patch for Cell and CellImpl
> ---
>
> Key: HBASE-15945
> URL: https://issues.apache.org/jira/browse/HBASE-15945
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-15945-HBASE-14850.v2.patch, 
> HBASE-15945-HBASE-14850.v3.patch, HBASE-15945-HBASE-14850.v4.patch, 
> HBASE-15945.HBASE-14850.v1.patch, HBASE-15945.HBASE-14850.v5.patch, 
> HBASE-15945.HBASE-14850.v6.patch
>
>
> This patch contains an implementation of KeyValue, Bytes and Cell modeled 
> along the lines of the Java implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15945) Patch for Cell and CellImpl

2016-07-04 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar updated HBASE-15945:
--
Attachment: HBASE-15945.HBASE-14850.v6.patch

This patch consists of a Cell implementation without any additional classes or 
interfaces.

> Patch for Cell and CellImpl
> ---
>
> Key: HBASE-15945
> URL: https://issues.apache.org/jira/browse/HBASE-15945
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-15945-HBASE-14850.v2.patch, 
> HBASE-15945-HBASE-14850.v3.patch, HBASE-15945-HBASE-14850.v4.patch, 
> HBASE-15945.HBASE-14850.v1.patch, HBASE-15945.HBASE-14850.v5.patch, 
> HBASE-15945.HBASE-14850.v6.patch
>
>
> This patch contains an implementation of KeyValue, Bytes and Cell modeled 
> along the lines of the Java implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16169) RegionSizeCalculator should not depend on master

2016-07-04 Thread li xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361104#comment-15361104
 ] 

li xiang commented on HBASE-16169:
--

Hi Thiruvel, I happened to read this JIRA and it would be awesome if it could 
be implemented.
So do you plan to add the API into HRegionServer?
Also, ServerLoad.getRegionLoad() seems to provide a similar function: it 
returns a map of region name to RegionLoad for the regions hosted by the 
region server. Do you mind elaborating a bit more on your proposal? What are 
the differences between the API you are proposing and 
ServerLoad.getRegionLoad()?

Please correct me if I did not get it right.

> RegionSizeCalculator should not depend on master
> 
>
> Key: HBASE-16169
> URL: https://issues.apache.org/jira/browse/HBASE-16169
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, scaling
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
>
> RegionSizeCalculator is needed for better split generation in MR jobs. This 
> requires RegionLoad, which can be obtained via ClusterStatus, i.e. by 
> accessing the Master. We don't want the master to be in this path.
> The proposal is to add an API to the RegionServer that returns the RegionLoad 
> of all regions hosted on it, or those of a table if specified. 
> RegionSizeCalculator can use the latter.
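
A hedged sketch of how RegionSizeCalculator could consume such a per-RS API 
(the admin.getRegionLoad(server, table) call below stands in for the proposed 
RS-side API and is not an existing call):

{code}
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import org.apache.hadoop.hbase.RegionLoad;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical flow: ask each region server directly for the RegionLoad of the
// table's regions instead of pulling ClusterStatus from the master.
Map<byte[], Long> regionSizes(Admin admin, Iterable<ServerName> servers,
    TableName table) throws IOException {
  Map<byte[], Long> sizeMap = new TreeMap<>(Bytes.BYTES_COMPARATOR);
  for (ServerName server : servers) {
    for (RegionLoad load : admin.getRegionLoad(server, table).values()) { // proposed API
      sizeMap.put(load.getName(), load.getStorefileSizeMB() * 1024L * 1024L);
    }
  }
  return sizeMap;
}
{code}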



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16171) Fix the potential problems in TestHCM.testConnectionCloseAllowsInterrupt

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361101#comment-15361101
 ] 

Hadoop QA commented on HBASE-16171:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
22s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
12s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
54m 30s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 48s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 176m 54s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
|   | hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint |
| Timed out junit tests | 
org.apache.hadoop.hbase.regionserver.TestTimestampFilterSeekHint |
|   | org.apache.hadoop.hbase.regionserver.TestFailedAppendAndSync |
|   | org.apache.hadoop.hbase.regionserver.TestHMobStore |
|   | org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents |
|   | org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815978/HBASE-16171.001.patch 
|
| JIRA Issue | HBASE-16171 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency 

[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361092#comment-15361092
 ] 

Anoop Sam John commented on HBASE-16162:


Please refer to patch xxx_V4.patch; that is the numbering scheme we usually 
follow :-)
This patch is a bit different from what you pasted above. The 
inMemoryFlushInProgress CAS happens in shouldFlushInMemory() only. Within 
flushInMemory() the atomic boolean is just set to true (for the test cases); I 
am doing that at the start, not after the push to the pipeline, because the 
variable's responsibility is not just compaction now. Also, the reset to false 
happens in the finally block, not in MemstoreCompactor, as you suggested.


> Compacting Memstore : unnecessary push of active segments to pipeline
> -
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, 
> HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have a flow like this:
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
>
> private boolean shouldFlushInMemory() {
>   if (getActive().getSize() > inmemoryFlushSize) {
>     // size above flush threshold
>     return (allowCompaction.get() && !inMemoryFlushInProgress.get());
>   }
>   return false;
> }
>
> void flushInMemory() throws IOException {
>   // Phase I: Update the pipeline
>   getRegionServices().blockUpdates();
>   try {
>     MutableSegment active = getActive();
>     pushActiveToPipeline(active);
>   } finally {
>     getRegionServices().unblockUpdates();
>   }
>   // Phase II: Compact the pipeline
>   try {
>     if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
>       // setting the inMemoryFlushInProgress flag again for the case this method is
>       // invoked directly (only in tests); in the common path setting from true to
>       // true is idempotent
>       // Speculative compaction execution, may be interrupted if flush is forced
>       // while compaction is in progress
>       compactor.startCompaction();
>     }
> {code}
> So every write of a cell will trigger the check in checkActiveSize(). When we 
> are at the border of an in-memory flush, many threads writing to this memstore 
> can get checkActiveSize() to pass. Yes, the AtomicBoolean is still false at 
> this point; it is turned ON only some time later, once the new thread has 
> started running and pushed the active segment to the pipeline.
> In the new thread's in-memory flush code we don't have any size check. It just 
> takes the active segment and pushes it to the pipeline. Yes, we don't allow any 
> new writes to the memstore at this time. But before that write lock on the 
> region is taken, another handler thread might also have added an entry to this 
> thread pool. When the 1st one finishes, it releases the lock on the region and 
> handler threads trying to write to the memstore might get the lock and add 
> some data. Now this 2nd in-memory flush thread may get a chance, acquire the 
> lock, and just take the current active segment and flush it in memory! This 
> will produce very small sized segments in the pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-07-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361086#comment-15361086
 ] 

Anastasia Braginsky commented on HBASE-14921:
-

The sizing was wrong :) 
Good eye you have :)
Patch with all the fixes :) will be ready today

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA.patch, 
> InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16164) Missing close of new compacted segments in few occasions which might leak MSLAB chunks from pool

2016-07-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361072#comment-15361072
 ] 

Anastasia Braginsky commented on HBASE-16164:
-

OK, leave the fix as is. Thanks!!!

> Missing close of new compacted segments in few occasions which might leak 
> MSLAB chunks from pool
> 
>
> Key: HBASE-16164
> URL: https://issues.apache.org/jira/browse/HBASE-16164
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-16164.patch
>
>
> An in-memory compaction of N segments is in progress. In between, a snapshot() 
> call comes, so we stop the in-progress compaction. This just sets an 
> AtomicBoolean. We check this boolean state in the compaction loop (the while 
> loop reading the cells from the segments) and before swapping the segments. 
> But if this scenario comes, we just ignore the newly compacted segment. This 
> is a problem maker when we work with the MSLAB pool. The new segment would 
> have acquired some chunks, but when will they get released? As we don't close 
> the segment, this will leak them.
> Also in swap we have
> {code}
> public boolean swap(VersionedSegmentsList versionedList, ImmutableSegment segment) {
>   if (versionedList.getVersion() != version) {
>     return false;
>   }
>   LinkedList suffix;
>   synchronized (pipeline) {
>     if (versionedList.getVersion() != version) {
>       return false;
>     }
> {code}
> I don't see any possibility for this code flow to happen. Still, for 
> correctness, we should close the segment here too.
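
A minimal sketch of the fix idea (names abridged from the discussion; see the 
attached patch for the real change):

{code}
// After compacting, if the result cannot be swapped into the pipeline (e.g.
// the compaction was interrupted by a snapshot()), close the new segment so
// the MSLAB chunks it acquired are returned to the pool instead of leaking.
ImmutableSegment result = compactSegments(versionedList);
if (!isInterrupted.get() && pipeline.swap(versionedList, result)) {
  return; // swapped in: the pipeline now owns the segment and its chunks
}
result.close(); // not swapped in: release its MSLAB chunks explicitly
{code}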



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361029#comment-15361029
 ] 

Anastasia Braginsky commented on HBASE-16162:
-

Hi [~anoop.hbase]!

Thank you for all your effort and the important input!
I agree with you that checking the boolean created for testing is too heavy to 
be done in the common *add* path.
Although the check happens only when (getActive().getSize() > 
inmemoryFlushSize), this check is not needed (it was there for future use).
We can remove the allowCompaction.get() from shouldFlushInMemory().

I am already confused by all the patches. Maybe you can put the final version 
on the review board?
For example, I see in master that 
inMemoryFlushInProgress.compareAndSet(false, true) is used in flushInMemory() 
and it should not be there... Is it just old code? Or one of the fixes?
Generally, it is important to do the inMemoryFlushInProgress CAS *only once*, 
and in shouldFlushInMemory().

Anyway, I am putting here the code as I think it should be. The code is taken 
from the new HBASE-14921 patch.

{code}
  // internally used method, externally visible only for tests
  // when invoked directly from tests it must be verified that the caller
  // doesn't hold updatesLock, otherwise there is a deadlock
  @VisibleForTesting
  void flushInMemory() throws IOException {
    // Phase I: Update the pipeline
    getRegionServices().blockUpdates();
    try {
      MutableSegment active = getActive();
      if (LOG.isDebugEnabled()) {
        LOG.debug("IN-MEMORY FLUSH: Pushing active segment into compaction pipeline, "
            + "and initiating compaction.");
      }
      pushActiveToPipeline(active);
    } finally {
      getRegionServices().unblockUpdates();
    }
    // Phase II: Compact the pipeline
    try {
      if (allowCompaction.get()) {
        // setting the inMemoryFlushInProgress flag again for the case this method
        // is invoked directly (only in tests); in the common path setting from
        // true to true is idempotent
        inMemoryFlushInProgress.set(true);
        // Speculative compaction execution, may be interrupted if flush is forced
        // while compaction is in progress
        compactor.start();
      }
    } catch (IOException e) {
      LOG.warn("Unable to run memstore compaction. region "
          + getRegionServices().getRegionInfo().getRegionNameAsString()
          + " store: " + getFamilyName(), e);
    } finally {
      stopCompaction();
    }
  }

  private byte[] getFamilyNameInByte() {
    return store.getFamily().getName();
  }

  private ThreadPoolExecutor getPool() {
    return getRegionServices().getInMemoryCompactionPool();
  }

  private boolean shouldFlushInMemory() {
    if (getActive().getSize() > inmemoryFlushSize) { // size above flush threshold
      // the inMemoryFlushInProgress flag is CASed to true here in order to
      // mutually exclude the insert of the active segment into the compaction pipeline
      return (inMemoryFlushInProgress.compareAndSet(false, true));
    }
    return false;
  }
{code}

Thank you very much once again! :)
Very sorry for probably being unclear and slow :) :)

> Compacting Memstore : unnecessary push of active segments to pipeline
> -
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, 
> HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have a flow like this:
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
>
> private boolean shouldFlushInMemory() {
>   if (getActive().getSize() > inmemoryFlushSize) {
>     // size above flush threshold
>     return (allowCompaction.get() && !inMemoryFlushInProgress.get());
>   }
>   return false;
> }
>
> void flushInMemory() throws IOException {
>   // Phase I: Update the pipeline
>   getRegionServices().blockUpdates();
>   try {
>     MutableSegment active = getActive();
>     pushActiveToPipeline(active);
>   } finally {
>     getRegionServices().unblockUpdates();
>   }
>   // Phase II: Compact the pipeline
>   try {
>     if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
>       // setting the inMemoryFlushInProgress flag again for the case this method is
>       // invoked directly (only in tests); in the common path setting from true to
>       // true is idempotent
>       // Speculative compaction execution, may be interrupted if flush is forced
> 

[jira] [Updated] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread Hiroshi Ikeda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hiroshi Ikeda updated HBASE-15716:
--
Attachment: ScannerReadPoints.v2.java

Added a revised class, but messy :(


> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: Hiroshi Ikeda
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 
> 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 
> 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, 
> Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 
> AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 
> 9.54.08 PM.png, TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).
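
A condensed sketch of the contended pattern being described (field names are 
illustrative):

{code}
// Every region scanner construction funnels through one region-scoped monitor:
synchronized (scannerReadPoints) {        // region-scoped CSLM
  readPt = getReadPoint(isolationLevel);  // obtain our MVCC read point
  scannerReadPoints.put(this, readPt);    // register scanner -> read point
}
// Under a random-read workload this synchronized block becomes the hot lock.
{code}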



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread Hiroshi Ikeda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hiroshi Ikeda updated HBASE-15716:
--
Assignee: stack  (was: Hiroshi Ikeda)

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, ScannerReadPoints.v2.java, Screen Shot 2016-04-26 at 
> 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 
> 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, 
> Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 
> AM.png, Screen Shot 2016-06-30 at 9.52.52 PM.png, Screen Shot 2016-06-30 at 
> 9.54.08 PM.png, TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread Hiroshi Ikeda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hiroshi Ikeda updated HBASE-15716:
--
Assignee: Hiroshi Ikeda  (was: stack)

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: Hiroshi Ikeda
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, Screen Shot 2016-04-26 at 2.05.45 PM.png, Screen Shot 
> 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, 
> Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 
> PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, Screen Shot 2016-06-30 at 
> 9.52.52 PM.png, Screen Shot 2016-06-30 at 9.54.08 PM.png, 
> TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-04 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361009#comment-15361009
 ] 

Hiroshi Ikeda commented on HBASE-15716:
---

bq. You have tests that prove going backwards?

It is difficult to test; I found it by reasoning about the code.

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, Screen Shot 2016-04-26 at 2.05.45 PM.png, Screen Shot 
> 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, 
> Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 
> PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, Screen Shot 2016-06-30 at 
> 9.52.52 PM.png, Screen Shot 2016-06-30 at 9.54.08 PM.png, 
> TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region-scoped CSLM. This is done under a 
> synchronized block on the CSLM.
> This synchronization on a region-scoped Map when creating region scanners is 
> the outstanding point of lock contention according to flight recorder (my 
> workload is workload C, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360981#comment-15360981
 ] 

Hadoop QA commented on HBASE-16162:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
55s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
52s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
56s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 58s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 21s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 134m 20s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815984/HBASE-16162_V4.patch |
| JIRA Issue | HBASE-16162 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / d22c23c |
| Default Java | 1.7.0_80 |
| Multi-JDK versions |  /home/jenkins/tools/java/jdk1.8.0:1.8.0 

[jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360928#comment-15360928
 ] 

Hadoop QA commented on HBASE-16074:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
55s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
0s {color} | {color:green} branch-1.3 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
40s {color} | {color:green} branch-1.3 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 13s 
{color} | {color:red} hbase-server in branch-1.3 has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} branch-1.3 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} branch-1.3 passed with JDK v1.7.0_80 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 41s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 1s 
{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 9s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
28s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 125m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12815981/HBASE-16074.branch-1.3.003.patch
 |
| JIRA Issue | HBASE-16074 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  

[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader

2016-07-04 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360884#comment-15360884
 ] 

Hiroshi Ikeda commented on HBASE-14479:
---

RpcServer.Responder is sort of a safety net used when the native send buffer 
of a socket is full, and it is rarely used if clients are well-behaved and 
wait for the response to each request. That means YCSB must be issuing 
multiple requests simultaneously on one connection.

I checked the source of RpcServer and found that the method 
Reader.doRead(SelectionKey) handles just one request per call, regardless of 
whether the next request is already available, unless requests go through 
SASL. That makes the patch of this issue unnecessarily change the registration 
of a connection's selection key for each request, causing overhead (as shown 
by sun.nio.ch.EPollArrayWrapper::updateRegistrations, though I didn't expect 
such a difference in throughput).

BTW, if we resolve this by reading as many requests from a connection as 
possible, the queue will easily become full and it will be difficult to handle 
requests fairly across connections. I think it is better to cap the number of 
requests executing simultaneously for each connection, according to the 
requests currently queued (instead of using a fixed bounded queue).
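
A rough sketch of that capping idea (all names hypothetical, not from any 
attached patch):

{code}
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;

class ConnectionThrottle {
  private final Semaphore inFlight;

  ConnectionThrottle(int maxConcurrentCallsPerConnection) {
    this.inFlight = new Semaphore(maxConcurrentCallsPerConnection);
  }

  /** The reader calls this before dispatching a request from the connection. */
  boolean tryDispatch(Runnable call, Executor pool) {
    if (!inFlight.tryAcquire()) {
      return false; // over the cap: leave the request pending for now
    }
    pool.execute(() -> {
      try {
        call.run();
      } finally {
        inFlight.release(); // frees a slot for this connection's next request
      }
    });
    return true;
  }
}
{code}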

> Apply the Leader/Followers pattern to RpcServer's Reader
> 
>
> Key: HBASE-14479
> URL: https://issues.apache.org/jira/browse/HBASE-14479
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
>Priority: Minor
> Attachments: HBASE-14479-V2 (1).patch, HBASE-14479-V2.patch, 
> HBASE-14479-V2.patch, HBASE-14479.patch, flamegraph-19152.svg, 
> flamegraph-32667.svg, gc.png, gets.png, io.png, median.png
>
>
> {{RpcServer}} uses multiple selectors to read data for load distribution, but 
> the distribution is just done by round-robin. It is uncertain, especially for 
> long run, whether load is equally divided and resources are used without 
> being wasted.
> Moreover, multiple selectors may cause excessive context switches which give 
> priority to low latency (while we just add the requests to queues), and it is 
> possible to reduce throughput of the whole server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely

2016-07-04 Thread Phil Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Yang updated HBASE-16144:
--
Attachment: HBASE-16144-v3.patch

The TTL in ReplicationZKLockCleanerChore will be changed in a test case, so it 
cannot be set to final.
Fixed other issues.

> Replication queue's lock will live forever if RS acquiring the lock has died 
> prematurely
> 
>
> Key: HBASE-16144
> URL: https://issues.apache.org/jira/browse/HBASE-16144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.1, 1.1.5, 0.98.20
>Reporter: Phil Yang
>Assignee: Phil Yang
> Attachments: HBASE-16144-v1.patch, HBASE-16144-v2.patch, 
> HBASE-16144-v3.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if we 
> set hbase.zookeeper.useMulti=false, we add a lock first, then copy the nodes, 
> and finally clean up the old queue and the lock.
> However, if the RS acquiring the lock crashes before claimQueues is done, the 
> lock will stay there forever and other RSs can never claim the queue.
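
A rough sketch of a TTL-based cleaner for such stale locks (paths are 
illustrative; the real implementation is the ReplicationZKLockCleanerChore in 
the patches):

{code}
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Periodically scan the per-queue lock znodes and delete any lock older than
// the TTL, assuming its owning RS died before finishing claimQueues.
void cleanStaleLocks(ZooKeeper zk, String rsRoot, long ttlMs) throws Exception {
  for (String rs : zk.getChildren(rsRoot, false)) {
    String lockPath = rsRoot + "/" + rs + "/lock";
    Stat stat = zk.exists(lockPath, false);
    if (stat != null && System.currentTimeMillis() - stat.getMtime() > ttlMs) {
      zk.delete(lockPath, stat.getVersion()); // stale lock: remove it
    }
  }
}
{code}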



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)