[jira] [Updated] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-04 Thread Eshcar Hillel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eshcar Hillel updated HBASE-18294:
--
Status: Open  (was: Patch Available)

> Reduce global heap pressure: flush based on heap occupancy
> --
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch, HBASE-18294.15.patch, HBASE-18294.16.patch, 
> HBASE-18294.master.01.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-04 Thread Eshcar Hillel (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351674#comment-16351674
 ] 

Eshcar Hillel commented on HBASE-18294:
---

Here is my response from RB regarding the change in HCD

In the new implementation we compare on-heap size against on-heap threshold and 
off-heap size against off-heap threshold. Exceeding one of these two thresholds 
triggers a flush. If we only consider one unified threshold for on- and 
off-heap data we can get OOME.
For example, consider an off-heap cluster with 96GB memory machines, and assume 
in each machine 90GB are allocated to off-heap data and 6GB are allocated to 
on-heap objects. Assume we manage 700 regions per RS, and the flush size is 
128MB. It only takes 46 regions reaching 128MB on-heap, or all regions 
exceeding 8.5MB(!) on-heap, to get an OOME.
Therefore I think giving the user optional(!) control over the flush size for 
both on-heap and off-heap data is essential. Both settings fall back on a 
default value, so both are optional and should not be an unnecessary burden on 
the user.
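
As a rough check of the arithmetic above, here is a minimal sketch (not taken from the patch). The class and method names are hypothetical, sizes use binary units, and the whole 6GB is treated as available to memstores, which is why it yields 48 regions and about 8.78MB per region rather than the 46 and 8.5MB quoted above. It also illustrates why a single threshold on the combined size can miss on-heap pressure that a separate on-heap threshold catches.

```java
import java.util.Locale;

class FlushThresholdSketch {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Number of regions that fill the on-heap budget when each holds flushSize bytes on-heap. */
    static long regionsToExhaust(long onHeapBudget, long flushSize) {
        return onHeapBudget / flushSize;
    }

    /** Per-region on-heap size (in MB) at which all regions together fill the on-heap budget. */
    static double perRegionLimitMb(long onHeapBudget, int regions) {
        return (double) onHeapBudget / regions / MB;
    }

    /** Single threshold applied to the combined on-heap plus off-heap size. */
    static boolean shouldFlushUnified(long onHeap, long offHeap, long threshold) {
        return onHeap + offHeap > threshold;
    }

    /** Separate thresholds: exceeding either one triggers a flush. */
    static boolean shouldFlushSeparate(long onHeap, long onHeapThreshold,
                                       long offHeap, long offHeapThreshold) {
        return onHeap > onHeapThreshold || offHeap > offHeapThreshold;
    }

    public static void main(String[] args) {
        long onHeapBudget = 6 * GB;
        long flushSize = 128 * MB;
        System.out.println(regionsToExhaust(onHeapBudget, flushSize)); // 48
        System.out.println(String.format(Locale.ROOT, "%.2f", perRegionLimitMb(onHeapBudget, 700))); // 8.78

        // A region holding 9MB on-heap and 50MB off-heap (illustrative values):
        // 700 such regions need 6300MB on-heap, more than the 6144MB budget.
        long onHeap = 9 * MB;
        long offHeap = 50 * MB;
        System.out.println(shouldFlushUnified(onHeap, offHeap, flushSize));           // false
        System.out.println(shouldFlushSeparate(onHeap, 8 * MB, offHeap, flushSize));  // true
    }
}
```

The 8MB on-heap threshold in the last call is only an illustrative value, not a default from HBase.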

 

I also see I need to re-base my patch. Waiting for review comments to create my 
next patch.



[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-04 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351714#comment-16351714
 ] 

Edward Bortnikov commented on HBASE-18294:
--

I second [~eshcar]. Off-heap and on-heap memory are different resources, with 
potentially very different allocations within the same machine. The code 
already addresses them separately all the way through. The user does need this 
(optional) design knob.  



[jira] [Commented] (HBASE-19929) Call RS.stop on a session expired RS may hang

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351722#comment-16351722
 ] 

Duo Zhang commented on HBASE-19929:
---

AsyncDFSClient is not the problem. The problem is AsyncFSWAL. By design it will 
not fail any requests and will always try to open a new writer to write the 
pending requests. When rolling fails, the log roller will abort the RS, and 
when aborting we will close the WAL and the pending sync will be notified.

The problem here is that we enter the shutdown processing before setting 
abortRequested to true, so we first try to flush all the regions and wait for 
them to be closed. Only then do we find that the WAL is broken and that there 
is an abort request from the log roller, but that does not help: the WAL is 
closed only after the wait for the regions to be closed, so it is something 
like a deadlock here...

So I think a possible solution is to close the WAL directly when the log roller 
wants to abort an RS. Let me prepare a patch.

Thanks.
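
The ordering problem described in this comment can be illustrated with a small, single-threaded sketch. It is not HBase code: `BrokenWal`, `closeRegion` and the returned strings are hypothetical, and the pending sync is modeled as a `CompletableFuture` that only completes (exceptionally) once the WAL is closed. A short timeout stands in for the indefinite wait.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class WalCloseOrderSketch {
    /** Hypothetical WAL whose pending sync can never succeed, e.g. because its directory was moved. */
    static final class BrokenWal {
        private final CompletableFuture<Void> pendingSync = new CompletableFuture<>();

        /** Closing the WAL notifies the pending sync by failing it. */
        void close() {
            pendingSync.completeExceptionally(new IOException("WAL closed"));
        }
    }

    /** Closing a region waits for the flush marker sync; the timeout stands in for waiting forever. */
    static String closeRegion(BrokenWal wal, long timeoutMs) {
        try {
            wal.pendingSync.get(timeoutMs, TimeUnit.MILLISECONDS);
            return "closed";
        } catch (TimeoutException e) {
            return "stuck";        // nobody closed the WAL, so the sync is never notified
        } catch (ExecutionException e) {
            return "failed fast";  // the WAL was closed first, so the wait ends immediately
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // Current order: wait for regions to close, and only afterwards close the WAL.
        BrokenWal first = new BrokenWal();
        System.out.println(closeRegion(first, 50)); // stuck

        // Proposed order: the log roller closes the WAL directly when it wants to abort.
        BrokenWal second = new BrokenWal();
        second.close();
        System.out.println(closeRegion(second, 50)); // failed fast
    }
}
```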

> Call RS.stop on a session expired RS may hang
> -
>
> Key: HBASE-19929
> URL: https://issues.apache.org/jira/browse/HBASE-19929
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
>
> See the discussion in HBASE-19927. The problem is that, for a normal stop we 
> will try to close all the regions and wait until they are all closed. But if 
> the RS's session has already expired, the master will start the failover 
> work, which will move the WAL directory, and then we will be stuck writing 
> the flush marker.





[jira] [Commented] (HBASE-19926) Use a separated class to implement the WALActionListener for Replication

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351723#comment-16351723
 ] 

Duo Zhang commented on HBASE-19926:
---

Let me commit.

> Use a separated class to implement the WALActionListener for Replication
> 
>
> Key: HBASE-19926
> URL: https://issues.apache.org/jira/browse/HBASE-19926
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19926-v1.patch, HBASE-19926.patch
>
>






[jira] [Commented] (HBASE-19726) Failed to start HMaster due to infinite retrying on meta assign

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351758#comment-16351758
 ] 

Hudson commented on HBASE-19726:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4524 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4524/])
HBASE-19726 Failed to start HMaster due to infinite retrying on meta (stack: 
rev b0e998f2a50a50a8d84daa35baff1d4ac99d1c6a)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java


> Failed to start HMaster due to infinite retrying on meta assign
> ---
>
> Key: HBASE-19726
> URL: https://issues.apache.org/jira/browse/HBASE-19726
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 19726.patch
>
>
> This is what I got at first, an exception when trying to write something to 
> meta when meta has not been onlined yet.
> {noformat}
> 2018-01-07,21:03:14,389 INFO org.apache.hadoop.hbase.master.HMaster: Running 
> RecoverMetaProcedure to ensure proper hbase:meta deploy.
> 2018-01-07,21:03:14,637 INFO 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure: Start pid=1, 
> state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure 
> failedMetaServer=null, splitWal=true
> 2018-01-07,21:03:14,645 INFO org.apache.hadoop.hbase.master.MasterWalManager: 
> Log folder 
> hdfs://c402tst-community/hbase/c402tst-community/WALs/c4-hadoop-tst-st27.bj,38900,1515330173896
>  belongs to an existing region server
> 2018-01-07,21:03:14,646 INFO org.apache.hadoop.hbase.master.MasterWalManager: 
> Log folder 
> hdfs://c402tst-community/hbase/c402tst-community/WALs/c4-hadoop-tst-st29.bj,38900,1515330177232
>  belongs to an existing region server
> 2018-01-07,21:03:14,648 INFO 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure: pid=1, 
> state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure 
> failedMetaServer=null, splitWal=true; Retaining meta assignment to server=null
> 2018-01-07,21:03:14,653 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Initialized 
> subprocedures=[{pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; 
> AssignProcedure table=hbase:meta, region=1588230740}]
> 2018-01-07,21:03:14,660 INFO 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler: pid=2, 
> ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
> table=hbase:meta, region=1588230740 hbase:meta hbase:meta,,1.1588230740
> 2018-01-07,21:03:14,663 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Start pid=2, 
> ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
> table=hbase:meta, region=1588230740; rit=OFFLINE, location=null; 
> forceNewPlan=false, retain=false
> 2018-01-07,21:03:14,831 INFO 
> org.apache.hadoop.hbase.zookeeper.MetaTableLocator: Setting hbase:meta 
> (replicaId=0) location in ZooKeeper as 
> c4-hadoop-tst-st27.bj,38900,1515330173896
> 2018-01-07,21:03:14,841 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=hbase:meta, region=1588230740; rit=OPENING, 
> location=c4-hadoop-tst-st27.bj,38900,1515330173896
> 2018-01-07,21:03:14,992 INFO 
> org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Using 
> procedure batch rpc execution for 
> serverName=c4-hadoop-tst-st27.bj,38900,1515330173896 version=3145728
> 2018-01-07,21:03:15,593 ERROR 
> org.apache.hadoop.hbase.client.AsyncRequestFutureImpl: Cannot get replica 0 
> location for 
> {"totalColumns":1,"row":"hbase:meta","families":{"table":[{"qualifier":"state","vlen":2,"tag":[],"timestamp":1515330195514}]},"ts":1515330195514}
> 2018-01-07,21:03:15,594 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: 
> Retryable error trying to transition: pid=2, ppid=1, 
> state=RUNNABLE:REGION_TRANSITION_FINISH; AssignProcedure table=hbase:meta, 
> region=1588230740; rit=OPEN, 
> location=c4-hadoop-tst-st27.bj,38900,1515330173896
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: IOException: 1 time, servers with issues: null
> at 
> org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
> at 
> org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1250)
> at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:457)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:570)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.put(MetaTableAccessor.java:1450)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.putToMetaTable(MetaTableAccessor.java:1439)
>  

[jira] [Commented] (HBASE-19914) Refactor TestVisibilityLabelsOnNewVersionBehaviorTable

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351757#comment-16351757
 ] 

Hudson commented on HBASE-19914:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4524 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4524/])
HBASE-19914 Refactor TestVisibilityLabelsOnNewVersionBehaviorTable (zhangduo: 
rev 2e1ec3d3d8dc4ef771463decb814e7c118523bf9)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDeletes.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ColumnFamilyDescriptorBuilder.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsOnNewVersionBehaviorTable.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithDefaultVisLabelService.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/VisibilityLabelsWithDeletesTestBase.java


> Refactor TestVisibilityLabelsOnNewVersionBehaviorTable
> --
>
> Key: HBASE-19914
> URL: https://issues.apache.org/jira/browse/HBASE-19914
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19914-v1.patch, HBASE-19914-v2.patch, 
> HBASE-19914.patch, HBASE-19914.patch
>
>
> Both TestVisibilityLabelsOnNewVersionBehaviorTable and its parent class 
> run for about 2 minutes, so it is not safe to declare them as MediumTests.





[jira] [Commented] (HBASE-19133) Transfer big cells or upserted/appended cells into MSLAB upon flattening to CellChunkMap

2018-02-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351766#comment-16351766
 ] 

Anastasia Braginsky commented on HBASE-19133:
-

Hi [~yuzhih...@gmail.com], 

Thanks for your comments. Regarding your first comment, indeed there is a 
problem when performing a merge on a list of segments where flattening of big 
cells should still happen. Thank you for drawing our attention to this point. 
However, forceCopyOfBigCellInto() should still happen when it is 
needed. Therefore the solution is not to check the type of MSLAB and eliminate 
the call, but to proceed with forceCopyOfBigCellInto() on the 
ImmutableMemStoreLAB as well. We will come up with the fix very soon in a 
separate JIRA. Gali will proceed with the fix.
{quote}bq. maybeCloneWithAllocator() should check whether clone is supported by 
this.memStoreLAB. If not, it just returns the Cell. 
{quote}
This is wrong. In our case, when a big cell is encountered, the clone should 
happen anyway unless MSLAB is generally undefined. The purpose of 
forceCopyOfBigCellInto() is to clone the cell despite the possible limitations 
of maxAlloc or chunkSize.

> Transfer big cells or upserted/appended cells into MSLAB upon flattening to 
> CellChunkMap
> 
>
> Key: HBASE-19133
> URL: https://issues.apache.org/jira/browse/HBASE-19133
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Gali Sheffi
>Priority: Major
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19133-V01.patch, HBASE-19133-V02.patch, 
> HBASE-19133-V03.patch, HBASE-19133.01.patch, HBASE-19133.02.patch, 
> HBASE-19133.03.patch, HBASE-19133.04.patch, HBASE-19133.05.patch, 
> HBASE-19133.06.patch, HBASE-19133.07.patch, HBASE-19133.08.patch, 
> HBASE-19133.09.patch, HBASE-19133.10.patch, HBASE-19133.11.patch
>
>
> CellChunkMap Segment index requires all cell data to be written in the MSLAB 
> Chunks. Even though MSLAB is enabled, cells bigger than chunk size or 
> upserted/incremented/appended cells are still allocated on the JVM heap. If 
> such cells are found in the process of flattening into CellChunkMap 
> (in-memory-flush) they need to be copied into MSLAB.





[jira] [Commented] (HBASE-19658) Fix and reenable TestCompactingToCellFlatMapMemStore#testFlatteningToJumboCellChunkMap

2018-02-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351767#comment-16351767
 ] 

Anastasia Braginsky commented on HBASE-19658:
-

+1 from my side. Since it looks like we all agreed on the fix and we have two 
+1s, I'll commit this patch.

> Fix and reenable 
> TestCompactingToCellFlatMapMemStore#testFlatteningToJumboCellChunkMap
> --
>
> Key: HBASE-19658
> URL: https://issues.apache.org/jira/browse/HBASE-19658
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-beta-1
>Reporter: stack
>Assignee: Anastasia Braginsky
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19658-V01.patch, HBASE-19658-V02.patch, 
> HBASE-19658-V03.patch, HBASE-19658-V04.patch, HBASE-19658-V05.patch, 
> HBASE-19658.09.patch, HBASE-19658.09.patch, HBASE-19658.8.patch, 
> HBASE-19658.8.patch, HBASE-19658.0007.patch, HBASE-19658.006.patch, 
> HBASE-19658.05.patch, 
> org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore-output.txt
>
>
> testFlatteningToJumboCellChunkMap was disabled so HBASE-19282 could be committed on 
> branch-2. This test is failing reliably. Assigned to [~anastas]. This issue 
> is about fixing the failing test and reenabling it in time for beta-2. Thanks 
> A.





[jira] [Commented] (HBASE-19133) Transfer big cells or upserted/appended cells into MSLAB upon flattening to CellChunkMap

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351771#comment-16351771
 ] 

Ted Yu commented on HBASE-19133:


Can you describe the fix in more detail?
Are you going to change the type of segment that the merge operation produces?



[jira] [Created] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Gali Sheffi (JIRA)
Gali Sheffi created HBASE-19930:
---

 Summary: fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
 Key: HBASE-19930
 URL: https://issues.apache.org/jira/browse/HBASE-19930
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0-beta-1
Reporter: Gali Sheffi
Assignee: Gali Sheffi


This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto. This 
method only throws an IllegalStateException, instead of forcing the copy as it 
is supposed to do.





[jira] [Commented] (HBASE-19133) Transfer big cells or upserted/appended cells into MSLAB upon flattening to CellChunkMap

2018-02-04 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351772#comment-16351772
 ] 

Anastasia Braginsky commented on HBASE-19133:
-

There is going to be a new JIRA where the fix will be described. In short, 
as I said, when forceCopyOfBigCellInto() is requested on an 
ImmutableMemStoreLAB, it is going to delegate to forceCopyOfBigCellInto() on 
the first MSLABImpl in the linked list.
{quote}Are you going to change the type of segment that the merge operation produces?
{quote}
No
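
The delegation described above can be sketched roughly as follows. This is not the actual patch: the `Lab` interface, `SimpleLab` and `ImmutableLab` are hypothetical, heavily simplified stand-ins for MemStoreLAB, MSLABImpl and ImmutableMemStoreLAB, cells are plain byte arrays, and the behavior of `copyCellInto` on the immutable wrapper is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

class ImmutableLabSketch {
    /** Hypothetical, simplified stand-in for MemStoreLAB. */
    interface Lab {
        /** Regular copy; returns null when the cell exceeds the allocation limit. */
        byte[] copyCellInto(byte[] cell);

        /** Forced copy; must copy the cell regardless of its size. */
        byte[] forceCopyOfBigCellInto(byte[] cell);
    }

    /** Stand-in for a mutable LAB with a maximum allocation size. */
    static final class SimpleLab implements Lab {
        final int maxAlloc;
        int forcedCopies = 0;

        SimpleLab(int maxAlloc) {
            this.maxAlloc = maxAlloc;
        }

        @Override
        public byte[] copyCellInto(byte[] cell) {
            return cell.length > maxAlloc ? null : cell.clone();
        }

        @Override
        public byte[] forceCopyOfBigCellInto(byte[] cell) {
            forcedCopies++;
            return cell.clone();
        }
    }

    /** Stand-in for the immutable wrapper over the LABs of merged segments. */
    static final class ImmutableLab implements Lab {
        private final List<Lab> labs;

        ImmutableLab(List<Lab> labs) {
            this.labs = labs;
        }

        @Override
        public byte[] copyCellInto(byte[] cell) {
            // Assumption for this sketch: regular copies are not allowed here.
            throw new IllegalStateException("immutable LAB");
        }

        @Override
        public byte[] forceCopyOfBigCellInto(byte[] cell) {
            // Instead of throwing, delegate to the first underlying LAB.
            return labs.get(0).forceCopyOfBigCellInto(cell);
        }
    }

    public static void main(String[] args) {
        SimpleLab first = new SimpleLab(4);
        SimpleLab second = new SimpleLab(4);
        List<Lab> labs = Arrays.asList(first, second);
        Lab merged = new ImmutableLab(labs);

        byte[] big = new byte[16];
        byte[] copy = merged.forceCopyOfBigCellInto(big);
        System.out.println(copy != big && copy.length == big.length); // true
        System.out.println(first.forcedCopies + " " + second.forcedCopies); // 1 0
    }
}
```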



[jira] [Updated] (HBASE-19905) ReplicationSyncUp tool will not exit if a peer replication is disabled

2018-02-04 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-19905:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-2
   1.5.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Thanks Ted for the review.

Pushed to branch-1+

> ReplicationSyncUp tool will not exit if a peer replication is disabled
> --
>
> Key: HBASE-19905
> URL: https://issues.apache.org/jira/browse/HBASE-19905
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.0.0-beta-2
>
> Attachments: HBASE-19905.patch
>
>
> In our test cluster we had two peer clusters, one of which had 
> replication disabled. When we used the ReplicationSyncUp tool to replicate 
> the data to the peer clusters, the tool replicated the data to the enabled 
> peer cluster but kept retrying to replicate the data to the disabled peer 
> cluster, and hence it never terminated. 





[jira] [Updated] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Gali Sheffi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gali Sheffi updated HBASE-19930:

Attachment: HBASE-19930-V01.patch

> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto. This 
> method only throws an IllegalStateException, instead of forcing the copy as 
> it is supposed to do.





[jira] [Work started] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Gali Sheffi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-19930 started by Gali Sheffi.
---


[jira] [Updated] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Gali Sheffi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gali Sheffi updated HBASE-19930:

Status: Patch Available  (was: In Progress)



[jira] [Updated] (HBASE-19506) Support variable sized chunks from ChunkCreator

2018-02-04 Thread Gali Sheffi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gali Sheffi updated HBASE-19506:

Status: Patch Available  (was: Open)

> Support variable sized chunks from ChunkCreator
> ---
>
> Key: HBASE-19506
> URL: https://issues.apache.org/jira/browse/HBASE-19506
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Priority: Major
> Attachments: HBASE-19506-V01.patch
>
>
> When a CellChunkMap is created, it allocates a special index chunk (or chunks) 
> where an array of cell-representations is stored. When the number of 
> cell-representations is small, it is preferable to allocate a chunk smaller 
> than the default size, which is 2MB.
> On the other hand, those "non-standard size" chunks cannot be used in the pool. 
> On-demand allocations off-heap are costly. So this JIRA is about 
> investigating the trade-off between memory usage and the final performance. 





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li commented on HBASE-19917:
--

Thanks for your comment [~yuzhih...@gmail.com]!
{{filterServers()}} is only called in 
{{RSGroupBasedLoadBalancer#filterOfflineServers()}}, as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
{{RSGroupInfo#getServers()}} returns servers, a SortedSet. It is a TreeSet 
actually, built by its constructor.

Given a TreeSet, there are 2 ways: (Let's say when calling {{filterServers()}}, 
size of servers is n and size of onlineServers is m)
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as the following 2 steps are included:
## Construct a HashSet from the TreeSet. It is O(n) for time complexity (if I 
get it correctly) as it needs to iterate over the TreeSet
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).
I think #1 is good enough, although it is worse than #2 which is linear. What 
is your opinion?

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, servers as well as tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking if HashSet could be 
used to replace TreeSet throughout the calling chain.
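
The two options above can be sketched as follows. This is a minimal illustration rather than the proposed patch: `Server` is a hypothetical stand-in for HBase's ServerName, addresses are plain strings instead of Address objects, and the method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class FilterServersSketch {
    /** Hypothetical stand-in for ServerName: an address plus a start code. */
    static final class Server {
        final String address;
        final long startCode;

        Server(String address, long startCode) {
            this.address = address;
            this.startCode = startCode;
        }

        String getAddress() {
            return address;
        }

        @Override
        public String toString() {
            return address + "," + startCode;
        }
    }

    /** Option 1: probe the set as given. With a TreeSet this costs O(m * log n). */
    static List<Server> filterWithGivenSet(Set<String> servers, Collection<Server> onlineServers) {
        List<Server> result = new ArrayList<>();
        for (Server online : onlineServers) {
            if (servers.contains(online.getAddress())) {
                result.add(online);
            }
        }
        return result;
    }

    /** Option 2: copy into a HashSet in O(n), then probe in expected O(1) each, O(m + n) overall. */
    static List<Server> filterWithHashSet(Collection<String> servers, Collection<Server> onlineServers) {
        return filterWithGivenSet(new HashSet<>(servers), onlineServers);
    }

    public static void main(String[] args) {
        Set<String> groupServers = new TreeSet<>(Arrays.asList("rs1:16020", "rs2:16020", "rs3:16020"));
        List<Server> online = Arrays.asList(
            new Server("rs3:16020", 3L), new Server("rs4:16020", 4L), new Server("rs1:16020", 1L));

        System.out.println(filterWithGivenSet(groupServers, online)); // [rs3:16020,3, rs1:16020,1]
        System.out.println(filterWithHashSet(groupServers, online));  // [rs3:16020,3, rs1:16020,1]
    }
}
```

Note that both variants return the matches in onlineServers order, whereas the existing nested loop returns them in servers order. Whether callers depend on that order would need to be checked.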



> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers. The 
> current implementation has time complexity O(m * n) (2 nested loops); it 
> could be O(m + n) if a HashSet is used. The trade-off is increased space 
> complexity.
> Another point which could be improved: filterServers() is only called in 
> filterOfflineServers(). filterOfflineServers calls filterServers(Set, List), 
> so it seems the current filterServers(Collection, Collection) could be 
> improved accordingly.





[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:38 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
{{filterServers()}} is only called in 
{{RSGroupBasedLoadBalancer#filterOfflineServers()}}, as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
{{RSGroupInfo#getServers()}} returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when {{filterServers()}} is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get {noformat}O(1){noformat} for 
contains(). Time complexity is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O(n) (if I understand 
correctly), as it needs to iterate over the TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).
I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?
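
As a rough illustration of option #1, the sketch below probes the existing sorted set once per online server, so no extra set is allocated. The address is modelled as a "host:port" string and ServerName is a simplified stand-in; both are assumptions made for this example, not the HBase types:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

class TreeSetFilterSketch {

  // Hypothetical stand-in for a server name; the address is a "host:port" string.
  static final class ServerName {
    final String address;
    final long startCode;

    ServerName(String address, long startCode) {
      this.address = address;
      this.startCode = startCode;
    }

    String getAddress() {
      return address;
    }
  }

  // Option #1: keep the sorted set and look each online server up in it.
  static List<ServerName> filterServers(SortedSet<String> servers,
      List<ServerName> onlineServers) {
    List<ServerName> finalList = new ArrayList<>();
    for (ServerName curr : onlineServers) {        // m iterations
      if (servers.contains(curr.getAddress())) {   // O(log n) when servers is a TreeSet
        finalList.add(curr);
      }
    }
    return finalList;
  }
}
```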

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.









[jira] [Commented] (HBASE-19133) Transfer big cells or upserted/appended cells into MSLAB upon flattening to CellChunkMap

2018-02-04 Thread Gali Sheffi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351785#comment-16351785
 ] 

Gali Sheffi commented on HBASE-19133:
-

[~yuzhih...@gmail.com] - the new JIRA is HBASE-19930

> Transfer big cells or upserted/appended cells into MSLAB upon flattening to 
> CellChunkMap
> 
>
> Key: HBASE-19133
> URL: https://issues.apache.org/jira/browse/HBASE-19133
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Gali Sheffi
>Priority: Major
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19133-V01.patch, HBASE-19133-V02.patch, 
> HBASE-19133-V03.patch, HBASE-19133.01.patch, HBASE-19133.02.patch, 
> HBASE-19133.03.patch, HBASE-19133.04.patch, HBASE-19133.05.patch, 
> HBASE-19133.06.patch, HBASE-19133.07.patch, HBASE-19133.08.patch, 
> HBASE-19133.09.patch, HBASE-19133.10.patch, HBASE-19133.11.patch
>
>
> CellChunkMap Segment index requires all cell data to be written in the MSLAB 
> Chunks. Even though MSLAB is enabled, cells bigger than chunk size or 
> upserted/incremented/appended cells are still allocated on the JVM heap. If 
> such cells are found in the process of flattening into CellChunkMap 
> (in-memory-flush) they need to be copied into MSLAB.
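
The copy step can be pictured with plain java.nio buffers. This is only a toy sketch: copyToChunk, the use of a ByteBuffer to stand for a chunk, and the choice to give an oversized cell a dedicated buffer of its own size are all assumptions for illustration, not the HBase MSLAB API and not necessarily what the patch does:

```java
import java.nio.ByteBuffer;

class FlattenCopySketch {

  // Copies an on-heap cell into chunk-backed memory and returns the buffer
  // that now holds it.
  static ByteBuffer copyToChunk(byte[] heapCell, ByteBuffer currentChunk) {
    ByteBuffer target = currentChunk;
    if (heapCell.length > currentChunk.remaining()) {
      // Assumption: a cell that does not fit gets a dedicated buffer sized to the cell.
      target = ByteBuffer.allocate(heapCell.length);
    }
    target.put(heapCell); // advances target's position by heapCell.length
    return target;
  }
}
```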





[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:39 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
{{filterServers()}} is only called in 
{{RSGroupBasedLoadBalancer#filterOfflineServers()}}, as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
{{RSGroupInfo#getServers()}} returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when {{filterServers()}} is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O( n ) (if I understand 
correctly), as it needs to iterate over the TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).
I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.









[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:39 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
{{filterServers()}} is only called in 
{{RSGroupBasedLoadBalancer#filterOfflineServers()}}, as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
{{RSGroupInfo#getServers()}} returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when {{filterServers()}} is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O( n ) (if I understand 
correctly), as it needs to iterate over the TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).

I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?
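
To put the three bounds side by side, here is a small worked example. The sizes n = m = 1024 are chosen purely for illustration and do not come from any measured cluster:

```latex
\begin{aligned}
\text{nested loops:}\quad & m \cdot n = 1024 \cdot 1024 = 1{,}048{,}576 \\
\text{TreeSet lookups (option 1):}\quad & m \cdot \log_2 n = 1024 \cdot 10 = 10{,}240 \\
\text{HashSet build plus lookups (option 2):}\quad & n + m = 1024 + 1024 = 2{,}048
\end{aligned}
```

These counts ignore constant factors such as hashing cost and the extra allocation in option 2, so they only indicate orders of magnitude.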

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.









[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:39 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
{{filterServers()}} is only called in 
{{RSGroupBasedLoadBalancer#filterOfflineServers()}}, as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
{{RSGroupInfo#getServers()}} returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when {{filterServers()}} is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O(n) (if I understand 
correctly), as it needs to iterate over the TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).
I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.









[jira] [Commented] (HBASE-19506) Support variable sized chunks from ChunkCreator

2018-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351788#comment-16351788
 ] 

Hadoop QA commented on HBASE-19506:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} | {color:red} HBASE-19506 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908827/HBASE-19506-V01.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11382/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Support variable sized chunks from ChunkCreator
> ---
>
> Key: HBASE-19506
> URL: https://issues.apache.org/jira/browse/HBASE-19506
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Priority: Major
> Attachments: HBASE-19506-V01.patch
>
>
> When CellChunkMap is created it allocates a special index chunk (or chunks) 
> where the array of cell-representations is stored. When the number of 
> cell-representations is small, it is preferable to allocate a chunk smaller 
> than the default size, which is 2MB.
> On the other hand, those "non-standard size" chunks cannot be used in the 
> pool, and on-demand off-heap allocations are costly. So this JIRA is to 
> investigate the trade-off between memory usage and final performance. 
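
The sizing decision under discussion can be sketched as below. The class and method names and the cellRepSize parameter are hypothetical, introduced only for this example; the 2MB default is the one figure taken from the description:

```java
class IndexChunkSizeSketch {
  // Default chunk size from the description above: 2MB.
  static final int DEFAULT_CHUNK_SIZE = 2 * 1024 * 1024;

  // Size to allocate for an index chunk holding numCells cell-representations
  // of cellRepSize bytes each, capped at the default chunk size.
  static int indexChunkSize(int numCells, int cellRepSize) {
    long needed = (long) numCells * cellRepSize; // long avoids int overflow
    return (int) Math.min(needed, DEFAULT_CHUNK_SIZE);
  }

  // In this sketch only default-sized chunks are eligible for pooling;
  // anything smaller would be allocated on demand.
  static boolean isPoolable(int chunkSize) {
    return chunkSize == DEFAULT_CHUNK_SIZE;
  }
}
```

A smaller chunk saves memory for small indexes but gives up pooling, which is the trade-off the issue sets out to measure.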





[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:41 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
filterServers() is only called in 
RSGroupBasedLoadBalancer#filterOfflineServers(), as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
RSGroupInfo#getServers() returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when filterServers() is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O( n ) (if I understand 
correctly), as it needs to iterate over the TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).

I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.








[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:43 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
filterServers() is only called in 
RSGroupBasedLoadBalancer#filterOfflineServers(), as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
RSGroupInfo#getServers() returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when filterServers() is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time complexity 
is O(m + n), as it includes the following 2 steps:
## Construct a HashSet from the TreeSet. This is O( n ) (if I understand 
correctly), as it needs to iterate over the whole TreeSet.
## Calculate the intersection of servers and onlineServers. The time complexity 
is m * O(1).

I think #1 is good enough, although it is slower than #2, which is linear. What 
is your opinion?

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, both servers and tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet servers;
// Keep tables sorted too.
private final SortedSet tables;
{code}
TreeSet is only used for display purposes. I am checking whether HashSet could 
replace TreeSet throughout the calling chain.







[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351784#comment-16351784
 ] 

Xiang Li edited comment on HBASE-19917 at 2/4/18 1:43 PM:
--

Thanks for your comment [~yuzhih...@gmail.com]!
filterServers() is only called in 
RSGroupBasedLoadBalancer#filterOfflineServers(), as follows:
{code}
return filterServers(RSGroupInfo.getServers(), onlineServers);
{code}
RSGroupInfo#getServers() returns servers, a SortedSet. It is actually a 
TreeSet, built by the constructor.

Given a TreeSet, there are 2 options (say that, when filterServers() is 
called, the size of servers is n and the size of onlineServers is m):
# Keep using the TreeSet. Time complexity is O(m * log n), because 
TreeSet#contains() is O(log n) and we loop m times.
# Turn the TreeSet into a HashSet, to get O(1) for contains(). Time 
complexity is O(m + n), as the following 2 steps are included:
## Construct a HashSet from the TreeSet. This is O(n) in time (if I 
understand it correctly), as it needs to iterate over the whole TreeSet.
## Calculate the intersection of servers and onlineServers. The time 
complexity is m * O(1) = O(m).

I think #1 is good enough, although it is worse than #2, which is linear. 
What is your opinion?
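
The two options above can be sketched as follows. This is only an illustration under stated assumptions, not the HBase code: Address and ServerName are replaced by plain strings, a server name is assumed to look like "host:port,startcode", and the class name, the addressOf() helper and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class FilterServersSketch {
  // Hypothetical encoding: a server name is "host:port,startcode"; its address is "host:port".
  static String addressOf(String serverName) {
    return serverName.substring(0, serverName.indexOf(','));
  }

  // Option #1: look up each online server in the sorted set, O(m * log n) overall.
  static List<String> filterWithSortedSet(SortedSet<String> servers,
      Collection<String> onlineServers) {
    List<String> result = new ArrayList<>();
    for (String online : onlineServers) {
      if (servers.contains(addressOf(online))) {
        result.add(online);
      }
    }
    return result;
  }

  // Option #2: copy into a HashSet first (O(n)), then do m expected-O(1) lookups, O(m + n) overall.
  static List<String> filterWithHashSet(SortedSet<String> servers,
      Collection<String> onlineServers) {
    Set<String> lookup = new HashSet<>(servers);
    List<String> result = new ArrayList<>();
    for (String online : onlineServers) {
      if (lookup.contains(addressOf(online))) {
        result.add(online);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    SortedSet<String> servers = new TreeSet<>(Arrays.asList("rs1:16020", "rs2:16020"));
    List<String> online = Arrays.asList("rs2:16020,1", "rs3:16020,2", "rs1:16020,3");
    System.out.println(filterWithSortedSet(servers, online));
    System.out.println(filterWithHashSet(servers, online));
  }
}
```

Both variants return the matches in the iteration order of onlineServers, whereas the nested-loop version orders them by servers; whether callers rely on that ordering would need to be checked.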

Regarding
bq. If possible, we should change those to using HashSet.
In RSGroupInfo, servers as well as tables are TreeSets. According to the 
comments, 
{code}
// Keep servers in a sorted set so has an expected ordering when displayed.
private final SortedSet<Address> servers;
// Keep tables sorted too.
private final SortedSet<TableName> tables;
{code}
the TreeSet is only used for display purposes. I am checking whether HashSet 
could replace TreeSet throughout the calling chain.





> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers() returns the intersection of servers and onlineServers, that 
> is, the online servers whose address is in servers. The current 
> implementation has O(m * n) time complexity (two nested loops); it could be 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point that could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls filterServers(Set, List), so the current 
> filterServers(Collection, Collection) signature could probably be improved.





[jira] [Updated] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Gali Sheffi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gali Sheffi updated HBASE-19930:

Description: 
This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.

Following a comment in HBASE-19133 regarding a bug in 
ImmutableMemStoreLAB#forceCopyOfBigCellInto (the method assumed it would never 
be called on an ImmutableMemStoreLAB and simply threw an 
IllegalStateException whenever called), the forceCopyOfBigCellInto method now 
performs the copy of big cells on the first MSLABImpl in its mslabs linked 
list.

  was:This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto. 
This method only throws an IllegalStateException, instead of forcing the copy 
as it is supposed to do.


> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.
> Following a comment in HBASE-19133 regarding a bug in 
> ImmutableMemStoreLAB#forceCopyOfBigCellInto (the method assumed it would 
> never be called on an ImmutableMemStoreLAB and simply threw an 
> IllegalStateException whenever called), the forceCopyOfBigCellInto method now 
> performs the copy of big cells on the first MSLABImpl in its mslabs linked 
> list.
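
The delegation described in this issue can be pictured with a small stand-alone sketch. Everything in it is hypothetical and only mirrors the idea: Lab, SimpleLab and ImmutableCompositeLab are not the HBase classes, a cell is modeled as a byte array, and the composite is assumed to hold at least one lab.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompositeLabSketch {
  // Hypothetical stand-in for an allocation buffer that can copy a cell into its own storage.
  interface Lab {
    byte[] forceCopyOfBigCellInto(byte[] cell);
  }

  // Hypothetical simple lab: copies the cell and counts how many copies it has made.
  static class SimpleLab implements Lab {
    int copies = 0;

    @Override
    public byte[] forceCopyOfBigCellInto(byte[] cell) {
      copies++;
      return Arrays.copyOf(cell, cell.length);
    }
  }

  // Hypothetical immutable composite over several labs (assumed non-empty).
  static class ImmutableCompositeLab implements Lab {
    private final List<Lab> labs;

    ImmutableCompositeLab(List<? extends Lab> labs) {
      this.labs = new ArrayList<>(labs);
    }

    @Override
    public byte[] forceCopyOfBigCellInto(byte[] cell) {
      // Delegate to the first underlying lab instead of throwing IllegalStateException.
      return labs.get(0).forceCopyOfBigCellInto(cell);
    }
  }

  public static void main(String[] args) {
    SimpleLab first = new SimpleLab();
    SimpleLab second = new SimpleLab();
    Lab composite = new ImmutableCompositeLab(Arrays.asList(first, second));
    byte[] copy = composite.forceCopyOfBigCellInto(new byte[] {1, 2, 3});
    System.out.println(copy.length + " bytes copied; first lab copies = " + first.copies);
  }
}
```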





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351806#comment-16351806
 ] 

Ted Yu commented on HBASE-19917:


I also think #1 is good enough performance-wise. 
You can keep the TreeSet.

Thanks

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers() returns the intersection of servers and onlineServers, that 
> is, the online servers whose address is in servers. The current 
> implementation has O(m * n) time complexity (two nested loops); it could be 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point that could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls filterServers(Set, List), so the current 
> filterServers(Collection, Collection) signature could probably be improved.





[jira] [Commented] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351813#comment-16351813
 ] 

Ted Yu commented on HBASE-19930:


Can you add a test?

Thanks

> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.
> Following a comment in HBASE-19133 regarding a bug in 
> ImmutableMemStoreLAB#forceCopyOfBigCellInto (the method assumed it would 
> never be called on an ImmutableMemStoreLAB and simply threw an 
> IllegalStateException whenever called), the forceCopyOfBigCellInto method now 
> performs the copy of big cells on the first MSLABImpl in its mslabs linked 
> list.





[jira] [Commented] (HBASE-19926) Use a separated class to implement the WALActionListener for Replication

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351819#comment-16351819
 ] 

Hudson commented on HBASE-19926:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4525 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4525/])
HBASE-19926 Use a separated class to implement the WALActionListener for 
(zhangduo: rev 14420e1b415cd468f652bf0137bda575e0a5980a)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* (add) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALActionListener.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> Use a separated class to implement the WALActionListener for Replication
> 
>
> Key: HBASE-19926
> URL: https://issues.apache.org/jira/browse/HBASE-19926
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19926-v1.patch, HBASE-19926.patch
>
>






[jira] [Commented] (HBASE-19658) Fix and reenable TestCompactingToCellFlatMapMemStore#testFlatteningToJumboCellChunkMap

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351820#comment-16351820
 ] 

Hudson commented on HBASE-19658:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4525 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4525/])
HBASE-19658 make the test testFlatteningToJumboCellChunkMap() stable, by 
(anastas: rev 170ffbba683217bdb30e5c99f0e728e0dc660d56)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactingToCellFlatMapMemStore.java


> Fix and reenable 
> TestCompactingToCellFlatMapMemStore#testFlatteningToJumboCellChunkMap
> --
>
> Key: HBASE-19658
> URL: https://issues.apache.org/jira/browse/HBASE-19658
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-beta-1
>Reporter: stack
>Assignee: Anastasia Braginsky
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19658-V01.patch, HBASE-19658-V02.patch, 
> HBASE-19658-V03.patch, HBASE-19658-V04.patch, HBASE-19658-V05.patch, 
> HBASE-19658.09.patch, HBASE-19658.09.patch, HBASE-19658.8.patch, 
> HBASE-19658.8.patch, HBASE-19658.0007.patch, HBASE-19658.006.patch, 
> HBASE-19658.05.patch, 
> org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore-output.txt
>
>
> testFlatteningToJumboCellChunkMap was disabled so that HBASE-19282 could be 
> committed on branch-2. This test is failing reliably. Assigned to 
> [~anastas]. This issue is about fixing the failing test and reenabling it in 
> time for beta-2. Thanks A.





[jira] [Commented] (HBASE-19905) ReplicationSyncUp tool will not exit if a peer replication is disabled

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351818#comment-16351818
 ] 

Hudson commented on HBASE-19905:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4525 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4525/])
HBASE-19905 ReplicationSyncUp tool will not exit if a peer replication 
(ashishsinghi: rev 397d34736e63d7661a2f01524f8b302e1309d40f)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> ReplicationSyncUp tool will not exit if a peer replication is disabled
> --
>
> Key: HBASE-19905
> URL: https://issues.apache.org/jira/browse/HBASE-19905
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.0.0-beta-2
>
> Attachments: HBASE-19905.patch
>
>
> In our test cluster we had two peer clusters, and replication was disabled 
> for one of them. When we used the ReplicationSyncUp tool to replicate data 
> to the peer clusters, the tool replicated the data to the enabled peer 
> cluster but kept retrying to replicate to the disabled peer cluster, and 
> hence it never terminated.





[jira] [Commented] (HBASE-19922) ProtobufUtils::PRIMITIVES is unused

2018-02-04 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351821#comment-16351821
 ] 

Chia-Ping Tsai commented on HBASE-19922:


Patch LGTM. Ran the hadoop check with the patch locally: PASS. [~mdrob] WDYT?

> ProtobufUtils::PRIMITIVES is unused
> ---
>
> Key: HBASE-19922
> URL: https://issues.apache.org/jira/browse/HBASE-19922
> Project: HBase
>  Issue Type: Task
>  Components: Protobufs
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.0
>
> Attachments: HBASE-19922.patch
>
>
> It looks like ProtobufUtils::PRIMITIVES is never read in either the shaded or 
> the non-shaded version of the class. Is it safe to remove?
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java#L128
> We populate the map in a static initializer but never read any values from 
> it...





[jira] [Commented] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351834#comment-16351834
 ] 

Hadoop QA commented on HBASE-19930:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 18s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 6m 48s | master passed |
| +1 | compile | 1m 19s | master passed |
| +1 | checkstyle | 1m 36s | master passed |
| +1 | shadedjars | 8m 16s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 58s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 7m 13s | the patch passed |
| +1 | compile | 1m 20s | the patch passed |
| +1 | javac | 1m 20s | the patch passed |
| +1 | checkstyle | 1m 38s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 7m 1s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 30m 14s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | javadoc | 1m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 115m 26s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 177m 14s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19930 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909141/HBASE-19930-V01.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2e2a8e2af9d7 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 170ffbba68 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11381/testReport/ |
| Max. process+thread count | 5057 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11381/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |


This message was automatically generated.



> fix ImmutableMemStoreLAB#fo

[jira] [Commented] (HBASE-19929) Call RS.stop on a session expired RS may hang

2018-02-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351905#comment-16351905
 ] 

stack commented on HBASE-19929:
---

Thanks for the explanation.

> Call RS.stop on a session expired RS may hang
> -
>
> Key: HBASE-19929
> URL: https://issues.apache.org/jira/browse/HBASE-19929
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
>
> See the discussion in HBASE-19927. The problem is that, for a normal stop, we 
> try to close all the regions and wait until they are all closed. But if the 
> RS's session has already expired, the master will start the failover work, 
> which moves the WAL directory, and we will then be stuck writing the flush 
> marker.





[jira] [Created] (HBASE-19931) TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas

2018-02-04 Thread stack (JIRA)
stack created HBASE-19931:
-

 Summary: TestMetaWithReplicas failing 100% of the time in 
testHBaseFsckWithMetaReplicas
 Key: HBASE-19931
 URL: https://issues.apache.org/jira/browse/HBASE-19931
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: stack
 Fix For: 2.0.0-beta-2


Somehow we missed a test that depends on a run of HBCK. It now fails 100% of 
the time because of HBASE-19726 ("Failed to start HMaster due to infinite 
retrying on meta assign"), where we no longer update hbase:meta with the state 
of hbase:meta; rather, hbase:meta's always-ENABLED state is inferred. That 
change broke HBCK here.

So, disable the test and, just in case, add meta as ENABLED to hbck, even 
though hbck as it stands is not meant for hbase2.





[jira] [Updated] (HBASE-19931) TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas

2018-02-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19931:
--
Attachment: HBASE-19931.branch-2.001.patch

> TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas
> --
>
> Key: HBASE-19931
> URL: https://issues.apache.org/jira/browse/HBASE-19931
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19931.branch-2.001.patch
>
>
> Somehow we missed a test that depends on a run of HBCK. It now fails 100% of 
> the time because of HBASE-19726 ("Failed to start HMaster due to infinite 
> retrying on meta assign"), where we no longer update hbase:meta with the 
> state of hbase:meta; rather, hbase:meta's always-ENABLED state is inferred. 
> That change broke HBCK here.
> So, disable the test and, just in case, add meta as ENABLED to hbck, even 
> though hbck as it stands is not meant for hbase2.





[jira] [Resolved] (HBASE-19931) TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas

2018-02-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-19931.
---
Resolution: Fixed

.001 is what I pushed on master and branch-2.

> TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas
> --
>
> Key: HBASE-19931
> URL: https://issues.apache.org/jira/browse/HBASE-19931
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19931.branch-2.001.patch
>
>
> Somehow we missed a test that depends on a run of HBCK. It now fails 100% of 
> the time because of HBASE-19726 ("Failed to start HMaster due to infinite 
> retrying on meta assign"), where we no longer update hbase:meta with the 
> state of hbase:meta; rather, hbase:meta's always-ENABLED state is inferred. 
> That change broke HBCK here.
> So, disable the test and, just in case, add meta as ENABLED to hbck, even 
> though hbck as it stands is not meant for hbase2.





[jira] [Reopened] (HBASE-19931) TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas

2018-02-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-19931:
---

Reopening. We fixed one test. Now it is flaky in another.

org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
  at org.apache.hadoop.hbase.client.TestMetaWithReplicas.shutdownMetaAndDoValidations(TestMetaWithReplicas.java:265)
  at org.apache.hadoop.hbase.client.TestMetaWithReplicas.testShutdownHandling(TestMetaWithReplicas.java:191)



> TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas
> --
>
> Key: HBASE-19931
> URL: https://issues.apache.org/jira/browse/HBASE-19931
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19931.branch-2.001.patch
>
>
> Somehow we missed a test that depends on a run of HBCK. It now fails 100% of 
> the time because of HBASE-19726 ("Failed to start HMaster due to infinite 
> retrying on meta assign"), where we no longer update hbase:meta with the 
> state of hbase:meta; rather, hbase:meta's always-ENABLED state is inferred. 
> That change broke HBCK here.
> So, disable the test and, just in case, add meta as ENABLED to hbck, even 
> though hbck as it stands is not meant for hbase2.





[jira] [Created] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)
Ted Yu created HBASE-19932:
--

 Summary: TestSecureIPC in branch-1 fails with NoSuchMethodError 
against hadoop 3
 Key: HBASE-19932
 URL: https://issues.apache.org/jira/browse/HBASE-19932
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
 Fix For: 1.5.0


The error below can be observed when running the test against hadoop 3:
{code}
org.apache.hadoop.hbase.security.TestSecureIPC  Time elapsed: 1.756 sec  <<< ERROR!
java.lang.NoSuchMethodError: org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.getKadmin()Lorg/apache/kerby/kerberos/kerb/admin/kadmin/local/LocalKadmin;
  at org.apache.hadoop.hbase.security.TestSecureIPC.setUp(TestSecureIPC.java:112)
{code}





[jira] [Commented] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351932#comment-16351932
 ] 

Ted Yu commented on HBASE-19932:


branch-1 uses 1.0.0-RC2 of kerby.
branch-2 uses 1.0.1 of kerby.

Upgrading kerby to 1.0.1 makes the test pass.
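
A NoSuchMethodError like the one in this issue means that the class found on the runtime classpath does not have the method, with that exact return type, that the test was compiled against. A small reflection probe can show what the runtime class actually offers. This is only a diagnostic sketch: MethodProbe and noArgReturnType() are hypothetical names, and the probe makes no claim about which kerby version provides which signature.

```java
class MethodProbe {
  // Returns the return type name of a public no-arg method on the named class,
  // or null if the class cannot be loaded or has no such method.
  static String noArgReturnType(String className, String methodName) {
    try {
      Class<?> clazz = Class.forName(className);
      return clazz.getMethod(methodName).getReturnType().getName();
    } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // Class and method names are taken from the stack trace in this issue.
    String kdc = "org.apache.kerby.kerberos.kerb.server.SimpleKdcServer";
    System.out.println("getKadmin() returns: " + noArgReturnType(kdc, "getKadmin"));
  }
}
```

Run with the same classpath as the failing test, the printed type can be compared with the LocalKadmin type named in the error message.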

> TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3
> ---
>
> Key: HBASE-19932
> URL: https://issues.apache.org/jira/browse/HBASE-19932
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Priority: Major
> Fix For: 1.5.0
>
>
> The error below can be observed when running the test against hadoop 3:
> {code}
> org.apache.hadoop.hbase.security.TestSecureIPC  Time elapsed: 1.756 sec  <<< ERROR!
> java.lang.NoSuchMethodError: org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.getKadmin()Lorg/apache/kerby/kerberos/kerb/admin/kadmin/local/LocalKadmin;
>   at org.apache.hadoop.hbase.security.TestSecureIPC.setUp(TestSecureIPC.java:112)
> {code}





[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-19917:
-
Status: Open  (was: Patch Available)

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers() returns the intersection of servers and onlineServers, that 
> is, the online servers whose address is in servers. The current 
> implementation has O(m * n) time complexity (two nested loops); it could be 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point that could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls filterServers(Set, List), so the current 
> filterServers(Collection, Collection) signature could probably be improved.





[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-19917:
-
Attachment: HBASE-19917.master.001.patch

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers() returns the intersection of servers and onlineServers, that 
> is, the online servers whose address is in servers. The current 
> implementation has O(m * n) time complexity (two nested loops); it could be 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point that could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls filterServers(Set, List), so the current 
> filterServers(Collection, Collection) signature could probably be improved.





[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-19917:
-
Status: Patch Available  (was: Open)

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List filterServers(Collection servers,
> Collection onlineServers) {
>   ArrayList finalList = new ArrayList();
>   for (Address server : servers) {
> for(ServerName curr: onlineServers) {
>   if(curr.getAddress().equals(server)) {
> finalList.add(curr);
>   }
> }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> takes O(m * n) time (two nested loops); it could run in O(m + n) if a HashSet 
> is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List), so the more 
> general filterServers(Collection, Collection) signature looks improvable.
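
A minimal sketch of the HashSet idea from the description, not the attached
patch. Address and ServerName below are simplified stand-ins for the HBase
classes, and FilterServersSketch is a hypothetical name. Copying the wanted
addresses into a hash set once lets each online server be checked in expected
constant time, so the whole pass is O(m + n) with O(m) extra space.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for the HBase Address class (host and port only).
final class Address {
  final String host;
  final int port;

  Address(String host, int port) {
    this.host = host;
    this.port = port;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Address)) {
      return false;
    }
    Address other = (Address) o;
    return host.equals(other.host) && port == other.port;
  }

  @Override
  public int hashCode() {
    return Objects.hash(host, port);
  }
}

// Hypothetical stand-in for the HBase ServerName class.
final class ServerName {
  private final Address address;
  final long startCode;

  ServerName(Address address, long startCode) {
    this.address = address;
    this.startCode = startCode;
  }

  Address getAddress() {
    return address;
  }
}

final class FilterServersSketch {
  // Returns the online servers whose address is contained in servers.
  static List<ServerName> filterServers(Collection<Address> servers,
      Collection<ServerName> onlineServers) {
    Set<Address> wanted = new HashSet<>(servers);   // O(m) time and space
    List<ServerName> finalList = new ArrayList<>();
    for (ServerName curr : onlineServers) {         // n expected O(1) lookups
      if (wanted.contains(curr.getAddress())) {
        finalList.add(curr);
      }
    }
    return finalList;
  }
}
```

One behavioural difference from the nested-loop version: the result follows the
iteration order of onlineServers rather than servers, and a duplicate entry in
servers no longer produces a duplicate in the result.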





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351933#comment-16351933
 ] 

Xiang Li commented on HBASE-19917:
--

Thanks Ted. I uploaded patch 001 to implement #1 (keep using TreeSet and its 
contains())



[jira] [Resolved] (HBASE-19909) TestRegionLocationFinder Timeout

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19909.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Seems to have worked? Resolve.

> TestRegionLocationFinder Timeout
> 
>
> Key: HBASE-19909
> URL: https://issues.apache.org/jira/browse/HBASE-19909
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19909.branch-2.001.patch
>
>
> This test is timing out a bunch in runs since we moved over to the nice new 
> fancy, smancy, timeout thingymajig.
> Similar to HBASE-19908, I see that on Jenkins, the test is making progress 
> but is running at a slower rate.
> This is a 'smalltest' that starts a minicluster with 5 servers creating a 
> table with 26 odd regions.
> On my uncontended machine, it takes 20 seconds to complete the create table. 
> On Jenkins it takes 29 seconds (see 
> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/11303/testReport/org.apache.hadoop.hbase.master.balancer/TestRegionLocationFinder/org_apache_hadoop_hbase_master_balancer_TestRegionLocationFinder/)
>  Small tests are supposed to complete inside 30 seconds.





[jira] [Resolved] (HBASE-19908) TestCoprocessorShortCircuitRPC Timeout....

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19908.
---
   Resolution: Fixed
 Assignee: stack
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-2

Resolve.

> TestCoprocessorShortCircuitRPC Timeout
> --
>
> Key: HBASE-19908
> URL: https://issues.apache.org/jira/browse/HBASE-19908
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19908.master.001.patch
>
>
> Timed out in HBASE-19906.
> Comparing a local run (16 seconds total) to a timed-out run up on Jenkins, I 
> see it takes my local test 5 seconds to get the STOPPED server log line. On 
> Jenkins, in this timed-out test, it takes 30 seconds. The test is still running 
> when it is killed. Let me make it a medium test.
> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/lastCompletedBuild/testReport/org.apache.hadoop.hbase.coprocessor/TestCoprocessorShortCircuitRPC/org_apache_hadoop_hbase_coprocessor_TestCoprocessorShortCircuitRPC/





[jira] [Resolved] (HBASE-19868) TestCoprocessorWhitelistMasterObserver is flakey

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19868.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Resolve.

> TestCoprocessorWhitelistMasterObserver is flakey
> 
>
> Key: HBASE-19868
> URL: https://issues.apache.org/jira/browse/HBASE-19868
> Project: HBase
>  Issue Type: Sub-task
>  Components: flakey, test
>Affects Versions: 2.0.0-beta-1
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19868.branch-2.001.patch, 
> HBASE-19868.master.002.patch
>
>
> TestCoprocessorWhitelistMasterObserver is failing 33% of the time. In the 
> logs it looks like the failure is related to Master initialization.
> Following log is from 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/203] 
> {noformat}
> 2018-01-26 02:36:36,686 WARN [M:0;1f0c4777c1ba:35049] master.TableNamespaceManager(307): Caught exception in initializing namespace table manager
> org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:684)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:562)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
>   at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:73)
>   at org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
>   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:388)
>   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:362)
>   at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:141)
>   at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:281)
>   at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:103)
>   at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
>   at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
>   at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1059)
>   at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
>   at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
>   at java.lang.Thread.run(Thread.java:748)
> 2018-01-26 02:36:36,691 ERROR [M:0;1f0c4777c1ba:35049] helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>   at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
>   at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
>   at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1061)
>   at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:921)
>   at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2034)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x18cd2ac8 closed
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:722)
>   at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:714

[jira] [Resolved] (HBASE-19916) TestCacheOnWrite Times Out

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19916.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Resolve.

> TestCacheOnWrite Times Out
> --
>
> Key: HBASE-19916
> URL: https://issues.apache.org/jira/browse/HBASE-19916
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19916.master.001.patch
>
>
> All day it has been timing out. It's a medium test. There is a bit in the 
> middle where we are hung up for a minute or more:
> 2018-02-01 23:01:02,471 DEBUG [Time-limited test] 
> hfile.HFile$WriterFactory(336): Unable to set drop behind on 
> /testptch/hbase/hbase-server/target/test-data/6a153924-7f81-4008-ac7e-d0e69384655e/data/default/CompactionCacheOnWrite/6dd6ed35f3b6090bd8d04ed21d687424/.tmp/myCF/c0387b09f82840ab9e636faf5cf02d2d
> 2018-02-01 23:01:03,059 DEBUG [Time-limited test] 
> regionserver.HRegionFileSystem(463): Committing store file 
> /testptch/hbase/hbase-server/target/test-data/6a153924-7f81-4008-ac7e-d0e69384655e/data/default/CompactionCacheOnWrite/6dd6ed35f3b6090bd8d04ed21d687424/.tmp/myCF/c0387b09f82840ab9e63
> ...[truncated 1865657 bytes]...
> b663/myCF/61386855036d4facb75ce7eca2059661, entries=15000, sequenceid=1005, 
> filesize=85.3 K
> 2018-02-01 23:03:50,591 INFO  [Time-limited test] regionserver.HRegion(2713): 
> Finished memstore flush of ~1.73 MB/1814000, currentsize=0 B/0 for region 
> CompactionCacheOnWrite,,1517526229883.3b579f93f196a847ed1489e71585b663. in 
> 175ms, sequenceid=1005, compaction requested=false
> 2018-02-01 23:03:50,799 INFO  [Time-limited test] regionserver.HRegion(2517): 
> Flushing 1/1 column families, memstore=1.80 MB
> ...
> I've seen this a few times. The test takes 100 seconds locally.
> Let me try changing it to type. If that doesn't work, will be back.





[jira] [Updated] (HBASE-19914) Refactor TestVisibilityLabelsOnNewVersionBehaviorTable

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19914:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to master and branch-2.

Thanks [~stack] for reviewing.

> Refactor TestVisibilityLabelsOnNewVersionBehaviorTable
> --
>
> Key: HBASE-19914
> URL: https://issues.apache.org/jira/browse/HBASE-19914
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19914-v1.patch, HBASE-19914-v2.patch, 
> HBASE-19914.patch, HBASE-19914.patch
>
>
> Both TestVisibilityLabelsOnNewVersionBehaviorTable and its parent class 
> run for about 2 minutes, so it is not safe to declare them as MediumTests.





[jira] [Assigned] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-19932:
--

Assignee: Ted Yu

> TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3
> ---
>
> Key: HBASE-19932
> URL: https://issues.apache.org/jira/browse/HBASE-19932
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: 19932.branch-1.txt
>
>
> Error below can be observed when running the test against hadoop 3:
> {code}
> org.apache.hadoop.hbase.security.TestSecureIPC  Time elapsed: 1.756 sec  <<< 
> ERROR!
> java.lang.NoSuchMethodError: 
> org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.getKadmin()Lorg/apache/kerby/kerberos/kerb/admin/kadmin/local/LocalKadmin;
>   at 
> org.apache.hadoop.hbase.security.TestSecureIPC.setUp(TestSecureIPC.java:112)
> {code}
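
The descriptor in the error names both the method and its return type
(getKadmin() returning LocalKadmin in org.apache.kerby.kerberos.kerb.admin.kadmin.local).
The JVM links a call by method name plus exact descriptor, so this error means
the SimpleKdcServer class loaded at run time has no getKadmin() with exactly
that return type. A minimal sketch, using only the JDK, of how one might probe
for such a mismatch with reflection. LinkageProbe and declaresNoArg are
hypothetical names, and java.lang.String stands in for the Kerby class so the
sketch runs without extra dependencies.

```java
import java.lang.reflect.Method;

final class LinkageProbe {
  // True if owner exposes a public no-argument method with this name whose
  // return type has exactly the given class name. The return type matters
  // because it is part of the descriptor the JVM uses to link a call.
  static boolean declaresNoArg(Class<?> owner, String name, String returnTypeName) {
    for (Method m : owner.getMethods()) {   // public methods, including inherited ones
      if (m.getName().equals(name)
          && m.getParameterCount() == 0
          && m.getReturnType().getName().equals(returnTypeName)) {
        return true;
      }
    }
    return false;
  }
}
```

Against a real test classpath one would pass the class loaded with
Class.forName("org.apache.kerby.kerberos.kerb.server.SimpleKdcServer"), the
name "getKadmin", and the return type named in the descriptor. A false result
would point at a Kerby version different from the one the test was compiled
against.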





[jira] [Updated] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19932:
---
Attachment: 19932.branch-1.txt



[jira] [Commented] (HBASE-19927) TestFullLogReconstruction flakey

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351941#comment-16351941
 ] 

Duo Zhang commented on HBASE-19927:
---

Let me commit this. Will use HBASE-19929 to address the shutdown problem.

> TestFullLogReconstruction flakey
> 
>
> Key: HBASE-19927
> URL: https://issues.apache.org/jira/browse/HBASE-19927
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19927.patch, js, out
>
>
> Fails pretty frequently in hadoopqa builds.
> There is a recent hang in 
> org.apache.hadoop.hbase.TestFullLogReconstruction.tearDownAfterClass(TestFullLogReconstruction.java:68)
> In here... 
> https://builds.apache.org/job/PreCommit-HBASE-Build/11363/testReport/org.apache.hadoop.hbase/TestFullLogReconstruction/org_apache_hadoop_hbase_TestFullLogReconstruction/
> ... see here.
> Thread 1250 (RS_CLOSE_META-edd281aedb18:59863-0):
>   State: TIMED_WAITING
>   Blocked count: 92
>   Waited count: 278
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:133)
>     org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:718)
>     org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:605)
>     org.apache.hadoop.hbase.regionserver.wal.WALUtil.doFullAppendTransaction(WALUtil.java:154)
>     org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeFlushMarker(WALUtil.java:81)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2645)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2356)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2328)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2319)
>     org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1531)
>     org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1437)
>     org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
>     org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     java.lang.Thread.run(Thread.java:748)
> We missed a signal? We need to do an interrupt? The log is not all there in 
> hadoopqa builds so hard to see all that is going on. This test is not in the 
> flakey set either
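
The dump shows the close handler in TIMED_WAITING inside SyncFuture.get, which
is what prompts the missed-signal question. As a general illustration only
(this is not HBase's SyncFuture), a guarded timed wait re-checks its condition
under the lock and bounds the total wait, so a notification that fired before
the waiter arrived cannot leave it blocked forever. SyncLatchSketch and its
method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

final class SyncLatchSketch {
  private boolean done;

  // Called by the thread that completes the work.
  synchronized void markDone() {
    done = true;
    notifyAll();
  }

  // Waits until markDone() has been called or the timeout elapses.
  // Returns true if done, false on timeout.
  synchronized boolean await(long timeoutMs) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!done) {                       // loop guards against spurious wakeups
      long remainingNs = deadline - System.nanoTime();
      if (remainingNs <= 0) {
        return false;                     // give up instead of waiting forever
      }
      // wait(0) would block indefinitely, so always wait at least 1 ms.
      wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNs)));
    }
    return true;
  }
}
```

Because the flag is checked before every wait, calling markDone() before
await() is harmless. A caller that gets false back can log, retry, or abort the
close rather than hang in teardown.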





[jira] [Updated] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19932:
---
Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351944#comment-16351944
 ] 

Ted Yu commented on HBASE-19917:


lgtm, pending QA



[jira] [Commented] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351946#comment-16351946
 ] 

Duo Zhang commented on HBASE-19932:
---

+1 if pre commit is OK.



[jira] [Resolved] (HBASE-19910) TestBucketCache TimesOut

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19910.
---
   Resolution: Fixed
 Assignee: stack
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-2

Resolve.

> TestBucketCache TimesOut
> 
>
> Key: HBASE-19910
> URL: https://issues.apache.org/jira/browse/HBASE-19910
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 0001-HBASE-19910-TestBucketCache-TimesOut.patch
>
>
> See 
> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/11303/testReport/org.apache.hadoop.hbase.master.balancer/TestRegionLocationFinder/org_apache_hadoop_hbase_master_balancer_TestRegionLocationFinder/
> This is a small test. Runs fast locally. 8 tests. Each is a second or two. Odd 
> though: up on Jenkins, in the middle of one, there is a 19-second 
> pause. See here:
> 2018-02-01 00:56:30,013 INFO  [Time-limited test] util.ByteBufferArray(70): 
> Allocating buffers total=32 MB, sizePerBuffer=2 MB, count=16
> 2018-02-01 00:56:49,678 INFO  [Time-limited test] bucket.BucketCache(279): 
> Instantiating BucketCache with acceptableFactor: 0.95, minFactor: 0.85, 
> extraFreeFactor: 0.1, singleFactor: 0.25, multiFactor: 0.5, memoryFactor: 0.25
> Here is full test run:
> 2018-02-01 00:56:29,981 INFO  [Time-limited test] hbase.ResourceChecker(148): 
> before: io.hfile.bucket.TestBucketCache#testInvalidCacheSplitFactorConfig[1: 
> blockSize=16,384, bucketSizes=[I@20322d26] Thread=77, OpenFileDescriptor=263, 
> MaxFileDescriptor=1048576, SystemLoadAverage=2127, ProcessCount=9, 
> AvailableMemoryMB=7801
> 2018-02-01 00:56:30,013 INFO  [Time-limited test] util.ByteBufferArray(70): 
> Allocating buffers total=32 MB, sizePerBuffer=2 MB, count=16
> 2018-02-01 00:56:49,678 INFO  [Time-limited test] bucket.BucketCache(279): 
> Instantiating BucketCache with acceptableFactor: 0.95, minFactor: 0.85, 
> extraFreeFactor: 0.1, singleFactor: 0.25, multiFactor: 0.5, memoryFactor: 0.25
> 2018-02-01 00:56:49,689 INFO  [Time-limited test] 
> bucket.BucketAllocator(334): Cache totalSize=33288192, buckets=63, bucket 
> capacity=528384=(4*132096)=(FEWEST_ITEMS_IN_BUCKET*(largest configured 
> bucketcache size))
> 2018-02-01 00:56:49,690 INFO  [Time-limited test] bucket.BucketCache(322): 
> Started bucket cache; ioengine=offheap, capacity=32 MB, blockSize=16 KB, 
> writerThreadNum=3, writerQLen=64, persistencePath=null, 
> bucketAllocator=org.apache.hadoop.hbase.io.hfile.bucket.BucketAllocator
> 2018-02-01 00:56:50,020 INFO  [Time-limited test] util.ByteBufferArray(70): 
> Allocating buffers total=32 MB, sizePerBuffer=2 MB, count=16
> 2018-02-01 00:56:50,080 ERROR [Time-limited test] util.ByteBufferArray(101): Buffer creation interrupted
> java.lang.InterruptedException
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at org.apache.hadoop.hbase.util.ByteBufferArray.createBuffers(ByteBufferArray.java:96)
>   at org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:74)
>   at org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:86)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:384)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at org.apache.hadoop.hbase.io.hfile.bucket.TestBucketCache.checkConfigValues(TestBucketCache.java:387)
>   at org.apache.hadoop.hbase.io.hfile.bucket.TestBucketCache.testInvalidCacheSplitFactorConfig(TestBucketCache.java:377)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
>

[jira] [Commented] (HBASE-19927) TestFullLogReconstruction flakey

2018-02-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351950#comment-16351950
 ] 

stack commented on HBASE-19927:
---

+1 go for it



[jira] [Updated] (HBASE-19927) TestFullLogReconstruction flakey

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-19927:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to master and branch-2.

Thanks [~stack] for reviewing.

> TestFullLogReconstruction flakey
> 
>
> Key: HBASE-19927
> URL: https://issues.apache.org/jira/browse/HBASE-19927
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19927.patch, js, out
>
>
> Fails pretty frequently in hadoopqa builds.
> There is a recent hang in 
> org.apache.hadoop.hbase.TestFullLogReconstruction.tearDownAfterClass(TestFullLogReconstruction.java:68)
> In here... 
> https://builds.apache.org/job/PreCommit-HBASE-Build/11363/testReport/org.apache.hadoop.hbase/TestFullLogReconstruction/org_apache_hadoop_hbase_TestFullLogReconstruction/
> ... see here.
> Thread 1250 (RS_CLOSE_META-edd281aedb18:59863-0):
>   State: TIMED_WAITING
>   Blocked count: 92
>   Waited count: 278
>   Stack:
> java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:133)
> 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:718)
> 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:605)
> 
> org.apache.hadoop.hbase.regionserver.wal.WALUtil.doFullAppendTransaction(WALUtil.java:154)
> 
> org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeFlushMarker(WALUtil.java:81)
> 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2645)
> 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2356)
> 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2328)
> 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2319)
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1531)
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1437)
> 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> Did we miss a signal? Do we need to do an interrupt? The log is not all there in 
> hadoopqa builds, so it is hard to see all that is going on. This test is not in the 
> flaky set either.





[jira] [Resolved] (HBASE-19803) False positive for the HBASE-Find-Flaky-Tests job

2018-02-04 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-19803.
---
   Resolution: Fixed
Fix Version/s: 2.0.0-beta-2

Fixed by HBASE-19873.

> False positive for the HBASE-Find-Flaky-Tests job
> -
>
> Key: HBASE-19803
> URL: https://issues.apache.org/jira/browse/HBASE-19803
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 2018-01-24T17-45-37_000-jvmRun1.dumpstream, 
> HBASE-19803.master.001.patch
>
>
> It reports two hangs for TestAsyncTableGetMultiThreaded, but I checked the 
> surefire output
> https://builds.apache.org/job/HBASE-Flaky-Tests/24830/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was likely to be killed in the middle of the run within 20 seconds.
> https://builds.apache.org/job/HBASE-Flaky-Tests/24852/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAsyncTableGetMultiThreaded-output.txt
> This one was also killed within about 1 minute.
> The test is declared as LargeTests, so the time limit should be 10 minutes. It 
> seems that the JVM may crash during the mvn test run; we then kill all the 
> running tests and may mark some of them as hung, which leads to the false 
> positive.





[jira] [Commented] (HBASE-19554) AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351957#comment-16351957
 ] 

Duo Zhang commented on HBASE-19554:
---

Reenable for debugging.

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --
>
> Key: HBASE-19554
> URL: https://issues.apache.org/jira/browse/HBASE-19554
> Project: HBase
>  Issue Type: Sub-task
>  Components: Recovery, wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) 
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is 
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the 
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.





[jira] [Created] (HBASE-19933) Make use of column family level attribute for skipping hfile range check before create reference during split

2018-02-04 Thread Rajeshbabu Chintaguntla (JIRA)
Rajeshbabu Chintaguntla created HBASE-19933:
---

 Summary: Make use of column family level attribute for skipping 
hfile range check before create reference during split
 Key: HBASE-19933
 URL: https://issues.apache.org/jira/browse/HBASE-19933
 Project: HBase
  Issue Type: Bug
Reporter: Rajeshbabu Chintaguntla
Assignee: Rajeshbabu Chintaguntla
 Fix For: 2.0.0-beta-2


Currently we use the split policy to decide whether to skip the store file 
range check at the time of reference creation during a split. But the 
full-fledged split with region reference cannot be used in master. As an 
alternative, we need to make use of a column family attribute, set to true or 
false at the client level, so that the decision is made accordingly.
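
A minimal sketch of that decision, not the HBase implementation. It assumes a hypothetical attribute name SKIP_RANGE_CHECK (the issue does not name the key), models the column family attributes as a plain map, and compares row keys as strings instead of using the HBase descriptor and comparator classes:

```java
import java.util.HashMap;
import java.util.Map;

public class SplitRangeCheckSketch {
  // Hypothetical attribute name; the issue does not specify the real key.
  public static final String SKIP_RANGE_CHECK = "SKIP_RANGE_CHECK";

  /** Returns true when the family attributes ask to skip the range check. */
  public static boolean skipRangeCheck(Map<String, String> familyAttributes) {
    // parseBoolean(null) is false, so a missing attribute keeps the check on.
    return Boolean.parseBoolean(familyAttributes.get(SKIP_RANGE_CHECK));
  }

  /**
   * Decides whether a reference file should be created for one daughter.
   * Keys are modelled as strings compared lexicographically (an assumption).
   */
  public static boolean shouldCreateReference(Map<String, String> familyAttributes,
      String firstKey, String lastKey, String splitRow, boolean top) {
    if (skipRangeCheck(familyAttributes)) {
      return true; // create the reference even if the split row is out of range
    }
    if (top) {
      // Top daughter: no reference if the split row is past the last key.
      return splitRow.compareTo(lastKey) <= 0;
    }
    // Bottom daughter: no reference if the split row is before the first key.
    return splitRow.compareTo(firstKey) >= 0;
  }

  public static void main(String[] args) {
    Map<String, String> attrs = new HashMap<>();
    System.out.println(shouldCreateReference(attrs, "b", "m", "z", true));
    attrs.put(SKIP_RANGE_CHECK, "true");
    System.out.println(shouldCreateReference(attrs, "b", "m", "z", true));
  }
}
```

Reading the flag from the family attributes means the decision travels with the table schema, so code running in master does not need a split policy instance.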





[jira] [Commented] (HBASE-19703) Functionality added as part of HBASE-12583 is not working after moving the split code to master

2018-02-04 Thread Rajeshbabu Chintaguntla (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351960#comment-16351960
 ] 

Rajeshbabu Chintaguntla commented on HBASE-19703:
-

[~stack] Thanks for the patch. Raised HBASE-19933 to support the same 
functionality using a column family attribute. It seems the failed tests are 
not related to this patch.

> Functionality added as part of HBASE-12583 is not working after moving the 
> split code to master
> ---
>
> Key: HBASE-19703
> URL: https://issues.apache.org/jira/browse/HBASE-19703
> Project: HBase
>  Issue Type: Bug
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19703-WIP.patch, HBASE-19703.branch-2.001.patch, 
> HBASE-19703_v2.patch, HBASE-19703_v3.patch, HBASE-19703_v4.patch, 
> HBASE-19703_v5.patch
>
>
> As part of HBASE-12583 we pass the split policy to 
> HRegionFileSystem#splitStoreFile so that reference files can be created 
> even when the split key is out of the HFile key range. This is needed for the 
> Local Indexing implementation in Phoenix. But now, after moving the split code 
> to master, we just pass null for the split policy.
> {noformat}
> final String familyName = Bytes.toString(family);
> final Path path_first =
>     regionFs.splitStoreFile(this.daughter_1_RI, familyName, sf, splitRow, false, null);
> final Path path_second =
>     regionFs.splitStoreFile(this.daughter_2_RI, familyName, sf, splitRow, true, null);
> {noformat}





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351962#comment-16351962
 ] 

Hadoop QA commented on HBASE-19917:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
52s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} hbase-rsgroup: The patch generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
38s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
16s{color} | {color:green} hbase-rsgroup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19917 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909167/HBASE-19917.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b854d93410f1 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ab5a26ad5e |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11383/artifact/patchprocess/diff-checkstyle-hbase-rsgroup.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11383/testReport/ |
| Max. process+thread count | 1533 (vs. ulimit of 1) |
| modules | C: hbase-rsgroup U: hbase-rsgroup |
| Console output | 
http

[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19917:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-2
   Status: Resolved  (was: Patch Available)

Pushed to branch-2 + after fixing long line warning.

Thanks for the patch, Xiang.

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.
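
A minimal sketch of the HashSet approach described above. It is not the committed patch: ServerName here is a small stand-in class and addresses are plain strings (both assumptions), so the example runs without HBase on the classpath:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilterServersSketch {
  /** Stand-in for HBase's ServerName: an address plus a start code. */
  public static final class ServerName {
    private final String address;
    private final long startCode;

    public ServerName(String address, long startCode) {
      this.address = address;
      this.startCode = startCode;
    }

    public String getAddress() {
      return address;
    }

    @Override
    public String toString() {
      return address + "," + startCode;
    }
  }

  /** Returns the online servers whose address is in servers, in O(m + n) time. */
  public static List<ServerName> filterServers(Collection<String> servers,
      Collection<ServerName> onlineServers) {
    // Copying into a HashSet costs O(m) time and space, but makes each
    // membership test below O(1) on average.
    Set<String> wanted = new HashSet<>(servers);
    List<ServerName> finalList = new ArrayList<>();
    for (ServerName curr : onlineServers) {
      if (wanted.contains(curr.getAddress())) {
        finalList.add(curr);
      }
    }
    return finalList;
  }

  public static void main(String[] args) {
    List<ServerName> online = Arrays.asList(
        new ServerName("host-a:16020", 1L),
        new ServerName("host-b:16020", 2L));
    System.out.println(filterServers(Arrays.asList("host-b:16020", "host-x:16020"), online));
  }
}
```

Two differences from the nested-loop version are worth noting: the result follows the iteration order of onlineServers rather than servers, and a duplicate entry in servers no longer yields duplicate results. If the caller already passes a Set, the copy into a HashSet could be avoided.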





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351987#comment-16351987
 ] 

Ted Yu commented on HBASE-19917:


Can you attach a patch for branch-1?

Thanks

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Commented] (HBASE-19920) TokenUtil.obtainToken unnecessarily creates a local directory

2018-02-04 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351992#comment-16351992
 ] 

Francis Liu commented on HBASE-19920:
-

[~mdrob] Rohini and I were talking about this last Friday. As far as I can tell, 
DynamicClassLoader is mainly for supporting custom filters, though there are some 
other cases (e.g. a custom comparator for checkAndPut).

In any case, it sounds reasonable to assume that if a client (not running in a 
regionserver, master, etc.) needs to use a custom filter, coprocessor, etc., it 
would have direct access to the classes (i.e. on the classpath) to use the APIs. 
So it would seem reasonable to assume that we only need to enable 
DynamicClassLoader on clients running in an HBase daemon?

Having said that, the approach you've currently taken sounds fine, as it 
addresses the immediate concern. It is a bit tricky, though, as future code 
changes may make use of ProtobufUtil (sounds like we need to add an IT test to 
avoid a regression).

> TokenUtil.obtainToken unnecessarily creates a local directory
> -
>
> Key: HBASE-19920
> URL: https://issues.apache.org/jira/browse/HBASE-19920
> Project: HBase
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.0
>
> Attachments: HBASE-19920.patch
>
>
> On client code, when one calls TokenUtil.obtainToken it loads ProtobufUtil 
> which in its static block initializes DynamicClassLoader and that creates the 
> directory ${hbase.local.dir}/jars/ and also instantiates a filesystem class 
> to access hbase.dynamic.jars.dir.
> https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/DynamicClassLoader.java#L109-L127
> Since this is region server specific code, we would not expect this to happen 
> when one accesses HBase as a client.
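
The mechanism described above, a static initializer running side effects on first use of a class, can be illustrated with a small sketch. The class names below are hypothetical and this is not the HBase code or the attached patch. The side effect is recorded in a list instead of touching the filesystem, and the lazy variant uses the initialization-on-demand holder idiom as one possible way to defer it:

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitSketch {
  // Records side effects so the demo does not touch the real filesystem.
  public static final List<String> SIDE_EFFECTS = new ArrayList<>();

  /** Eager variant: the first use of the class runs the static block. */
  public static class EagerUtil {
    static {
      SIDE_EFFECTS.add("eager: local jars dir created");
    }

    public static String obtainToken() {
      return "token";
    }
  }

  /** Lazy variant: the side effect runs only when the loader is requested. */
  public static class LazyUtil {
    // The JVM initializes this holder class on first access to LOADER.
    private static class LoaderHolder {
      static final Object LOADER = createLoader();
    }

    private static Object createLoader() {
      SIDE_EFFECTS.add("lazy: local jars dir created");
      return new Object();
    }

    public static String obtainToken() {
      return "token"; // does not touch LoaderHolder
    }

    public static Object loader() {
      return LoaderHolder.LOADER;
    }
  }

  public static void main(String[] args) {
    LazyUtil.obtainToken();
    System.out.println("after LazyUtil.obtainToken(): " + SIDE_EFFECTS);
    EagerUtil.obtainToken();
    System.out.println("after EagerUtil.obtainToken(): " + SIDE_EFFECTS);
    LazyUtil.loader();
    System.out.println("after LazyUtil.loader(): " + SIDE_EFFECTS);
  }
}
```

In the eager variant, calling an unrelated static method is enough to trigger the side effect, which mirrors the report that TokenUtil.obtainToken ends up creating a directory on the client.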





[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351995#comment-16351995
 ] 

Xiang Li edited comment on HBASE-19917 at 2/5/18 2:32 AM:
--

Working on the patch for branch-1. A moment


was (Author: water):
working on the patch for branch-1. A moment

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351995#comment-16351995
 ] 

Xiang Li commented on HBASE-19917:
--

working on the patch for branch-1. A moment

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.master.000.patch, 
> HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Commented] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352011#comment-16352011
 ] 

Hadoop QA commented on HBASE-19932:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
1s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for 
instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} branch-1 passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} branch-1 passed with JDK v1.7.0_171 {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  7m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
4s{color} | {color:green} branch-1 passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
40s{color} | {color:green} branch-1 passed with JDK v1.7.0_171 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed with JDK v1.7.0_171 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  0s{color} | 
{color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
35s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  3m 
26s{color} | {color:red} The patch causes 44 errors with Hadoop v2.4.1. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  4m 
18s{color} | {color:red} The patch causes 44 errors with Hadoop v2.5.2. {color} 
|
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed with JDK v1.7.0_171 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m 10s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRegionServerAbort |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:36a7029 |
| JIRA Issue | HBASE-19932 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909169/19932.branch-1.txt |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  |
|

[jira] [Commented] (HBASE-19932) TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3

2018-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352017#comment-16352017
 ] 

Ted Yu commented on HBASE-19932:


TestRegionServerAbort failure was not related to patch - different profile.

I verified locally that with patch, TestSecureIPC passes against hadoop 3.

> TestSecureIPC in branch-1 fails with NoSuchMethodError against hadoop 3
> ---
>
> Key: HBASE-19932
> URL: https://issues.apache.org/jira/browse/HBASE-19932
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: 19932.branch-1.txt
>
>
> Error below can be observed when running the test against hadoop 3:
> {code}
> org.apache.hadoop.hbase.security.TestSecureIPC  Time elapsed: 1.756 sec  <<< 
> ERROR!
> java.lang.NoSuchMethodError: 
> org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.getKadmin()Lorg/apache/kerby/kerberos/kerb/admin/kadmin/local/LocalKadmin;
>   at 
> org.apache.hadoop.hbase.security.TestSecureIPC.setUp(TestSecureIPC.java:112)
> {code}





[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-19917:
-
Attachment: HBASE-19917.branch-1.000.patch

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.branch-1.000.patch, 
> HBASE-19917.master.000.patch, HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352021#comment-16352021
 ] 

Xiang Li commented on HBASE-19917:
--

Hi [~yuzhih...@gmail.com], I uploaded patch 000 for branch-1. Please review it 
at your earliest convenience.

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.branch-1.000.patch, 
> HBASE-19917.master.000.patch, HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current implementation 
> has O(m * n) time complexity (two nested loops); it could be O(m + n) if a 
> HashSet is used, at the cost of extra space.
> Another point that could be improved: filterServers() is only called from 
> filterOfflineServers(), which calls filterServers(Set, List). The current 
> filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Comment Edited] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352021#comment-16352021
 ] 

Xiang Li edited comment on HBASE-19917 at 2/5/18 4:00 AM:
--

Hi [~yuzhih...@gmail.com], I uploaded patch 000 for branch-1. Please review it 
at your earliest convenience.
The unit tests for hbase-rsgroup passed on my local machine. Running the full 
unit test suite now.


was (Author: water):
Hi [~yuzhih...@gmail.com], I uploaded patch 000 for branch-1. Please review it 
at your earliest convenience.

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19917.branch-1.000.patch, 
> HBASE-19917.master.000.patch, HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List filterServers(Collection servers,
> Collection onlineServers) {
>   ArrayList finalList = new ArrayList();
>   for (Address server : servers) {
> for(ServerName curr: onlineServers) {
>   if(curr.getAddress().equals(server)) {
> finalList.add(curr);
>   }
> }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current 
> implementation runs in O(m * n) time (two nested loops); it could run in 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point which could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls it as filterServers(Set, List). 
> The current filterServers(Collection, Collection) signature could be improved accordingly.
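
The HashSet idea described above can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: the Address and ServerName classes here are hypothetical stand-ins for the HBase types of the same name, and the class name FilterServersSketch is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class FilterServersSketch {

  /** Hypothetical stand-in for HBase's Address (host and port). */
  static final class Address {
    final String host;
    final int port;

    Address(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Address)) {
        return false;
      }
      Address other = (Address) o;
      return port == other.port && host.equals(other.host);
    }

    @Override
    public int hashCode() {
      return Objects.hash(host, port);
    }
  }

  /** Hypothetical stand-in for HBase's ServerName (address plus start code). */
  static final class ServerName {
    final Address address;
    final long startCode;

    ServerName(Address address, long startCode) {
      this.address = address;
      this.startCode = startCode;
    }

    Address getAddress() {
      return address;
    }
  }

  /** Returns the online servers whose address is in servers, in O(m + n) time. */
  static List<ServerName> filterServers(Collection<Address> servers,
      Collection<ServerName> onlineServers) {
    // Build the lookup set once: O(m) time and O(m) extra space.
    Set<Address> wanted = new HashSet<>(servers);
    List<ServerName> finalList = new ArrayList<>();
    // Single pass over the online servers: O(n) expected time.
    for (ServerName curr : onlineServers) {
      if (wanted.contains(curr.getAddress())) {
        finalList.add(curr);
      }
    }
    return finalList;
  }

  public static void main(String[] args) {
    List<Address> group = Arrays.asList(new Address("a", 1), new Address("b", 2));
    List<ServerName> online = Arrays.asList(
        new ServerName(new Address("a", 1), 100L),
        new ServerName(new Address("c", 3), 101L),
        new ServerName(new Address("b", 2), 102L));
    for (ServerName sn : filterServers(group, online)) {
      System.out.println(sn.getAddress().host + ":" + sn.getAddress().port);
    }
  }
}
```

Two behavioral differences from the nested-loop version are worth noting: the result follows the iteration order of onlineServers rather than servers, and a duplicate address in servers no longer produces duplicate entries. If the caller already passes a Set, the copy into a new HashSet can be skipped, which removes the extra space cost.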





[jira] [Updated] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19917:
---
Fix Version/s: 1.4.2

> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19917.branch-1.000.patch, 
> HBASE-19917.master.000.patch, HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current 
> implementation runs in O(m * n) time (two nested loops); it could run in 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point which could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls it as filterServers(Set, List). 
> The current filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Updated] (HBASE-19900) Region-level exception destroy the result of batch

2018-02-04 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19900:
---
Release Note: 
This fix changes how the client handles an action that has both an action 
result and a region exception.
1) Honor the action result rather than the region exception. If the action has 
both an actual result and a region exception, the action is fine, as the 
exception is caused by other actions in the same region.
2) Honor the action exception rather than the region exception. If the action 
has both an action exception and a region exception, we deal with the action 
exception only. If we also handled the region exception for the same action, 
the count of actions in progress would become negative, and 
AsyncRequestFuture#waitUntilDone would block forever.

> Region-level exception destroy the result of batch
> --
>
> Key: HBASE-19900
> URL: https://issues.apache.org/jira/browse/HBASE-19900
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 1.3.2, 1.5.0, 1.2.7, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19900.v0.patch
>
>
> 1) The action count is decremented repeatedly.
> If AsyncRequestFuture#waitUntilDone returns prematurely, the user will get 
> incorrect results. Or the user will be blocked by 
> AsyncRequestFuture#waitUntilDone because the count never equals 0.
> 2) The successful result will be overwritten.
> 3) The failed op is added to RetriesExhaustedWithDetailsException repeatedly.
> AsyncRequestFutureImpl#receiveMultiAction processes the action-level error 
> first, and then adds the region-level exception to each action. Hence, the 
> user may get various exceptions for the same action (row op) from the 
> RetriesExhaustedWithDetailsException.
> In fact, if both an action-level exception and a region-level exception exist, 
> they always have the same context. I'm not sure whether that is what 
> RetriesExhaustedWithDetailsException wants. As I see it, we shouldn't have 
> duplicate ops in RetriesExhaustedWithDetailsException, since that may confuse 
> users who catch the RetriesExhaustedWithDetailsException to check for 
> invalid operations.
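
The first problem above (the count of in-progress actions being decremented more than once per action) can be illustrated with a small sketch. This is only an illustration of the failure mode under assumed names, not the client code or the attached patch: ActionCounterSketch, completeNaive, and completeOnce are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class ActionCounterSketch {
  // Number of actions still in progress; a waiter would block until this is 0.
  private final AtomicLong actionsInProgress;
  // Indexes of actions that have already been accounted for.
  private final Set<Integer> completed = new HashSet<>();

  ActionCounterSketch(int actionCount) {
    this.actionsInProgress = new AtomicLong(actionCount);
  }

  /** Naive handling: every report (result or exception) decrements the count. */
  void completeNaive(int actionIndex) {
    actionsInProgress.decrementAndGet();
  }

  /** Guarded handling: only the first report for an action decrements the count. */
  synchronized void completeOnce(int actionIndex) {
    if (completed.add(actionIndex)) {
      actionsInProgress.decrementAndGet();
    }
  }

  long remaining() {
    return actionsInProgress.get();
  }

  public static void main(String[] args) {
    // Two actions; action 0 is reported twice (action-level, then region-level).
    ActionCounterSketch naive = new ActionCounterSketch(2);
    naive.completeNaive(0);
    naive.completeNaive(0);
    naive.completeNaive(1);
    System.out.println("naive remaining: " + naive.remaining());

    ActionCounterSketch guarded = new ActionCounterSketch(2);
    guarded.completeOnce(0);
    guarded.completeOnce(0);
    guarded.completeOnce(1);
    System.out.println("guarded remaining: " + guarded.remaining());
  }
}
```

With the naive variant the count passes through 0 after the second report and ends below it, so a waiter that checks for exactly 0 can either return before action 1 has finished or never return, which matches the two symptoms described above.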





[jira] [Commented] (HBASE-19917) Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352035#comment-16352035
 ] 

Hudson commented on HBASE-19917:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4528 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4528/])
HBASE-19917 Improve RSGroupBasedLoadBalancer#filterServers() to be more (tedyu: 
rev 7f7f2b2de53d11dc8ddde6954a4af3599a9e0fa5)
* (edit) 
hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java


> Improve RSGroupBasedLoadBalancer#filterServers() to be more efficient
> -
>
> Key: HBASE-19917
> URL: https://issues.apache.org/jira/browse/HBASE-19917
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19917.branch-1.000.patch, 
> HBASE-19917.master.000.patch, HBASE-19917.master.001.patch
>
>
> {code:title=hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java|borderStyle=solid}
> private List<ServerName> filterServers(Collection<Address> servers,
>     Collection<ServerName> onlineServers) {
>   ArrayList<ServerName> finalList = new ArrayList<ServerName>();
>   for (Address server : servers) {
>     for (ServerName curr : onlineServers) {
>       if (curr.getAddress().equals(server)) {
>         finalList.add(curr);
>       }
>     }
>   }
>   return finalList;
> }
> {code}
> filterServers returns the intersection of servers and onlineServers, that is, 
> the online servers whose address appears in servers. The current 
> implementation runs in O(m * n) time (two nested loops); it could run in 
> O(m + n) if a HashSet is used. The trade-off is increased space complexity.
> Another point which could be improved: filterServers() is only called in 
> filterOfflineServers(), which calls it as filterServers(Set, List). 
> The current filterServers(Collection, Collection) signature could be improved accordingly.





[jira] [Commented] (HBASE-19931) TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352032#comment-16352032
 ] 

Hudson commented on HBASE-19931:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4528 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4528/])
HBASE-19931 TestMetaWithReplicas failing 100% of the time in (stack: rev 
ab5a26ad5e659b3a536a08c7f7515f0c40cea81d)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMetaWithReplicas.java


> TestMetaWithReplicas failing 100% of the time in testHBaseFsckWithMetaReplicas
> --
>
> Key: HBASE-19931
> URL: https://issues.apache.org/jira/browse/HBASE-19931
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19931.branch-2.001.patch
>
>
> Somehow we missed a test that depends on a run of HBCK. It fails 100% of the 
> time now because of HBASE-19726 Failed to start HMaster due to infinite 
> retrying on meta assign where we no longer update hbase:meta with the state 
> of hbase:meta; rather, hbase:meta's always-ENABLED state is inferred. It 
> broke HBCK here.
> So, disable the test and just-in-case add meta as ENABLED to hbck though hbck 
> as is is not for hbase2.





[jira] [Commented] (HBASE-19554) AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352034#comment-16352034
 ] 

Hudson commented on HBASE-19554:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4528 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4528/])
HBASE-19554 reenable TestDLSAsyncFSWAL/TestDLSFSHLog for debugging (zhangduo: 
rev 88d6e06a1fe4130b96e0e9f0e2c2b189f0b6affd)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDLSFSHLog.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDLSAsyncFSWAL.java


> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --
>
> Key: HBASE-19554
> URL: https://issues.apache.org/jira/browse/HBASE-19554
> Project: HBase
>  Issue Type: Sub-task
>  Components: Recovery, wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) 
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is 
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the 
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.





[jira] [Commented] (HBASE-19927) TestFullLogReconstruction flakey

2018-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352033#comment-16352033
 ] 

Hudson commented on HBASE-19927:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4528 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4528/])
HBASE-19927 TestFullLogReconstruction flakey (zhangduo: rev 
e1cd10b002a07a35aa7666fcfbd01b54cfcff1bf)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/TestFullLogReconstruction.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java


> TestFullLogReconstruction flakey
> 
>
> Key: HBASE-19927
> URL: https://issues.apache.org/jira/browse/HBASE-19927
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19927.patch, js, out
>
>
> Fails pretty frequently in hadoopqa builds.
> There is a recent hang in 
> org.apache.hadoop.hbase.TestFullLogReconstruction.tearDownAfterClass(TestFullLogReconstruction.java:68)
> In here... 
> https://builds.apache.org/job/PreCommit-HBASE-Build/11363/testReport/org.apache.hadoop.hbase/TestFullLogReconstruction/org_apache_hadoop_hbase_TestFullLogReconstruction/
> ... see here.
> Thread 1250 (RS_CLOSE_META-edd281aedb18:59863-0):
>   State: TIMED_WAITING
>   Blocked count: 92
>   Waited count: 278
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:133)
>     org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:718)
>     org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:605)
>     org.apache.hadoop.hbase.regionserver.wal.WALUtil.doFullAppendTransaction(WALUtil.java:154)
>     org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeFlushMarker(WALUtil.java:81)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2645)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2356)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2328)
>     org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2319)
>     org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1531)
>     org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1437)
>     org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
>     org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     java.lang.Thread.run(Thread.java:748)
> We missed a signal? We need to do an interrupt? The log is not all there in 
> hadoopqa builds so hard to see all that is going on. This test is not in the 
> flakey set either





[jira] [Commented] (HBASE-19506) Support variable sized chunks from ChunkCreator

2018-02-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352043#comment-16352043
 ] 

ramkrishna.s.vasudevan commented on HBASE-19506:


Left some comments in RB. Thanks.

> Support variable sized chunks from ChunkCreator
> ---
>
> Key: HBASE-19506
> URL: https://issues.apache.org/jira/browse/HBASE-19506
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Priority: Major
> Attachments: HBASE-19506-V01.patch
>
>
> When a CellChunkMap is created, it allocates a special index chunk (or 
> chunks) where the array of cell representations is stored. When the number of 
> cell representations is small, it is preferable to allocate a chunk smaller 
> than the default size of 2MB.
> On the other hand, those "non-standard size" chunks cannot be used in the 
> pool, and on-demand off-heap allocations are costly. So this JIRA is to 
> investigate the trade-off between memory usage and final performance.





[jira] [Created] (HBASE-19934) HBaseSnapshotException when read replicas is enabled and online snapshot is taken after region splitting

2018-02-04 Thread Toshihiro Suzuki (JIRA)
Toshihiro Suzuki created HBASE-19934:


 Summary: HBaseSnapshotException when read replicas is enabled and 
online snapshot is taken after region splitting
 Key: HBASE-19934
 URL: https://issues.apache.org/jira/browse/HBASE-19934
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Toshihiro Suzuki


Investigating HBASE-19893, I'm encountering another issue.

Steps to reproduce are as follows:

1. Create a table
{code:java}
create "test", "cf", {REGION_REPLICATION => 2}{code}
2. Load data to the table
{code:java}
(0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}{code}
3. Split the table
{code:java}
split "test"{code}
4. Take a snapshot for the table
{code:java}
snapshot "test", "snap"{code}
And I encountered the following error:
{code:java}
hbase(main):004:0> snapshot "test", "snap"

ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
ss=snap table=test type=FLUSH } had an error. Procedure snap { waiting=[] 
done=[] }
    at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:379)
    at org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1144)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via 
Failed taking snapshot { ss=snap table=test type=FLUSH } due to 
exception:Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, 
NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY 
=> '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't 
match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY 
=> '', OFFLINE => true, SPLIT => 
true}:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest 
region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY 
=> '', OFFLINE => true, SPLIT => true}
    at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
    at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:306)
    at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:368)
    ... 6 more
Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: 
Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY 
=> '', OFFLINE => true, SPLIT => true}
    at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegionInfo(MasterSnapshotVerifier.java:223)
    at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:201)
    at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:119)
    at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:202)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Take a snapshot of specified table. Examples:

hbase> snapshot 'sourceTable', 'snapshotName'
hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}

Took 0.3390 seconds{code}





[jira] [Updated] (HBASE-19934) HBaseSnapshotException when read replicas is enabled and online snapshot is taken after region splitting

2018-02-04 Thread Toshihiro Suzuki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki updated HBASE-19934:
-
Attachment: HBASE-19934-UT.patch

> HBaseSnapshotException when read replicas is enabled and online snapshot is 
> taken after region splitting
> 
>
> Key: HBASE-19934
> URL: https://issues.apache.org/jira/browse/HBASE-19934
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19934-UT.patch
>
>
> Investigating HBASE-19893, I'm encountering another issue.
> Steps to reproduce are as follows:
> 1. Create a table
> {code:java}
> create "test", "cf", {REGION_REPLICATION => 2}{code}
> 2. Load data to the table
> {code:java}
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}{code}
> 3. Split the table
> {code:java}
> split "test"{code}
> 4. Take a snapshot for the table
> {code:java}
> snapshot "test", "snap"{code}
> And I encountered the following error:
> {code:java}
> hbase(main):004:0> snapshot "test", "snap"
> ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
> ss=snap table=test type=FLUSH } had an error. Procedure snap { waiting=[] 
> done=[] }
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:379)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1144)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via 
> Failed taking snapshot { ss=snap table=test type=FLUSH } due to 
> exception:Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, 
> NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', 
> STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 
> 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, 
> NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => 
> '', ENDKEY => '', OFFLINE => true, SPLIT => 
> true}:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest 
> region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:306)
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:368)
> ... 6 more
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: 
> Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegionInfo(MasterSnapshotVerifier.java:223)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:201)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:119)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:202)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Take a snapshot of specified table. Examples:
> hbase> snapshot 'sourceTable', 'snapshotName'
> hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH =

[jira] [Commented] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352065#comment-16352065
 ] 

ramkrishna.s.vasudevan commented on HBASE-19930:


My question here: we created a set of immutable segments, and they go for a 
merge. When an immutable segment is created during flattening, the big cell 
would already have been copied to the MSLAB at that time, right? So will we 
have a cell that is not yet copied to the immutable segment and MSLAB until 
the merge happens?

> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.
> Following a comment in HBASE-19133 regarding a bug in 
> ImmutableMemStoreLAB#forceCopyOfBigCellInto (assuming this method is never 
> called for an ImmutableMemStoreLAB, and just throwing an 
> IllegalStateException whenever called), the forceCopyOfBigCellInto method now 
> performs the copy of big cells on the first MSLABImpl in its mslabs 
> linked-list.





[jira] [Commented] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352066#comment-16352066
 ] 

ramkrishna.s.vasudevan commented on HBASE-19930:


Yes, we need a test for this too.

> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.
> Following a comment in HBASE-19133 regarding a bug in 
> ImmutableMemStoreLAB#forceCopyOfBigCellInto (assuming this method is never 
> called for an ImmutableMemStoreLAB, and just throwing an 
> IllegalStateException whenever called), the forceCopyOfBigCellInto method now 
> performs the copy of big cells on the first MSLABImpl in its mslabs 
> linked-list.





[jira] [Commented] (HBASE-19934) HBaseSnapshotException when read replicas is enabled and online snapshot is taken after region splitting

2018-02-04 Thread Toshihiro Suzuki (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352068#comment-16352068
 ] 

Toshihiro Suzuki commented on HBASE-19934:
--

I just attached a unit test patch to reproduce this issue. The 
TestRestoreSnapshotFromClientWithRegionReplicas#testSnapshotAfterSplittingRegions 
test fails.

> HBaseSnapshotException when read replicas is enabled and online snapshot is 
> taken after region splitting
> 
>
> Key: HBASE-19934
> URL: https://issues.apache.org/jira/browse/HBASE-19934
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19934-UT.patch
>
>
> Investigating HBASE-19893, I'm encountering another issue.
> Steps to reproduce are as follows:
> 1. Create a table
> {code:java}
> create "test", "cf", {REGION_REPLICATION => 2}{code}
> 2. Load data to the table
> {code:java}
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}{code}
> 3. Split the table
> {code:java}
> split "test"{code}
> 4. Take a snapshot for the table
> {code:java}
> snapshot "test", "snap"{code}
> And I encountered the following error:
> {code:java}
> hbase(main):004:0> snapshot "test", "snap"
> ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
> ss=snap table=test type=FLUSH } had an error. Procedure snap { waiting=[] 
> done=[] }
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:379)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1144)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via 
> Failed taking snapshot { ss=snap table=test type=FLUSH } due to 
> exception:Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, 
> NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', 
> STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 
> 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, 
> NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => 
> '', ENDKEY => '', OFFLINE => true, SPLIT => 
> true}:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest 
> region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:306)
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:368)
> ... 6 more
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: 
> Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegionInfo(MasterSnapshotVerifier.java:223)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:201)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:119)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:202)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.ja

[jira] [Comment Edited] (HBASE-19934) HBaseSnapshotException when read replicas is enabled and online snapshot is taken after region splitting

2018-02-04 Thread Toshihiro Suzuki (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352068#comment-16352068
 ] 

Toshihiro Suzuki edited comment on HBASE-19934 at 2/5/18 6:09 AM:
--

I just attached a unit test patch to reproduce this issue. The 
TestRestoreSnapshotFromClientWithRegionReplicas#testSnapshotAfterSplittingRegions 
test fails.


was (Author: brfrn169):
I just attached a unit test patch to reproduce this issue.  the 
TestRestoreSnapshotFromClientWithRegionReplicas#testSnapshotAfterSplittingRegions
 fails.

> HBaseSnapshotException when read replicas is enabled and online snapshot is 
> taken after region splitting
> 
>
> Key: HBASE-19934
> URL: https://issues.apache.org/jira/browse/HBASE-19934
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-19934-UT.patch
>
>
> While investigating HBASE-19893, I encountered another issue.
> Steps to reproduce are as follows:
> 1. Create a table
> {code:java}
> create "test", "cf", {REGION_REPLICATION => 2}{code}
> 2. Load data to the table
> {code:java}
> (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}{code}
> 3. Split the table
> {code:java}
> split "test"{code}
> 4. Take a snapshot for the table
> {code:java}
> snapshot "test", "snap"{code}
> And I encountered the following error:
> {code:java}
> hbase(main):004:0> snapshot "test", "snap"
> ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
> ss=snap table=test type=FLUSH } had an error. Procedure snap { waiting=[] 
> done=[] }
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:379)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1144)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via 
> Failed taking snapshot { ss=snap table=test type=FLUSH } due to 
> exception:Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, 
> NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', 
> STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 
> 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, 
> NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => 
> '', ENDKEY => '', OFFLINE => true, SPLIT => 
> true}:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest 
> region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:306)
> at 
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:368)
> ... 6 more
> Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: 
> Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 
> 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match 
> expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 
> 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', 
> ENDKEY => '', OFFLINE => true, SPLIT => true}
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegionInfo(MasterSnapshotVerifier.java:223)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:201)
> at 
> org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:119)
> at 
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:202)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventH

[jira] [Comment Edited] (HBASE-19934) HBaseSnapshotException when read replicas is enabled and online snapshot is taken after region splitting

2018-02-04 Thread Toshihiro Suzuki (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352068#comment-16352068
 ] 

Toshihiro Suzuki edited comment on HBASE-19934 at 2/5/18 6:09 AM:
--

I just attached a unit test patch to reproduce this issue.  The 
TestRestoreSnapshotFromClientWithRegionReplicas#testSnapshotAfterSplittingRegions()
 fails.


was (Author: brfrn169):
I just attached a unit test patch to reproduce this issue.  The 
TestRestoreSnapshotFromClientWithRegionReplicas#testSnapshotAfterSplittingRegions
 fails.


[jira] [Commented] (HBASE-19935) Only allow table replication for sync replication for now

2018-02-04 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352071#comment-16352071
 ] 

Duo Zhang commented on HBASE-19935:
---

[~zghaobac] FYI.

> Only allow table replication for sync replication for now
> -
>
> Key: HBASE-19935
> URL: https://issues.apache.org/jira/browse/HBASE-19935
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Duo Zhang
>Priority: Major
>
> Add a pre-check that allows only table-level replication for now: no 
> namespace-level replication, and no replicate-all with exclusions.
> This reduces the difficulty of implementing the sync replication state 
> transition, as we need to reopen all the related regions.
> We can add support for these features later.





[jira] [Created] (HBASE-19935) Only allow table replication for sync replication for now

2018-02-04 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-19935:
-

 Summary: Only allow table replication for sync replication for now
 Key: HBASE-19935
 URL: https://issues.apache.org/jira/browse/HBASE-19935
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Duo Zhang


Add a pre-check that allows only table-level replication for now: no 
namespace-level replication, and no replicate-all with exclusions.

This reduces the difficulty of implementing the sync replication state 
transition, as we need to reopen all the related regions.

We can add support for these features later.
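
A minimal sketch of what such a pre-check could look like. It is self-contained and does not use the real HBase peer configuration classes: `PeerSettings`, its fields, and `check` are hypothetical names made up for this illustration, and the rule that at least one table must be listed is an assumption, not something stated in this issue.

```java
import java.util.Collections;
import java.util.Set;

final class SyncReplicationPreCheckSketch {
  // Hypothetical, simplified peer settings; not the real HBase peer config class.
  static final class PeerSettings {
    final boolean replicateAll;
    final Set<String> namespaces;
    final Set<String> tables;

    PeerSettings(boolean replicateAll, Set<String> namespaces, Set<String> tables) {
      this.replicateAll = replicateAll;
      this.namespaces = namespaces;
      this.tables = tables;
    }
  }

  // Rejects every configuration that is not plain table-level replication.
  static void check(PeerSettings s) {
    if (s.replicateAll) {
      throw new IllegalArgumentException(
          "replicate-all (with or without exclusions) is not allowed for sync replication yet");
    }
    if (!s.namespaces.isEmpty()) {
      throw new IllegalArgumentException(
          "namespace-level replication is not allowed for sync replication yet");
    }
    // Assumption for this sketch: a table-level peer must name at least one table.
    if (s.tables.isEmpty()) {
      throw new IllegalArgumentException("at least one table must be listed");
    }
  }

  public static void main(String[] args) {
    PeerSettings tableOnly =
        new PeerSettings(false, Collections.<String>emptySet(), Collections.singleton("t1"));
    check(tableOnly); // passes without throwing
    System.out.println("table-only peer accepted");
  }
}
```

Failing fast with an exception keeps the rejected configurations in one place, so the checks can be relaxed individually when support for namespaces or replicate-all is added later.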





[jira] [Commented] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used

2018-02-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352088#comment-16352088
 ] 

ramkrishna.s.vasudevan commented on HBASE-19863:


bq. So the reason for this check is only to check that we are still in the same 
block, so no need for reseek
I agree with this. But my argument would be that if getNextIndexedKey() returns 
a key in the next row while we are only trying to do skipToNextCol, then let's 
simply go the reseek way. Doing next() on the current block is not going to 
help us. Also note the method name compareKeyForNextColumn(); we also have 
compareKeyForNextRow(). The expectation is that compareKeyForNextColumn() 
compares against the next column. But internally our compare method checks the 
row first, and if that row is bigger than the current cell's row we still think 
we have found the next index. Just my 2 cents.
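
To make the concern concrete, here is a minimal, self-contained sketch. It does not use any HBase classes and is not the actual scanner code: `Key`, `compare`, `rowFirstSaysSkip`, and `sameRowSaysSkip` are hypothetical names, and the only assumption taken from the discussion is that keys are ordered by row first and by qualifier second.

```java
final class NextColumnSketch {
  // Hypothetical stand-in for a cell key; not an HBase class.
  static final class Key {
    final String row;
    final String qualifier;

    Key(String row, String qualifier) {
      this.row = row;
      this.qualifier = qualifier;
    }
  }

  // Row-first ordering: the qualifier is only consulted when the rows are equal.
  static int compare(Key a, Key b) {
    int byRow = a.row.compareTo(b.row);
    return byRow != 0 ? byRow : a.qualifier.compareTo(b.qualifier);
  }

  // Row-first check: true whenever the next indexed key sorts after the
  // target, even if that is only because it belongs to a later row.
  static boolean rowFirstSaysSkip(Key target, Key nextIndexedKey) {
    return compare(nextIndexedKey, target) > 0;
  }

  // Variant along the lines suggested above (an assumption, not HBase code):
  // skip within the block only when the next indexed key is still in the
  // target's row; otherwise fall back to a reseek.
  static boolean sameRowSaysSkip(Key target, Key nextIndexedKey) {
    return nextIndexedKey.row.equals(target.row)
        && rowFirstSaysSkip(target, nextIndexedKey);
  }

  public static void main(String[] args) {
    Key target = new Key("row10", "C3");     // the next column we want
    Key nextRowKey = new Key("row11", "C1"); // next indexed key is in the next row
    System.out.println(rowFirstSaysSkip(target, nextRowKey));
    System.out.println(sameRowSaysSkip(target, nextRowKey));
  }
}
```

With a next indexed key in a later row, the row-first check reports that skipping inside the block is fine, while the same-row variant does not, which is the difference being argued for.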

> java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter 
> is used
> -
>
> Key: HBASE-19863
> URL: https://issues.apache.org/jira/browse/HBASE-19863
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 1.4.1
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-19863-branch1.patch, HBASE-19863-test.patch
>
>
> Under some circumstances a scan with SingleColumnValueFilter may fail with 
> an exception:
> {noformat} 
> java.lang.IllegalStateException: isDelete failed: deleteBuffer=C3, 
> qualifier=C2, timestamp=1516433595543, comparison result: 1 
> at 
> org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:149)
>   at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:386)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:545)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5814)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2552)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> Conditions:
> Table T with a single column family 0 that uses a ROWCOL bloom filter 
> (important) and column qualifiers C1, C2, C3, C4, C5. 
> When we fill the table, for every row we put a deleted cell for C3.
> The table has a single region with two HStores:
> A: start row: 0, stop row: 99 
> B: start row: 10, stop row: 99
> B has newer versions of rows 10-99. Store files have several blocks each 
> (important). 
> Store A is the result of a major compaction, so it doesn't have any deleted 
> cells (important).
> Then we run a scan like:
> {noformat}
> scan 'T', { COLUMNS => ['0:C3','0:C5'], FILTER => "SingleColumnValueFilter 
> ('0','C5',=,'binary:whatever')"}
> {noformat}  
> How the scan proceeds:
> First, we iterate A for rows 0 and 1 without any problems. 
> Next, we start to iterate A for row 10, so we read the first cell and set the 
> hfs scanner to A:
> 10:0/C1/0/Put/x, but find that we have a newer version of the cell in B: 
> 10:0/C1/1/Put/x, 
> so we make B our current store scanner. Since we are looking for the 
> particular columns C3 and C5, we perform the optimization 
> StoreScanner.seekOrSkipToNextColumn, which 
> runs reseek for all store scanners.
> For store A the following magic happens in requestSeek:
>   1. The bloom filter check passesGeneralBloomFilter sets haveToSeek to 
> false because row 10 doesn't have the C3 qualifier in store A.  
>   2. Since we don't have to seek, we just create a fake row 
> 10:0/C3/OLDEST_TIMESTAMP/Maximum, an optimization that is quite important for 
> us and is commented with:
> {noformat}
>  // Multi-column Bloom filter optimization.
> // Create a fake key/value, so that this scanner only bubbles up to the 
> top
> // of the KeyValueHeap in StoreScanner after we scanned this row/column in
> // all other store files. The query matcher will then just skip this fake
> // key/value and the store scanner will progress to 

[jira] [Created] (HBASE-19936) Introduce a new base class for replication peer procedure

2018-02-04 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-19936:
-

 Summary: Introduce a new base class for replication peer procedure
 Key: HBASE-19936
 URL: https://issues.apache.org/jira/browse/HBASE-19936
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang
Assignee: Duo Zhang
 Fix For: 3.0.0


As the sync replication peer state transition will have more steps than that of 
a normal replication peer, it would be good to have a common base class for them.

Since the peer id will be stored in this class, I am inclined to change the 
protobuf message name from 'ModifyPeerStateData' to 
'ReplicationPeerProcedureStateData'. 
This will be committed to master and HBASE-19397-branch-2.
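
A rough sketch of the shape this could take, for illustration only. Every name here (`Base`, `NormalPeerChange`, `SyncStateTransition`, `steps`, and the `STEP_*` placeholders) is hypothetical and does not reflect the actual HBase procedure classes or their states; the only points taken from the description are that the peer id lives in the shared base class and that the sync transition has more steps.

```java
import java.util.Arrays;
import java.util.List;

final class PeerProcedureSketch {
  // Hypothetical common base: the peer id is stored once, for all subclasses.
  abstract static class Base {
    private final String peerId;

    Base(String peerId) {
      this.peerId = peerId;
    }

    String getPeerId() {
      return peerId;
    }

    // Each concrete procedure lists its own steps (placeholder names).
    abstract List<String> steps();
  }

  // Stand-in for a normal peer modification with a short step list.
  static final class NormalPeerChange extends Base {
    NormalPeerChange(String peerId) {
      super(peerId);
    }

    @Override
    List<String> steps() {
      return Arrays.asList("STEP_A", "STEP_B");
    }
  }

  // Stand-in for a sync replication state transition with more steps.
  static final class SyncStateTransition extends Base {
    SyncStateTransition(String peerId) {
      super(peerId);
    }

    @Override
    List<String> steps() {
      return Arrays.asList("STEP_A", "STEP_B", "STEP_C", "STEP_D");
    }
  }

  public static void main(String[] args) {
    Base normal = new NormalPeerChange("peer1");
    Base sync = new SyncStateTransition("peer1");
    System.out.println(normal.getPeerId() + " " + normal.steps());
    System.out.println(sync.getPeerId() + " " + sync.steps());
  }
}
```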


