[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507700#comment-16507700
 ] 

Guanghao Zhang commented on HBASE-20697:


bq. call RegionLocations.size() and it return 1 and seems only hold the last 
regionLocation of the list. It confused me.
The default replica count is 1, so this works. But the right way would be to use 
a map whose value is a list of the region replicas of the same region, something 
like the sketch below.
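For illustration, a minimal sketch of that map-based idea, keyed by start key 
(which replicas of the same region share), assuming the 1.2-era client API quoted 
below; the variable names (regions, connection, tableName) come from that quoted 
code, and nothing here is from an actual patch:
{code:java}
// Hypothetical sketch: group a table's locations by region start key so the
// replicas of one region land in one list, then cache one RegionLocations
// entry per region instead of a single entry for the whole table.
Map<byte[], List<HRegionLocation>> byStartKey = new TreeMap<>(Bytes.BYTES_COMPARATOR);
for (HRegionLocation loc : regions) {
  byte[] startKey = loc.getRegionInfo().getStartKey();
  List<HRegionLocation> replicas = byStartKey.get(startKey);
  if (replicas == null) {
    replicas = new ArrayList<>();
    byStartKey.put(startKey, replicas);
  }
  replicas.add(loc);
}
for (List<HRegionLocation> replicas : byStartKey.values()) {
  connection.cacheLocation(tableName,
      new RegionLocations(replicas.toArray(new HRegionLocation[0])));
}
{code}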

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, 
> I think it is not very hard to fix.
> Hope committers and the PMC review this!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18999) Put in hbase shell cannot do multiple columns

2018-06-10 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507696#comment-16507696
 ] 

Nihal Jain commented on HBASE-18999:


Thanks for the review [~mdrob]. I will try to attach a new patch addressing 
your comments later this weekend.
{quote}you have it switched in one of the examples in the help messages
{quote}
Yeah you are right. I will fix that along with the following already checked-in 
line.
{code:java}
  hbase> deleteall 't1', {ROWPREFIXFILTER => 'prefix'}, 'c1'   // delete certain column family in the row ranges
{code}

> Put in hbase shell cannot do multiple columns
> -
>
> Key: HBASE-18999
> URL: https://issues.apache.org/jira/browse/HBASE-18999
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Affects Versions: 1.0.0, 3.0.0, 2.0.0
>Reporter: Mike Drob
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-18999.master.001.patch
>
>
> A {{Put}} can carry multiple cells, but doing so in the shell is very 
> difficult to construct. We should make this easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20662) Increasing space quota on a violated table does not remove SpaceViolationPolicy.DISABLE enforcement

2018-06-10 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507686#comment-16507686
 ] 

Nihal Jain commented on HBASE-20662:


compile and javac are red because I have 3 tests marked with @Ignore. I think I 
will have to remove them in a new patch and add them back later with the JIRA 
that fixes these scenarios. Otherwise, all newly added tests have passed.

Ping. [~elserj], [~yuzhih...@gmail.com], [~gsbiju]

> Increasing space quota on a violated table does not remove 
> SpaceViolationPolicy.DISABLE enforcement
> ---
>
> Key: HBASE-20662
> URL: https://issues.apache.org/jira/browse/HBASE-20662
> Project: HBase
>  Issue Type: Bug
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20662.master.001.patch, 
> HBASE-20662.master.002.patch
>
>
> *Steps to reproduce*
>  * Create a table and set quota with {{SpaceViolationPolicy.DISABLE}} having 
> limit say 2MB
>  * Now put rows until space quota is violated and table gets disabled
>  * Next, increase space quota with limit say 4MB on the table
>  * Now try putting a row into the table
> {code:java}
>  private void testSetQuotaThenViolateAndFinallyIncreaseQuota() throws 
> Exception {
> SpaceViolationPolicy policy = SpaceViolationPolicy.DISABLE;
> Put put = new Put(Bytes.toBytes("to_reject"));
> put.addColumn(Bytes.toBytes(SpaceQuotaHelperForTests.F1), 
> Bytes.toBytes("to"),
>   Bytes.toBytes("reject"));
> // Do puts until we violate space policy
> final TableName tn = writeUntilViolationAndVerifyViolation(policy, put);
> // Now, increase limit
> setQuotaLimit(tn, policy, 4L);
> // Put some row now: should not violate as quota limit increased
> verifyNoViolation(policy, tn, put);
>   }
> {code}
> *Expected*
> We should be able to put data as long as the newly set quota limit is not 
> reached.
> *Actual*
> We fail to put any new row even after increasing the limit.
> *Root cause*
> Increasing the quota on a violated table triggers the table to be enabled, 
> but since the table is already in violation, the system does not allow it to 
> be enabled (perhaps assuming that a user is trying to enable it).
> *Relevant exception trace*
> {noformat}
> 2018-05-31 00:34:27,563 INFO  [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> client.HBaseAdmin$14(844): Started enable of 
> testSetQuotaAndThenIncreaseQuotaWithDisable0
> 2018-05-31 00:34:27,571 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=42525] 
> ipc.CallRunner(142): callId: 11 service: MasterService methodName: 
> EnableTable size: 104 connection: 127.0.0.1:38030 deadline: 1527707127568, 
> exception=org.apache.hadoop.hbase.security.AccessDeniedException: Enabling 
> the table 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to 
> a violated space quota.
> 2018-05-31 00:34:27,571 ERROR [regionserver/root1-ThinkPad-T440p:0.Chore.1] 
> quotas.RegionServerSpaceQuotaManager(210): Failed to disable space violation 
> policy for testSetQuotaAndThenIncreaseQuotaWithDisable0. This table will 
> remain in violation.
> org.apache.hadoop.hbase.security.AccessDeniedException: 
> org.apache.hadoop.hbase.security.AccessDeniedException: Enabling the table 
> 'testSetQuotaAndThenIncreaseQuotaWithDisable0' is disallowed due to a 
> violated space quota.
>   at org.apache.hadoop.hbase.master.HMaster$6.run(HMaster.java:2275)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
>   at org.apache.hadoop.hbase.master.HMaster.enableTable(HMaster.java:2258)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.enableTable(MasterRpcServices.java:725)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> 

[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-10 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507680#comment-16507680
 ] 

Francis Liu edited comment on HBASE-20704 at 6/11/18 5:03 AM:
--

{quote}Are we ensuring this always after this patch? What if the RS going down 
in between a close so not all the compacted files are archived? This issue is 
there with old impl also right (Archive immediately and not by the Discharger 
thread)
{quote}
In both cases, if the RS is not gracefully shut down, the WAL gets replayed and 
the compaction marker gets replayed with it, making sure the compacted files are 
never accessed. Or so I'd like to confidently say. But it seems that even that 
part has a bug, wherein the WAL containing the compaction marker that needs to 
be replayed can get archived, because sequence-id tracking for the WAL is tied 
only to memstore flushes, ignoring whether compacted-file archival for a given 
compaction has even completed. The same can be said for when edits are replayed 
on region open. 

I can think of a few reasons why this was not observed (or not as much) in 
pre-discharger versions. 1. Since we archive soon after compacting, the window 
for exposure is pretty small. 2. At least for the delete case, assuming the 
common case that the user does not mess with the timestamps: since the 
compacted storefiles are sorted by seq id and removed in sequence, the 
storefiles containing rows that were deleted are removed before the storefiles 
containing the corresponding tombstones for those rows. With the discharger we 
skip storefiles if they still have references.

So to sum things up: with this other bug, when a server aborts there is a 
possibility that some compacted storefiles (the ones not removed) can be 
reopened by the failover RS.

Should we address this issue here, or create another JIRA? If another JIRA, 
then in this one we can probably add a partial fix wherein the discharger only 
removes contiguous storefiles, as in the sketch below.
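A rough sketch of that "contiguous only" idea; the collection and method names 
below are hypothetical stand-ins, not from the attached patches:
{code:java}
// Hypothetical sketch: walk the compacted storefiles in sequence-id order and
// stop at the first file that is still referenced, so a tombstone-bearing
// file is never archived while an older file it masks is left behind.
List<StoreFile> toArchive = new ArrayList<>();
for (StoreFile sf : compactedFilesSortedBySeqId) { // assumed pre-sorted by seq id
  if (sf.isReferencedInReads()) { // still has open scanners
    break; // stop here: archiving past this point would leave a gap
  }
  toArchive.add(sf);
}
archiveCompactedFiles(toArchive); // hypothetical archival call
{code}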

 

 

 


was (Author: toffer):
{quote}

Are we ensuring this always after this patch? What if the RS going down in 
between a close so not all the compacted files are archived? This issue is 
there with old impl also right (Archive immediately and not by the Discharger 
thread)

{quote}

In both cases, if the RS is not gracefully shut down, the WAL gets replayed and 
the compaction marker gets replayed with it, making sure the compacted files are 
never accessed. Or so I'd like to confidently say. But it seems that even that 
part has a bug, wherein the WAL containing the compaction marker that needs to 
be replayed can get archived, because sequence-id tracking for the WAL is tied 
only to memstore flushes, ignoring whether compacted-file archival for a given 
compaction has even completed. The same can be said for when edits are replayed 
on region open. 

I can think of a few reasons why this was not observed (or not as much) in 
pre-discharger versions. 1. Since we archive soon after compacting, the window 
for exposure is pretty small. 2. At least for the delete case, assuming the 
common case that the user does not mess with the timestamps: since the 
compacted storefiles are sorted by seq id and removed in sequence, the 
storefiles containing rows that were deleted are removed before the storefiles 
containing the corresponding tombstones for those rows. With the discharger we 
skip storefiles if they still have references.

Should we address this issue here, or create another JIRA? If another JIRA, 
then in this one we can probably add a partial fix wherein the discharger only 
removes contiguous storefiles?

 

 

 

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 

[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-10 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507680#comment-16507680
 ] 

Francis Liu commented on HBASE-20704:
-

{quote}

Are we ensuring this always after this patch? What if the RS going down in 
between a close so not all the compacted files are archived? This issue is 
there with old impl also right (Archive immediately and not by the Discharger 
thread)

{quote}

In both cases, if the RS is not gracefully shut down, the WAL gets replayed and 
the compaction marker gets replayed with it, making sure the compacted files are 
never accessed. Or so I'd like to confidently say. But it seems that even that 
part has a bug, wherein the WAL containing the compaction marker that needs to 
be replayed can get archived, because sequence-id tracking for the WAL is tied 
only to memstore flushes, ignoring whether compacted-file archival for a given 
compaction has even completed. The same can be said for when edits are replayed 
on region open. 

I can think of a few reasons why this was not observed (or not as much) in 
pre-discharger versions. 1. Since we archive soon after compacting, the window 
for exposure is pretty small. 2. At least for the delete case, assuming the 
common case that the user does not mess with the timestamps: since the 
compacted storefiles are sorted by seq id and removed in sequence, the 
storefiles containing rows that were deleted are removed before the storefiles 
containing the corresponding tombstones for those rows. With the discharger we 
skip storefiles if they still have references.

Should we address this issue here, or create another JIRA? If another JIRA, 
then in this one we can probably add a partial fix wherein the discharger only 
removes contiguous storefiles?

 

 

 

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20708) Make sure there is no race between the RMP scheduled when start up and the SCP

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507656#comment-16507656
 ] 

stack commented on HBASE-20708:
---

Should this be linked to another issue [~Apache9] that fills in context sir?

> Make sure there is no race between the RMP scheduled when start up and the SCP
> --
>
> Key: HBASE-20708
> URL: https://issues.apache.org/jira/browse/HBASE-20708
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Region Assignment
>Reporter: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20672) Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at every monitoring interval

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507655#comment-16507655
 ] 

stack commented on HBASE-20672:
---

[~jain.ankit] Can you say why we need these extra counters (I'm wary of adding 
counters because we already spend a bunch of our processing time counting)? 
What is the "monitoring interval" (from the RN)? How is it set? Is it an hbase 
thing? Or is it a monitoring system thing? Thanks.

> Create new HBase metrics ReadRequestRate and WriteRequestRate that reset at 
> every monitoring interval
> -
>
> Key: HBASE-20672
> URL: https://issues.apache.org/jira/browse/HBASE-20672
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Ankit Jain
>Assignee: Ankit Jain
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20672.branch-1.001.patch, 
> HBASE-20672.master.001.patch, HBASE-20672.master.002.patch, 
> HBASE-20672.master.003.patch
>
>
> HBase currently provides counters for read/write requests (ReadRequestCount, 
> WriteRequestCount). That said, it is not easy to use counters that reset only 
> after a restart of the service, so we would like to expose 2 new metrics in 
> HBase to provide ReadRequestRate and WriteRequestRate at the region server 
> level.
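For illustration only (not from the attached patches), the kind of per-interval 
computation such a rate metric implies; the method and parameter names are 
hypothetical:
{code:java}
// Hypothetical sketch: derive a per-second rate by diffing a monotonically
// increasing counter across one monitoring interval.
long requestRatePerSecond(long currentCount, long previousCount, long intervalMs) {
  if (intervalMs <= 0) {
    return 0L;
  }
  return (currentCount - previousCount) * 1000L / intervalMs;
}
{code}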



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20698) Master don't record right server version until new started region server call regionServerReport method

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507653#comment-16507653
 ] 

stack commented on HBASE-20698:
---

+1

> Master don't record right server version until new started region server call 
> regionServerReport method
> ---
>
> Key: HBASE-20698
> URL: https://issues.apache.org/jira/browse/HBASE-20698
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.0.1
>
> Attachments: HBASE-20698.master.001.patch, 
> HBASE-20698.master.002.patch, HBASE-20698.master.003.patch, 
> HBASE-20698.master.addendum.patch
>
>
> When a new region server starts, it calls regionServerStartup first. The 
> master records this server as a new online server and may dispatch a 
> RemoteProcedure to the new server. But the master only records the server 
> version when the new region server calls the regionServerReport method, so 
> dispatching a new RemoteProcedure to this new region server will fail if the 
> version is not right.
> {code:java}
>   @Override
>   protected void remoteDispatch(final ServerName serverName,
>       final Set<RemoteProcedure> remoteProcedures) {
>     final int rsVersion = master.getAssignmentManager().getServerVersion(serverName);
>     if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) {
>       LOG.trace("Using procedure batch rpc execution for serverName={} version={}",
>         serverName, rsVersion);
>       submitTask(new ExecuteProceduresRemoteCall(serverName, remoteProcedures));
>     } else {
>       LOG.info(String.format(
>         "Fallback to compat rpc execution for serverName=%s version=%s",
>         serverName, rsVersion));
>       submitTask(new CompatRemoteProcedureResolver(serverName, remoteProcedures));
>     }
>   }
> {code}
> The above code uses the version to resolve a compatibility problem, so 
> dispatch works correctly for old-version region servers. But 
> RefreshPeerProcedure is new since HBase 2.0, so it does not need this 
> fallback. Because the new region server's version is not recorded yet, 
> CompatRemoteProcedureResolver is used for RefreshPeerProcedure too, and so 
> the RefreshPeerProcedure can't be executed correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507650#comment-16507650
 ] 

stack commented on HBASE-20700:
---

None. You answered my concerns. Skimmed patch +1 (+1 for branch-2.0 too. 
Thanks).

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, 
> HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20711) Save on a Cell iteration when writing

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507648#comment-16507648
 ] 

stack commented on HBASE-20711:
---

Good point [~chia7712]

On the bugfix, makes sense to you sir?

> Save on a Cell iteration when writing
> -
>
> Key: HBASE-20711
> URL: https://issues.apache.org/jira/browse/HBASE-20711
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Attachments: HBASE-20711.branch-2.0.001.patch
>
>
> This is a minor savings. We were doing a spin through all Cells on receipt 
> just to check their size when, subsequently, we were doing an iteration of 
> all Cells to insert. It manifested as a little spike in perf output. This 
> change removes the dedicated spin through Cells and just does the size check 
> as part of the general Cell insert (the perf spike no longer shows, but the 
> cost of the size check still remains).
> There is also a minor bug fix: on receipt we were using the Put's row rather 
> than the Cell's row; a client may have succeeded in submitting a Cell that 
> disagreed with the hosting Mutation, and it would have been written as 
> something else altogether -- with the Put's row -- rather than being rejected.
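For illustration, a minimal sketch of that row check folded into the single 
insertion pass; this is an assumption about the shape of the fix, not the 
attached patch:
{code:java}
// Hypothetical sketch: while iterating the Put's cells once (for insert and
// size accounting), reject any Cell whose row disagrees with the hosting
// Mutation instead of silently writing it under the Put's row.
byte[] mutationRow = put.getRow();
for (List<Cell> cells : put.getFamilyCellMap().values()) {
  for (Cell cell : cells) {
    if (!Bytes.equals(mutationRow, 0, mutationRow.length,
        cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())) {
      throw new DoNotRetryIOException("Cell row does not match Put row");
    }
    // ...size check and insert happen here, in the same pass...
  }
}
{code}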



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread zhaoyuan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507647#comment-16507647
 ] 

zhaoyuan commented on HBASE-20697:
--

I am not familiar with region replicas either... During debugging, when I put 
all regions of one table into the RegionLocations constructor and called 
RegionLocations.size(), it returned 1 and seemed to hold only the last 
regionLocation of the list. It confused me.

So for each region location I chose to call connection.cacheLocation(), and it 
works.

FYI [~zghaobac]
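A minimal sketch of that workaround, assuming the 1.2-era API from the quoted 
code below:
{code:java}
// Cache each HRegionLocation individually so every region gets its own
// MetaCache entry keyed by its own start key, instead of one entry for the
// whole table.
for (HRegionLocation location : table.getRegionLocator().getAllRegionLocations()) {
  connection.cacheLocation(tableName, new RegionLocations(location));
}
{code}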

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, 
> I think it is not very hard to fix.
> Hope committers and the PMC review this!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507637#comment-16507637
 ] 

Guanghao Zhang commented on HBASE-20697:


bq. connection.cacheLocation(tableName, new RegionLocations(regionLocation));
Should it be a map whose key is the region name and whose value is a list of 
all region replicas of the same region? I am not familiar with region 
replicas... Do all region replicas have the same region name?

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, 
> I think it is not very hard to fix.
> Hope committers and the PMC review this!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread zhaoyuan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507634#comment-16507634
 ] 

zhaoyuan commented on HBASE-20697:
--

[~zghaobac] Hi, what's wrong with Docker? Should I resubmit the patch to 
trigger QA again, or do something else to solve this problem?

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, 
> I think it is not very hard to fix.
> Hope committers and the PMC review this!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20569) NPE in RecoverStandbyProcedure.execute

2018-06-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507628#comment-16507628
 ] 

Hadoop QA commented on HBASE-20569:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-19064 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
20s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
41s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
50s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} HBASE-19064 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 1 new + 212 
unchanged - 0 fixed = 213 total (was 212) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
54s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  3s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 19s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.master.TestRecoverStandbyProcedure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20569 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12927236/HBASE-20569.HBASE-19064.013.patch
 |
| Optional Tests |  asflicense  cc  unit  hbaseprotoc  javac  javadoc  findbugs 
 shadedjars  

[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507625#comment-16507625
 ] 

Hadoop QA commented on HBASE-20697:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m  
2s{color} | {color:red} Docker failed to build yetus/hbase:e77c578. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-20697 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12927243/HBASE-20697-branch-1.2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13182/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something or may be wrong 

[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread zhaoyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyuan updated HBASE-20697:
-
Fix Version/s: 1.3.3
   1.2.7
   Attachment: HBASE-20697-branch-1.2.patch
   Status: Patch Available  (was: In Progress)

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.6, 1.3.1
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Fix For: 1.2.7, 1.3.3
>
> Attachments: HBASE-20697-branch-1.2.patch, 
> HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and cache it under 
> only the first non-null region location's start key; then, when we put to or 
> get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and it can possibly 
> mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, 
> I think it is not very hard to fix.
> Hope committers and the PMC review this!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20331) clean up shaded packaging for 2.1

2018-06-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507610#comment-16507610
 ] 

Hudson commented on HBASE-20331:


Results for branch HBASE-20331
[build #38 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/38/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/38//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/38//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/38//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/38//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> clean up shaded packaging for 2.1
> -
>
> Key: HBASE-20331
> URL: https://issues.apache.org/jira/browse/HBASE-20331
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
>
> polishing pass on shaded modules for 2.0 based on trying to use them in more 
> contexts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-06-10 Thread zhaoyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyuan updated HBASE-20697:
-
Attachment: HBASE-20697-branch-1.2.patch

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Minor
> Attachments: HBASE-20697-branch-1.2.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it does not hold any region 
> location cache and has to communicate with ZooKeeper and the meta region 
> server to get region locations.
> We want to avoid these timeouts, so we do warmup work; as far as I understand, 
> the method table.getRegionLocator().getAllRegionLocations() should fetch all 
> region locations and cache them. However, it did not work well: there were 
> still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
> {code}
> In MetaCache:
> {code:java}
> public void cacheLocation(final TableName tableName,
>     final RegionLocations locations) {
>   byte[] startKey =
>       locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey,
>       locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It collects all the regions into one RegionLocations object and caches 
> only the first non-null region location. Then, when we put to or get from 
> HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName,
>     final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey =
>       possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and the lookup can 
> mismatch.
> So did I miss something, or am I wrong somewhere? If this is indeed a bug, 
> I think it would not be hard to fix.
> I hope committers and the PMC can review this!
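> A minimal sketch of one possible fix (illustrative only, not the committed 
> patch): cache one RegionLocations per region, each under its own start key, 
> so that the floorEntry lookup in MetaCache can find every region.
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     HRegionLocation location =
>         new HRegionLocation(entry.getKey(), entry.getValue());
>     regions.add(location);
>     // Cache each region individually instead of one merged RegionLocations.
>     connection.cacheLocation(tableName, new RegionLocations(location));
>   }
>   return regions;
> }
> {code}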
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20711) Save on a Cell iteration when writing

2018-06-10 Thread Chia-Ping Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507562#comment-16507562
 ] 

Chia-Ping Tsai commented on HBASE-20711:


If we do the size check while converting the proto to a mutation, we have to 
record the index into the CellScanner in order to skip the correct number of 
cells when encountering the exception.
{code:java}
int processedMutationIndex = 0;
for (Action mutation : mutations) {
  // A non-null mArray[i] means the cells for this mutation have already been
  // read from the cell scanner; a null entry means they must be skipped.
  if (mArray[processedMutationIndex++] == null) {
    skipCellsForMutation(mutation, cells);
  }
}
{code}
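
A rough sketch of what skipCellsForMutation could look like (an assumption 
about the internals, not the actual method body):
{code:java}
// Hypothetical sketch: advance the CellScanner past the cells that belong to
// a mutation we are not going to process, so later mutations stay aligned.
private void skipCellsForMutation(Action action, CellScanner cellScanner)
    throws IOException {
  if (cellScanner == null || !action.hasMutation()) {
    return;
  }
  MutationProto m = action.getMutation();
  if (m.hasAssociatedCellCount()) {
    for (int i = 0; i < m.getAssociatedCellCount(); i++) {
      cellScanner.advance();
    }
  }
}
{code}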

> Save on a Cell iteration when writing
> -
>
> Key: HBASE-20711
> URL: https://issues.apache.org/jira/browse/HBASE-20711
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Attachments: HBASE-20711.branch-2.0.001.patch
>
>
> This is a minor savings. We were doing a spin through all Cells on receipt 
> just to check their size when, subsequently, we were iterating all Cells 
> again to insert them. It manifested as a little spike in perf output. This 
> change removes the dedicated spin through Cells and does the size check as 
> part of the general Cell insert (the perf spike no longer shows, but the 
> cost of the size check remains).
> There is also a minor bug fix: on receipt we were using the Put's row 
> rather than the Cell's row, so a client may have succeeded in submitting a 
> Cell that disagreed with its hosting Mutation, and it would have been 
> written as something else altogether -- under the Put's row -- rather than 
> being rejected.
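> A tiny sketch of that row check (illustrative only; the helper name 
> checkCellRow is hypothetical, and the real patch folds this into the single 
> insert pass):
> {code:java}
> // Hypothetical helper: reject a Cell whose row disagrees with its hosting
> // Put, instead of silently writing it under the Put's row.
> void checkCellRow(Put put, Cell cell) throws DoNotRetryIOException {
>   if (!CellUtil.matchingRow(cell, put.getRow())) {
>     throw new DoNotRetryIOException("Cell row " +
>         Bytes.toStringBinary(CellUtil.cloneRow(cell)) +
>         " does not match Put row " + Bytes.toStringBinary(put.getRow()));
>   }
> }
> {code}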



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20569) NPE in RecoverStandbyProcedure.execute

2018-06-10 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507541#comment-16507541
 ] 

Guanghao Zhang commented on HBASE-20569:


Retry for Hadoop QA.

> NPE in RecoverStandbyProcedure.execute
> --
>
> Key: HBASE-20569
> URL: https://issues.apache.org/jira/browse/HBASE-20569
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-20569.HBASE-19064.001.patch, 
> HBASE-20569.HBASE-19064.002.patch, HBASE-20569.HBASE-19064.003.patch, 
> HBASE-20569.HBASE-19064.004.patch, HBASE-20569.HBASE-19064.005.patch, 
> HBASE-20569.HBASE-19064.006.patch, HBASE-20569.HBASE-19064.007.patch, 
> HBASE-20569.HBASE-19064.008.patch, HBASE-20569.HBASE-19064.009.patch, 
> HBASE-20569.HBASE-19064.010.patch, HBASE-20569.HBASE-19064.011.patch, 
> HBASE-20569.HBASE-19064.012.patch, HBASE-20569.HBASE-19064.013.patch, 
> HBASE-20569.HBASE-19064.013.patch
>
>
> We call ReplaySyncReplicationWALManager.initPeerWorkers in INIT_WORKERS state 
> and then use it in DISPATCH_TASKS. But if we restart the master and the 
> procedure is restarted from state DISPATCH_TASKS, no one will call the 
> initPeerWorkers method, and we will get an NPE.
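> A sketch of one way to avoid the NPE (illustrative only: the lazy-init 
> shape and the services field are assumptions, not the actual patch):
> {code:java}
> // Hypothetical sketch: create the per-peer worker queue lazily, so that a
> // procedure restarted from DISPATCH_TASKS never observes a missing mapping.
> private final ConcurrentMap<String, BlockingQueue<ServerName>> peerWorkers =
>     new ConcurrentHashMap<>();
>
> public ServerName acquirePeerWorker(String peerId) throws InterruptedException {
>   BlockingQueue<ServerName> workers = peerWorkers.computeIfAbsent(peerId,
>       p -> new LinkedBlockingQueue<>(
>           services.getServerManager().getOnlineServersList()));
>   return workers.take();
> }
> {code}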



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20569) NPE in RecoverStandbyProcedure.execute

2018-06-10 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-20569:
---
Attachment: HBASE-20569.HBASE-19064.013.patch

> NPE in RecoverStandbyProcedure.execute
> --
>
> Key: HBASE-20569
> URL: https://issues.apache.org/jira/browse/HBASE-20569
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-20569.HBASE-19064.001.patch, 
> HBASE-20569.HBASE-19064.002.patch, HBASE-20569.HBASE-19064.003.patch, 
> HBASE-20569.HBASE-19064.004.patch, HBASE-20569.HBASE-19064.005.patch, 
> HBASE-20569.HBASE-19064.006.patch, HBASE-20569.HBASE-19064.007.patch, 
> HBASE-20569.HBASE-19064.008.patch, HBASE-20569.HBASE-19064.009.patch, 
> HBASE-20569.HBASE-19064.010.patch, HBASE-20569.HBASE-19064.011.patch, 
> HBASE-20569.HBASE-19064.012.patch, HBASE-20569.HBASE-19064.013.patch, 
> HBASE-20569.HBASE-19064.013.patch
>
>
> We call ReplaySyncReplicationWALManager.initPeerWorkers in INIT_WORKERS state 
> and then use it in DISPATCH_TASKS. But if we restart the master and the 
> procedure is restarted from state DISPATCH_TASKS, no one will call the 
> initPeerWorkers method, and we will get an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20331) clean up shaded packaging for 2.1

2018-06-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507525#comment-16507525
 ] 

Hudson commented on HBASE-20331:


Results for branch HBASE-20331
[build #37 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/37/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/37//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/37//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/37//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/37//artifacts/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> clean up shaded packaging for 2.1
> -
>
> Key: HBASE-20331
> URL: https://issues.apache.org/jira/browse/HBASE-20331
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
>
> polishing pass on shaded modules for 2.0 based on trying to use them in more 
> contexts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20569) NPE in RecoverStandbyProcedure.execute

2018-06-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507472#comment-16507472
 ] 

Hadoop QA commented on HBASE-20569:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-19064 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
24s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
29s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
23s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} HBASE-19064 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} HBASE-19064 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
59s{color} | {color:red} hbase-server: The patch generated 1 new + 212 
unchanged - 0 fixed = 213 total (was 212) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
18s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 52s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}164m 14s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}210m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.master.TestRecoverStandbyProcedure |
|   | hadoop.hbase.replication.TestSyncReplicationMoreLogsInLocalCopyToRemote |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20569 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12927216/HBASE-20569.HBASE-19064.013.patch
 |
| Optional 

[jira] [Commented] (HBASE-20709) CompatRemoteProcedureResolver should call remoteCallFailed method instead of throw UnsupportedOperationException

2018-06-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507446#comment-16507446
 ] 

Hadoop QA commented on HBASE-20709:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 16s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}109m 
29s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20709 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12927214/HBASE-20709.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5650b84baa99 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / cc7aefe0bb |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13179/testReport/ |
| Max. process+thread count | 4339 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13179/console |
| Powered by | 

[jira] [Updated] (HBASE-20569) NPE in RecoverStandbyProcedure.execute

2018-06-10 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-20569:
---
Attachment: HBASE-20569.HBASE-19064.013.patch

> NPE in RecoverStandbyProcedure.execute
> --
>
> Key: HBASE-20569
> URL: https://issues.apache.org/jira/browse/HBASE-20569
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-20569.HBASE-19064.001.patch, 
> HBASE-20569.HBASE-19064.002.patch, HBASE-20569.HBASE-19064.003.patch, 
> HBASE-20569.HBASE-19064.004.patch, HBASE-20569.HBASE-19064.005.patch, 
> HBASE-20569.HBASE-19064.006.patch, HBASE-20569.HBASE-19064.007.patch, 
> HBASE-20569.HBASE-19064.008.patch, HBASE-20569.HBASE-19064.009.patch, 
> HBASE-20569.HBASE-19064.010.patch, HBASE-20569.HBASE-19064.011.patch, 
> HBASE-20569.HBASE-19064.012.patch, HBASE-20569.HBASE-19064.013.patch
>
>
> We call ReplaySyncReplicationWALManager.initPeerWorkers in INIT_WORKERS state 
> and then use it in DISPATCH_TASKS. But if we restart the master and the 
> procedure is restarted from state DISPATCH_TASKS, no one will call the 
> initPeerWorkers method, and we will get an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20709) CompatRemoteProcedureResolver should call remoteCallFailed method instead of throw UnsupportedOperationException

2018-06-10 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-20709:
---
Attachment: HBASE-20709.master.002.patch

> CompatRemoteProcedureResolver should call remoteCallFailed method instead of 
> throw UnsupportedOperationException
> 
>
> Key: HBASE-20709
> URL: https://issues.apache.org/jira/browse/HBASE-20709
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 2.0.0
> Environment: # 
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-20709.master.001.patch, 
> HBASE-20709.master.002.patch
>
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
> {code:java}
> @Override
> public void dispatchServerOperations(MasterProcedureEnv env,
>     List<ServerOperation> operations) {
>   throw new UnsupportedOperationException();
> }
> {code}
> As the procedure request is not sent to the remote server, 
> remoteOperationFailed and remoteOperationCompleted can't be called, so the 
> procedure gets stuck with no log to show it. The new patch calls the 
> remoteCallFailed method and makes the procedure fail; the new exception 
> message makes the problem easy to find.
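> A minimal sketch of the direction described (hedged: the remoteCallFailed 
> helper signature here is an assumption based on the surrounding dispatcher 
> code, not the exact patch):
> {code:java}
> @Override
> public void dispatchServerOperations(MasterProcedureEnv env,
>     List<ServerOperation> operations) {
>   // Instead of throwing UnsupportedOperationException (which leaves the
>   // procedure stuck with nothing logged), fail the remote call explicitly
>   // so the procedure wakes up with a clear error message.
>   remoteCallFailed(env, new DoNotRetryIOException(
>       "Unexpected dispatchServerOperations: this should not happen, "
>           + "there must be a bug, please check"));
> }
> {code}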



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20709) CompatRemoteProcedureResolver should call remoteCallFailed method instead of throw UnsupportedOperationException

2018-06-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507408#comment-16507408
 ] 

Duo Zhang commented on HBASE-20709:
---

Then let's add something like 'This should not happen, there must be bugs in 
your code, please check!'?

> CompatRemoteProcedureResolver should call remoteCallFailed method instead of 
> throw UnsupportedOperationException
> 
>
> Key: HBASE-20709
> URL: https://issues.apache.org/jira/browse/HBASE-20709
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 2.0.0
> Environment: # 
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Attachments: HBASE-20709.master.001.patch
>
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
> {code:java}
> @Override
> public void dispatchServerOperations(MasterProcedureEnv env,
>     List<ServerOperation> operations) {
>   throw new UnsupportedOperationException();
> }
> {code}
> As the procedure request is not sent to the remote server, 
> remoteOperationFailed and remoteOperationCompleted can't be called, so the 
> procedure gets stuck with no log to show it. The new patch calls the 
> remoteCallFailed method and makes the procedure fail; the new exception 
> message makes the problem easy to find.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20698) Master don't record right server version until new started region server call regionServerReport method

2018-06-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507393#comment-16507393
 ] 

Hudson commented on HBASE-20698:


Results for branch master
[build #361 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/361/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/361//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/361//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/361//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Master don't record right server version until new started region server call 
> regionServerReport method
> ---
>
> Key: HBASE-20698
> URL: https://issues.apache.org/jira/browse/HBASE-20698
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.0.1
>
> Attachments: HBASE-20698.master.001.patch, 
> HBASE-20698.master.002.patch, HBASE-20698.master.003.patch, 
> HBASE-20698.master.addendum.patch
>
>
> When a new region server starts, it calls regionServerStartup first. The 
> master records this server as a new online server and may dispatch a 
> RemoteProcedure to the new server. But the master only records the server 
> version when the new region server calls the regionServerReport method. 
> Dispatching a new RemoteProcedure to this new region server will fail if 
> the version is not right.
> {code:java}
>   @Override
>   protected void remoteDispatch(final ServerName serverName,
>       final Set<RemoteProcedure> remoteProcedures) {
>     final int rsVersion =
>         master.getAssignmentManager().getServerVersion(serverName);
>     if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) {
>       LOG.trace("Using procedure batch rpc execution for serverName={} version={}",
>           serverName, rsVersion);
>       submitTask(new ExecuteProceduresRemoteCall(serverName, remoteProcedures));
>     } else {
>       LOG.info(String.format(
>           "Fallback to compat rpc execution for serverName=%s version=%s",
>           serverName, rsVersion));
>       submitTask(new CompatRemoteProcedureResolver(serverName, remoteProcedures));
>     }
>   }
> {code}
> The above code uses the version to resolve a compatibility problem, so 
> dispatch works correctly for old-version region servers. 
> RefreshPeerProcedure is new since HBase 2.0, so it doesn't need this 
> fallback; but because the new region server's version is not yet recorded, 
> the CompatRemoteProcedureResolver is used for RefreshPeerProcedure too, and 
> the RefreshPeerProcedure can't be executed correctly.
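> One possible direction for a fix, sketched loosely (hedged: 
> recordServerVersion is a hypothetical helper, and reading the version from 
> the RPC connection header via VersionInfoUtil is an assumption):
> {code:java}
> // Hypothetical sketch (not the actual patch): when the master handles
> // regionServerStartup for serverName, read the client's version from the
> // RPC connection header and record it immediately, so that remoteDispatch()
> // never sees a missing version for an online server.
> VersionInfo versionInfo = VersionInfoUtil.getCurrentClientVersionInfo();
> if (versionInfo != null) {
>   // recordServerVersion stands in for wherever the master stores
>   // per-server versions.
>   master.getAssignmentManager().recordServerVersion(serverName,
>       VersionInfoUtil.getVersionNumber(versionInfo));
> }
> {code}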



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20331) clean up shaded packaging for 2.1

2018-06-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507358#comment-16507358
 ] 

Hudson commented on HBASE-20331:


Results for branch HBASE-20331
[build #36 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/36/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/36//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/36//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/36//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/36//artifacts/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> clean up shaded packaging for 2.1
> -
>
> Key: HBASE-20331
> URL: https://issues.apache.org/jira/browse/HBASE-20331
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
>
> polishing pass on shaded modules for 2.0 based on trying to use them in more 
> contexts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20698) Master don't record right server version until new started region server call regionServerReport method

2018-06-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507326#comment-16507326
 ] 

Hudson commented on HBASE-20698:


Results for branch branch-2.0
[build #410 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/410/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/410//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/410//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/410//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Master don't record right server version until new started region server call 
> regionServerReport method
> ---
>
> Key: HBASE-20698
> URL: https://issues.apache.org/jira/browse/HBASE-20698
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.0.1
>
> Attachments: HBASE-20698.master.001.patch, 
> HBASE-20698.master.002.patch, HBASE-20698.master.003.patch, 
> HBASE-20698.master.addendum.patch
>
>
> When a new region server starts, it calls regionServerStartup first. The 
> master records this server as a new online server and may dispatch a 
> RemoteProcedure to the new server. But the master only records the server 
> version when the new region server calls the regionServerReport method. 
> Dispatching a new RemoteProcedure to this new region server will fail if 
> the version is not right.
> {code:java}
>   @Override
>   protected void remoteDispatch(final ServerName serverName,
>       final Set<RemoteProcedure> remoteProcedures) {
>     final int rsVersion =
>         master.getAssignmentManager().getServerVersion(serverName);
>     if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) {
>       LOG.trace("Using procedure batch rpc execution for serverName={} version={}",
>           serverName, rsVersion);
>       submitTask(new ExecuteProceduresRemoteCall(serverName, remoteProcedures));
>     } else {
>       LOG.info(String.format(
>           "Fallback to compat rpc execution for serverName=%s version=%s",
>           serverName, rsVersion));
>       submitTask(new CompatRemoteProcedureResolver(serverName, remoteProcedures));
>     }
>   }
> {code}
> The above code uses the version to resolve a compatibility problem, so 
> dispatch works correctly for old-version region servers. 
> RefreshPeerProcedure is new since HBase 2.0, so it doesn't need this 
> fallback; but because the new region server's version is not yet recorded, 
> the CompatRemoteProcedureResolver is used for RefreshPeerProcedure too, and 
> the RefreshPeerProcedure can't be executed correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)