[jira] [Updated] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-18036: --- Attachment: HBASE-18036.v2-branch-1.1.patch > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch, HBASE-18036.v2-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when the cluster restarts (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for a cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in the LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put the dead servers in SSH, > and SSH uses roundRobinAssignment() in the LoadBalancer. That is why we would > lose locality more often than retain it during a cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
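The difference between the two balancer calls can be sketched with a small, hypothetical Python model (these are not HBase's actual LoadBalancer signatures): retainAssignment() consults each region's previous location while the round-robin path ignores it, which is why hitting the failover path loses locality.

```python
import itertools

def retain_assignment(regions_to_last_host, live_servers):
    """Simplified model of a locality-retaining assignment: put each region
    back on its previous host when that host is still alive; otherwise fall
    back to round-robin. Illustrative only, not HBase's LoadBalancer API."""
    assignment = {}
    fallback = itertools.cycle(live_servers)
    for region, last_host in regions_to_last_host.items():
        assignment[region] = last_host if last_host in live_servers else next(fallback)
    return assignment

def round_robin_assignment(regions, live_servers):
    """Simplified model of roundRobinAssignment: previous locations ignored,
    so block locality built up on the old hosts is lost."""
    cycle = itertools.cycle(live_servers)
    return {region: next(cycle) for region in regions}
```

Under this toy model, a clean restart with all servers back keeps every region on its old host, while the round-robin path scatters them arbitrarily.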
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009145#comment-16009145 ] Anoop Sam John commented on HBASE-18043: bq. "hbase.server.keyvalue.maxsize" Call it cell.maxsize? We try to avoid KV naming. The license-header hunk on hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java is by some mistake, I believe. This is per-cell size limiting. Do we have some checks on the size before accepting the RPC requests themselves? That would avoid the service going down under many large requests; otherwise we would already have read those request bytes, causing bad GC issues.. Just asking, Andy. Else +1 > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043-branch-1.patch, > HBASE-18043.patch, HBASE-18043.patch > > > For the sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject an RPC that has already > come in.
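The proposed server-side enforcement can be sketched as follows. This is an illustrative Python model, not the actual patch; the default value and names are hypothetical. The key point is that a client-supplied limit may only tighten, never loosen, the server's hard limit.

```python
DEFAULT_MAX_CELL_SIZE = 10 * 1024 * 1024  # hypothetical server default, in bytes

class CellSizeLimitError(Exception):
    """Raised server-side when a cell exceeds the effective size limit."""

def check_cell_size(cell_bytes, server_hard_limit=DEFAULT_MAX_CELL_SIZE,
                    client_requested_limit=None):
    """The client may request a *smaller* effective limit, but can never
    raise it above the server's hard limit (illustrative sketch)."""
    effective = server_hard_limit
    if client_requested_limit is not None:
        effective = min(client_requested_limit, server_hard_limit)
    if len(cell_bytes) > effective:
        raise CellSizeLimitError(
            "cell of %d bytes exceeds limit of %d" % (len(cell_bytes), effective))
```

Keeping the cheap client-side check on top of this is still worthwhile, since it rejects oversized cells before their bytes are ever shipped to the server.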
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009140#comment-16009140 ] Anoop Sam John commented on HBASE-18042: Do you know which jira changed this behavior? > Client Compatibility breaks between versions 1.2 and 1.3 > > > Key: HBASE-18042 > URL: https://issues.apache.org/jira/browse/HBASE-18042 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1 >Reporter: Karan Mehta >Assignee: Karan Mehta > > OpenTSDB uses AsyncHBase as its client, rather than using the traditional > HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been > changed. Newer fields were added to the {{ScanResponse}} proto. > A typical Scan request in 1.2 would require the caller to make an > OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on > the {{more_rows}} boolean field in the {{ScanResponse}} proto. > However, from 1.3, a new parameter {{more_results_in_region}} was added, which > limits the results per region. Therefore the client now has to manage sending > all the requests for each region. Furthermore, if the results are exhausted > from a particular region, the {{ScanResponse}} will set > {{more_results_in_region}} to false, but {{more_results}} can still be true. > Whenever the former is set to false, the {{RegionScanner}} will also be > closed. > OpenTSDB makes an OpenScanner Request and receives all its results in the > first {{ScanResponse}} itself, thus creating the condition described in the > paragraph above. Since {{more_rows}} is true, it will proceed to send the next > request, at which point {{RSRpcServices}} will throw an > {{UnknownScannerException}}. The protobuf client compatibility is maintained, > but the expected behavior is modified.
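The client-side handling that a 1.3-aware scanner needs can be sketched as a toy loop over ScanResponse-like records. The field names follow the proto, but the loop itself is an illustrative assumption, not AsyncHBase or HBase client code.

```python
def drain_scanner(responses):
    """Toy client loop over a sequence of ScanResponse-like dicts.
    A 1.3-aware client must key region handling off more_results_in_region,
    not just more_results."""
    rows = []
    for resp in responses:
        rows.extend(resp["results"])
        if not resp["more_results_in_region"]:
            # Region exhausted: the server has closed the RegionScanner, so a
            # further next() on the same scanner id would hit
            # UnknownScannerException. A correct client opens a new scanner in
            # the next region instead (the reopen itself is elided here).
            if not resp["more_results"]:
                break  # whole scan done
    return rows
```

A pre-1.3 client that keeps calling next() on the old scanner id after `more_results_in_region` went false is exactly the failure mode described above.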
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009139#comment-16009139 ] Anoop Sam John commented on HBASE-17887: I guess Lars is saying that HBASE-14970 and this one should ideally not have been committed to minor versions, so both should have gone into 2.0 only. Correct, Lars? > Row-level consistency is broken for read > > > Key: HBASE-17887 > URL: https://issues.apache.org/jira/browse/HBASE-17887 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.3.0 >Reporter: Umesh Agashe >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-17887.branch-1.v0.patch, > HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch, > HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch, > HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch, > HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch, > HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch, > HBASE-17887.v5.patch, HBASE-17887.v5.patch > > > The scanner of the latest memstore may be lost if we make quick flushes. The > following steps may help explain this issue. 
> # put data_A (seq id = 10, active stores data_A and snapshot is empty) > # snapshot of 1st flush (active is empty and snapshot stores data_A) > # put data_B (seq id = 11, active stores data_B and snapshot stores data_A) > # create user scanner (read point = 11, so it should see data_B) > # commit of 1st flush > #* clear snapshot (hfile_A has data_A, active stores data_B, and snapshot is > empty) > #* update the reader (the user scanner receives hfile_A) > # snapshot of 2nd flush (active is empty and snapshot stores data_B) > # commit of 2nd flush > #* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, > and snapshot is empty) – this is the critical piece. > #* -update the reader- (hasn't happened) > # user scanner updates the kv scanners (it creates a scanner of hfile_A but > nothing of the memstore) > # user sees the older data_A – wrong result
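The steps above can be replayed with a toy Python model (an illustrative simulation, not HBase code): two quick flushes where the second commit skips the reader update leave the scanner knowing only hfile_A and an empty memstore, so data_B vanishes from its view.

```python
class ToyScanner:
    """Holds the hfile list it was last told about; the memstore is read live."""
    def __init__(self):
        self.hfiles = []
    def view(self, store):
        rows = [v for _, data in self.hfiles for v in data]
        rows.extend(store.active)    # live memstore view at scan time
        rows.extend(store.snapshot)
        return rows

class ToyStore:
    """Minimal active/snapshot/hfile model of a memstore flush."""
    def __init__(self):
        self.active, self.snapshot, self.hfiles = [], [], []
    def put(self, value):
        self.active.append(value)
    def snapshot_flush(self):
        self.snapshot, self.active = self.active, []
    def commit_flush(self, hfile_name, scanner=None):
        self.hfiles.append((hfile_name, self.snapshot))
        self.snapshot = []
        if scanner is not None:      # the "update the reader" step
            scanner.hfiles = list(self.hfiles)
```

Running the exact sequence from the list, with the second commit_flush() not notifying the scanner, reproduces the stale read.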
[jira] [Commented] (HBASE-18027) HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
[ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009136#comment-16009136 ] Lars Hofhansl commented on HBASE-18027: --- (And hence perhaps this is just checking that the replication batch size limit is <= the RPC size limit) > HBaseInterClusterReplicationEndpoint should respect RPC size limits when > batching edits > --- > > Key: HBASE-18027 > URL: https://issues.apache.org/jira/browse/HBASE-18027 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.4.0, 1.3.1 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18027-branch-1.patch, HBASE-18027.patch, > HBASE-18027.patch > > > In HBaseInterClusterReplicationEndpoint#replicate we try to replicate in > batches. We create N lists. N is the minimum of the configured replicator > threads, the number of 100-waledit batches, or the number of current sinks. Every > pending entry in the replication context is then placed, in order by hash of the > encoded region name, into one of these N lists. Each of the N lists is then > sent all at once in one replication RPC. We do not test whether the sum of data in > each N list will exceed RPC size limits. This code presumes each individual > edit is reasonably small. Not checking the aggregate size while assembling > the lists into RPCs is an oversight and can lead to replication failure when > that assumption is violated. > We can fix this by generating as many replication RPC calls as we need to > drain a list, keeping each RPC under the limit, instead of assuming the whole > list will fit in one.
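The proposed fix amounts to size-aware batching, which can be sketched in Python (an illustrative model; the real patch works on WAL entries and the actual limit and method names differ):

```python
def batch_by_size(entries, max_rpc_bytes, size_of=len):
    """Greedily split a list of edits into batches whose summed size stays
    under max_rpc_bytes, instead of assuming the whole list fits in one RPC.
    A single oversized entry still gets its own batch rather than being lost."""
    batches, current, current_size = [], [], 0
    for entry in entries:
        size = size_of(entry)
        if current and current_size + size > max_rpc_bytes:
            batches.append(current)          # flush the full batch as one RPC
            current, current_size = [], 0
        current.append(entry)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch maps to one replication RPC, preserving the original entry order so sequencing per region is unaffected.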
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009135#comment-16009135 ] Lars Hofhansl commented on HBASE-18043: --- Thanks! +1
[jira] [Comment Edited] (HBASE-18027) HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
[ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009134#comment-16009134 ] Lars Hofhansl edited comment on HBASE-18027 at 5/13/17 4:47 AM: So looking at the code... In the original code I assume that the caller does the size enforcement. And indeed I see that happening in the code. {{HBaseInterClusterReplicationEndpoint.replicate}} is called from {{ReplicationSourceWorkerThread.shipEdits}}, which is called from {{ReplicationSourceWorkerThread.run}} after the call to {{ReplicationSourceWorkerThread.readAllEntriesToReplicateOrNextFile}}, which reads the next batch _and_ - crucially - enforces the replication batch size limit. So any single batch issued from within {{replicate}} cannot be larger than the overall batch size enforced (which defaults to 64MB). So I don't see how this causes a problem (but as usual, it is entirely possible that I missed a piece of the puzzle here) was (Author: lhofhansl): So looking at the code... In the original code I assume that the caller does the size enforcement. And indeed I see that happening in the code. {{HBaseInterClusterReplicationEndpoint.replicate}} is called from {{ReplicationSourceWorkerThread.shipEdits}}, which is called from {{ReplicationSourceWorkerThread.run}} after the call to {{ReplicationSourceWorkerThread.readAllEntriesToReplicateOrNextFile}}, which reads the next batch _and_ crucially enforces the replication batch size limit. So any single batch issued from within {{replicate}} can be larger than the overall batch size enforced (which defaults to 64MB). So I don't see how this causes a problem (but as usual, it is entirely possible that I missed a piece of the puzzle here)
[jira] [Commented] (HBASE-18027) HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
[ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009134#comment-16009134 ] Lars Hofhansl commented on HBASE-18027: --- So looking at the code... In the original code I assume that the caller does the size enforcement. And indeed I see that happening in the code. {{HBaseInterClusterReplicationEndpoint.replicate}} is called from {{ReplicationSourceWorkerThread.shipEdits}}, which is called from {{ReplicationSourceWorkerThread.run}} after the call to {{ReplicationSourceWorkerThread.readAllEntriesToReplicateOrNextFile}}, which reads the next batch _and_ crucially enforces the replication batch size limit. So any single batch issued from within {{replicate}} can be larger than the overall batch size enforced (which defaults to 64MB). So I don't see how this causes a problem (but as usual, it is entirely possible that I missed a piece of the puzzle here)
[jira] [Commented] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009130#comment-16009130 ] stack commented on HBASE-18044: --- +1 Try it [~appy] I'd say just go ahead and commit this kinda infra fix sir. Keep committing addendums till it's right (smile). > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates the way we > expect them to (as in java/c++, etc.) > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking the show/hide button for tests which fail in multiple urls > will sometimes not work. Fixing it.
[jira] [Updated] (HBASE-17959) Canary timeout should be configurable on a per-table basis
[ https://issues.apache.org/jira/browse/HBASE-17959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated HBASE-17959: - Attachment: HBASE-17959.patch Added support for configuring read/write timeouts on a per-table basis when in region mode. Added unit test for per-table timeout checks. > Canary timeout should be configurable on a per-table basis > -- > > Key: HBASE-17959 > URL: https://issues.apache.org/jira/browse/HBASE-17959 > Project: HBase > Issue Type: Improvement > Components: canary >Reporter: Andrew Purtell >Assignee: Chinmay Kulkarni >Priority: Minor > Attachments: HBASE-17959.patch > > > The Canary read and write timeouts should be configurable on a per-table > basis, for cases where different tables have different latency SLAs.
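The per-table lookup can be sketched as a fallback chain. The property names below are made up for illustration; they are not the actual Canary configuration keys.

```python
DEFAULT_READ_TIMEOUT_MS = 10_000  # hypothetical global default

def read_timeout_for(table, conf):
    """Resolve a read timeout for a table: prefer the per-table override,
    then the global setting, then a built-in default (illustrative keys)."""
    return conf.get(
        f"canary.table.{table}.read.timeout",
        conf.get("canary.read.timeout", DEFAULT_READ_TIMEOUT_MS),
    )
```

Tables with looser SLAs can then set a larger per-table value without relaxing the Canary's default for everything else.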
[jira] [Commented] (HBASE-18027) HBaseInterClusterReplicationEndpoint should respect RPC size limits when batching edits
[ https://issues.apache.org/jira/browse/HBASE-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009126#comment-16009126 ] Lars Hofhansl commented on HBASE-18027: --- Looking
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009118#comment-16009118 ] Hadoop QA commented on HBASE-18043: --- (x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 33m 21s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 2m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 57s | master passed |
| +1 | compile | 1m 44s | master passed |
| +1 | checkstyle | 1m 17s | master passed |
| +1 | mvneclipse | 0m 56s | master passed |
| +1 | findbugs | 5m 24s | master passed |
| +1 | javadoc | 1m 40s | master passed |
| 0 | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 58s | the patch passed |
| +1 | compile | 1m 44s | the patch passed |
| +1 | javac | 1m 44s | the patch passed |
| +1 | checkstyle | 1m 12s | the patch passed |
| +1 | mvneclipse | 0m 43s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | hadoopcheck | 57m 2s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 5m 36s | the patch passed |
| +1 | javadoc | 1m 21s | the patch passed |
| +1 | unit | 3m 9s | hbase-common in the patch passed. |
| -1 | unit | 164m 22s | hbase-server in the patch failed. |
| +1 | asflicense | 1m 20s | The patch does not generate ASF License warnings. |
| | | 295m 31s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestBlockEvictionFromClient |
| | hadoop.hbase.client.TestAsyncSnapshotAdminApi |
| | hadoop.hbase.client.TestAsyncProcedureAdminApi |
| | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
| | hadoop.hbase.client.TestAsyncRegionAdminApi |
| Timed out junit tests | org.apache.hadoop.hbase.util.TestHBaseFsckOneRS |
| | org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe |
| | org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase |
| | org.apache.hadoop.hbase.TestIOFencing |
| | org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867880/HBASE-18043.patch |
| JIRA Issue | HBASE-18043 |
| Optional Tests | asflicense javac javadoc unit findbugs
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009116#comment-16009116 ] Chia-Ping Tsai commented on HBASE-17887: bq. it might have been better to be contained in 2.x. Do you mean that we shouldn't commit it to branch-1.3? If yes, I am fine with a revert. But we should have a note on the mailing list to remind devs about this issue.
[jira] [Commented] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009106#comment-16009106 ] Hudson commented on HBASE-18014: SUCCESS: Integrated in Jenkins build HBase-1.3-JDK8 #177 (See [https://builds.apache.org/job/HBase-1.3-JDK8/177/]) HBASE-18014 A case of Region remain unassigned when table enabled (Allan (apurtell: rev 36ebe05fc9013fe27ba0eca410ed11e5c5b112cb) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0, 1.3.2 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table, say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while, RS1 will be deleted from processedServers (a HashMap in > {{RegionState}} to store processed dead servers) > 5. Enable the table, then the region of the table will remain unassigned > until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager will check > whether those regions are on servers which are dead but not processed. Since > RS1 has already been deleted from the 'processedServers' map, the > AssignmentManager thinks this region is on a dead but not yet processed server. So > it will skip the assignment and let the region be handled by SSH. 
> {code:java} > case OFFLINE: > if (useZKForAssignment > && regionStates.isServerDeadAndNotProcessed(sn) > && wasRegionOnDeadServerByMeta(region, sn)) { > if (!regionStates.isRegionInTransition(region)) { > LOG.info("Updating the state to " + State.OFFLINE + " to allow to > be reassigned by SSH"); > regionStates.updateRegionState(region, State.OFFLINE); > } > LOG.info("Skip assigning " + region.getRegionNameAsString() > + ", it is on a dead but not processed yet server: " + sn); > return null; > } > {code}
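The failure mode can be modeled with a tiny Python sketch (illustrative only, not the HBase implementation): if the server's entry is purged from the processed-servers map too early, the "dead but not processed" guard in the OFFLINE case above fires again on enable, and assignment is wrongly deferred to an SSH that will never run.

```python
def is_dead_and_not_processed(server, dead_servers, processed_servers):
    """Toy version of the isServerDeadAndNotProcessed() check: a server counts
    as 'dead but not processed' when it is known dead and has no entry in the
    processed-servers map."""
    return server in dead_servers and server not in processed_servers

def should_skip_assign(server, dead_servers, processed_servers, use_zk=True):
    """Mirror of the guard above: defer assignment to SSH only while the
    region's last server looks unprocessed (the wasRegionOnDeadServerByMeta
    part of the real check is elided)."""
    return use_zk and is_dead_and_not_processed(server, dead_servers, processed_servers)
```

The fix, conceptually, is to make sure an already-processed dead server is not mistaken for an unprocessed one at enable time.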
[jira] [Commented] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009105#comment-16009105 ] Hudson commented on HBASE-18014: SUCCESS: Integrated in Jenkins build HBase-1.3-JDK7 #163 (See [https://builds.apache.org/job/HBase-1.3-JDK7/163/]) HBASE-18014 A case of Region remain unassigned when table enabled (Allan (apurtell: rev 36ebe05fc9013fe27ba0eca410ed11e5c5b112cb) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0, 1.3.2 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table, say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while, RS1 will be deleted from processedServers(a HashMap in > {{RegionState}} to store processed dead servers) > 5. Enable the table, then the region of the table will remain unassigned > until master restarts. > Why? > When assigning regions after the table enabled, AssignmentManager will check > whether those regions are on servers which are dead but not processed, since > RS1 already have deleted from the map of 'processedServers'. Then the > AssignmentManager think this region is on a dead but not processed server. So > it will skip assign, let the region be handled by SSH. 
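The failure mode described above can be modeled with a standalone sketch. The class and method names below are illustrative only; they mirror the intent of {{isServerDeadAndNotProcessed}} and the OFFLINE-case guard, not the real AssignmentManager API:

```java
import java.util.HashSet;
import java.util.Set;

// Standalone model of the HBASE-18014 race (illustrative names, not HBase APIs).
class DeadServerTracker {
    final Set<String> deadServers = new HashSet<>();       // servers known to be dead
    final Set<String> processedServers = new HashSet<>();  // dead servers SSH has finished

    // Mirrors the intent of regionStates.isServerDeadAndNotProcessed(sn)
    boolean isServerDeadAndNotProcessed(String sn) {
        return deadServers.contains(sn) && !processedServers.contains(sn);
    }

    // The OFFLINE-case guard: true means "skip assign, let SSH handle it"
    static boolean skipAssign(DeadServerTracker t, String lastHost) {
        return t.isServerDeadAndNotProcessed(lastHost);
    }

    public static void main(String[] args) {
        DeadServerTracker t = new DeadServerTracker();
        t.deadServers.add("rs1");          // step 3: RS1 aborts
        t.processedServers.add("rs1");     // SSH completes; assignment would proceed
        System.out.println(skipAssign(t, "rs1"));   // false

        t.processedServers.remove("rs1");  // step 4: entry later dropped from the map
        System.out.println(skipAssign(t, "rs1"));   // true -> region stays unassigned
    }
}
```

Once the processedServers entry is evicted, the guard flips back to "skip assign" even though SSH already ran and will never come back for this region, which is exactly the stuck state the reporter describes.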
[jira] [Commented] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009085#comment-16009085 ] Hudson commented on HBASE-18014: SUCCESS: Integrated in Jenkins build HBase-1.4 #734 (See [https://builds.apache.org/job/HBase-1.4/734/]) HBASE-18014 A case of Region remain unassigned when table enabled (Allan (apurtell: rev 0a4528225c71cf515b69ab194779107d24de9852) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0, 1.3.2 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table; say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in {{RegionState}} that stores processed dead servers) > 5. Enable the table; the region of the table will then remain unassigned until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager checks whether those regions are on servers which are dead but not yet processed. Since RS1 has already been deleted from the 'processedServers' map, the AssignmentManager thinks this region is on a dead but not yet processed server, so it skips the assignment and lets the region be handled by SSH.
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009074#comment-16009074 ] Lars Hofhansl commented on HBASE-17887: --- Wow. Changes like HBASE-14970 and this one really do not belong in a minor release of HBase. This is scary stuff. I cannot convince myself from just looking at the code that these two do not introduce more subtle issues. Sorry for the whining. :) Of course this is great stuff. Just that in hindsight it might have been better to be contained in 2.x. [~apurtell], for your consideration w.r.t. our planned upgrade to 1.3.x. > Row-level consistency is broken for read > > > Key: HBASE-17887 > URL: https://issues.apache.org/jira/browse/HBASE-17887 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.3.0 >Reporter: Umesh Agashe >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-17887.branch-1.v0.patch, HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch, HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch, HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch, HBASE-17887.v5.patch, HBASE-17887.v5.patch > > > The scanner of the latest memstore may be lost if we make quick flushes. The following steps may help explain this issue.
> # put data_A (seq id = 10; active stores data_A and snapshot is empty) > # snapshot of 1st flush (active is empty and snapshot stores data_A) > # put data_B (seq id = 11; active stores data_B and snapshot stores data_A) > # create user scanner (read point = 11, so it should see data_B) > # commit of 1st flush > #* clear snapshot (hfile_A has data_A, active stores data_B, and snapshot is empty) > #* update the reader (the user scanner receives hfile_A) > # snapshot of 2nd flush (active is empty and snapshot stores data_B) > # commit of 2nd flush > #* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, and snapshot is empty) – this is the critical piece. > #* -update the reader- (hasn't happened) > # user scanner updates the kv scanners (it creates a scanner for hfile_A but nothing for the memstore) > # user sees the older data_A – wrong result -- This message was sent by Atlassian JIRA (v6.3.15#6346)
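The numbered steps can be reproduced with a standalone toy model (the classes and methods below are illustrative simplifications, not HBase APIs): a scanner that rebuilds its kv scanners after the second commit, while only knowing about hfile_A, finds an empty memstore and loses data_B.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the two-flush race in the description (illustrative names only).
class FlushRace {
    Map<String, Long> active = new HashMap<>();    // cell value -> seq id
    Map<String, Long> snapshot = new HashMap<>();
    final List<Map<String, Long>> hfiles = new ArrayList<>();

    void put(String value, long seqId) { active.put(value, seqId); }
    void snapshotForFlush() { snapshot = active; active = new HashMap<>(); }
    void commitFlush() { hfiles.add(snapshot); snapshot = new HashMap<>(); }

    // What a scanner sees when it rebuilds its kv scanners from the files it
    // knows about plus the live memstore (active + snapshot), at its read point.
    static Set<String> rebuildAndRead(List<Map<String, Long>> knownFiles,
                                      FlushRace store, long readPoint) {
        Set<String> seen = new TreeSet<>();
        List<Map<String, Long>> sources = new ArrayList<>(knownFiles);
        sources.add(store.active);
        sources.add(store.snapshot);
        for (Map<String, Long> src : sources)
            src.forEach((v, s) -> { if (s <= readPoint) seen.add(v); });
        return seen;
    }

    public static void main(String[] args) {
        FlushRace store = new FlushRace();
        store.put("data_A", 10);
        store.snapshotForFlush();            // 1st flush: snapshot holds data_A
        store.put("data_B", 11);
        long readPoint = 11;                 // user scanner opens; should see data_B
        store.commitFlush();                 // 1st commit -> hfile_A
        // "update the reader": the scanner learns about the files flushed so far
        List<Map<String, Long>> knownFiles = new ArrayList<>(store.hfiles);
        store.snapshotForFlush();            // 2nd flush: snapshot holds data_B
        store.commitFlush();                 // 2nd commit -> hfile_B; reader NOT updated
        // The scanner rebuilds from hfile_A plus the now-empty memstore:
        System.out.println(rebuildAndRead(knownFiles, store, readPoint)); // [data_A]
    }
}
```

The model makes the window visible: between the second commit's snapshot clear and the (never-performed) reader update, data_B exists only in hfile_B, which the scanner does not know about.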
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009048#comment-16009048 ] Alex Leblang commented on HBASE-18041: -- Well I just took the yetus pylintrc file so I can't be sure, but I know that linters can be pretty strict. We could start from another spot, but the things disabled weren't hand selected to make our current codebase pass. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the HBase community should add a pylintrc file to actively choose the project's python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009048#comment-16009048 ] Alex Leblang edited comment on HBASE-18041 at 5/13/17 2:13 AM: --- Well I just took the yetus pylintrc file, but I know that linters can be pretty strict. We could start from another spot. The things disabled weren't hand selected to make our current codebase pass. was (Author: awleblang): Well I just took the yetus pylintrc file so I can't be sure, but I know that linters can be pretty strict. We could start from another spot, but the things disabled weren't had selected to make our current codebase pass. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
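For readers unfamiliar with the format, a pylintrc mainly toggles checks and sets style limits project-wide. The options below are an illustrative sketch only — not the contents of the attached patch or of the Yetus file:

```ini
# Illustrative pylintrc sketch -- not the attached patch or the Yetus file.
[MESSAGES CONTROL]
# Checks switched off project-wide rather than per-file:
disable=missing-docstring,
        invalid-name

[FORMAT]
# Project-chosen line length instead of the pylint default of 79:
max-line-length=100
```

Starting from another project's rcfile (as the comment describes) and then pruning the disable list is a common way to adopt a linter without an initial flood of violations.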
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009045#comment-16009045 ] Hadoop QA commented on HBASE-18043:
---
(x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 31s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 35s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 28s | master passed |
| +1 | compile | 1m 50s | master passed |
| +1 | checkstyle | 1m 17s | master passed |
| +1 | mvneclipse | 0m 41s | master passed |
| +1 | findbugs | 5m 17s | master passed |
| +1 | javadoc | 1m 19s | master passed |
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| -1 | mvninstall | 1m 16s | hbase-server in the patch failed. |
| -1 | compile | 1m 14s | hbase-server in the patch failed. |
| -1 | javac | 1m 14s | hbase-server in the patch failed. |
| +1 | checkstyle | 1m 12s | the patch passed |
| +1 | mvneclipse | 0m 40s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| -1 | hadoopcheck | 2m 54s | The patch causes 17 errors with Hadoop v2.6.1. |
| -1 | hadoopcheck | 5m 34s | The patch causes 17 errors with Hadoop v2.6.2. |
| -1 | hadoopcheck | 8m 14s | The patch causes 17 errors with Hadoop v2.6.3. |
| -1 | hadoopcheck | 10m 50s | The patch causes 17 errors with Hadoop v2.6.4. |
| -1 | hadoopcheck | 13m 34s | The patch causes 17 errors with Hadoop v2.6.5. |
| -1 | hadoopcheck | 16m 13s | The patch causes 17 errors with Hadoop v2.7.1. |
| -1 | hadoopcheck | 18m 52s | The patch causes 17 errors with Hadoop v2.7.2. |
| -1 | hadoopcheck | 21m 31s | The patch causes 17 errors with Hadoop v2.7.3. |
| -1 | hadoopcheck | 24m 9s | The patch causes 17 errors with Hadoop v3.0.0-alpha2. |
| -1 | findbugs | 0m 46s | hbase-server in the patch failed. |
| +1 | javadoc | 1m 17s | the patch passed |
| +1 | unit | 3m 4s | hbase-common in the patch passed. |
| -1 | unit | 1m 12s | hbase-server in the patch failed. |
[jira] [Commented] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009036#comment-16009036 ] Appy commented on HBASE-18029: -- If we don't want to break coproc compat either, we can use StoreFile.Reader (instead of StoreFileReader) in cp interfaces too. That way, we might not have to typecast either. Not sure about binary compat, but it'll be source-code compat at the least. > Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > Attachments: HBASE-18029.branch-1.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
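Appy's suggestion can be sketched with simplified types (toy classes, not the real HBase code): keeping the old nested name {{StoreFile.Reader}} as a subclass of the new top-level {{StoreFileReader}} lets coprocessor interfaces keep declaring the old type, so existing coprocessor code compiles unchanged and no typecast is needed.

```java
// Simplified sketch of the source-compatibility idea (not real HBase classes).
class StoreFileReader {
    long length() { return 42L; }  // stand-in for the real reader API
}

class StoreFile {
    // Deprecated alias retained for coprocessor source compatibility.
    static class Reader extends StoreFileReader { }
}

class CompatDemo {
    // A coprocessor hook can keep declaring the old nested type...
    static long inspect(StoreFile.Reader reader) { return reader.length(); }

    public static void main(String[] args) {
        // ...and old call sites compile unchanged, with no typecast.
        System.out.println(inspect(new StoreFile.Reader()));
    }
}
```

As the comment notes, this preserves source compatibility; binary compatibility is a separate question, since the owning class of the type changes in the compiled signatures.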
[jira] [Commented] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009025#comment-16009025 ] Hudson commented on HBASE-18014: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #45 (See [https://builds.apache.org/job/HBase-1.3-IT/45/]) HBASE-18014 A case of Region remain unassigned when table enabled (Allan (apurtell: rev 36ebe05fc9013fe27ba0eca410ed11e5c5b112cb) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0, 1.3.2 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table; say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in {{RegionState}} that stores processed dead servers) > 5. Enable the table; the region of the table will then remain unassigned until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager checks whether those regions are on servers which are dead but not yet processed. Since RS1 has already been deleted from the 'processedServers' map, the AssignmentManager thinks this region is on a dead but not yet processed server, so it skips the assignment and lets the region be handled by SSH.
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Attachment: HBASE-18043.patch HBASE-18043-branch-1.patch Patches with an updated test case that checks that a cell near the limit still passes. Unit test still passes. > Institute a hard limit for individual cell size that cannot be overridden by clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043-branch-1.patch, HBASE-18043.patch, HBASE-18043.patch > > > For the sake of service protection we should not give absolute trust to clients regarding resource limits that can impact stability, like cell size limits. We should add a server-side configuration that sets a hard limit for individual cell size that cannot be overridden by the client. We can keep the client-side check, because it's expensive to reject an RPC that has already come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
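A minimal sketch of such a server-side hard check, including the near-the-limit case the updated test covers. The class name, method names, and the 10 KB limit below are illustrative assumptions, not the committed patch:

```java
// Illustrative sketch of a hard server-side cell-size limit (assumed names).
class CellSizeChecker {
    final long maxCellSize;

    CellSizeChecker(long maxCellSize) { this.maxCellSize = maxCellSize; }

    // Enforced on the server: a client-side setting cannot override it.
    void check(long cellLength) {
        if (cellLength > maxCellSize) {
            throw new IllegalArgumentException("Cell length " + cellLength
                + " exceeds hard server-side limit " + maxCellSize);
        }
    }

    public static void main(String[] args) {
        CellSizeChecker checker = new CellSizeChecker(10 * 1024); // assumed 10 KB limit
        checker.check(9 * 1024 + 512);    // a near-the-limit cell passes
        try {
            checker.check(10 * 1024 + 1); // one byte over is rejected
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

The near-limit case matters because the server estimates the serialized cell size; a check that is "completely off" would either reject valid near-limit cells or let oversized ones through.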
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009000#comment-16009000 ] Andrew Purtell commented on HBASE-18043: bq. Looks good. Should we add a test with Put size of 9K, or 9.5K, to make sure the size estimation is not completely off? Sure, ok > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18014: --- Fix Version/s: 1.3.2 > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0, 1.3.2 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table; say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in {{RegionState}} that stores processed dead servers) > 5. Enable the table; the region of the table will then remain unassigned until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager checks whether those regions are on servers which are dead but not yet processed. Since RS1 has already been deleted from the 'processedServers' map, the AssignmentManager thinks this region is on a dead but not yet processed server, so it skips the assignment and lets the region be handled by SSH. > {code:java} > case OFFLINE: > if (useZKForAssignment > && regionStates.isServerDeadAndNotProcessed(sn) > && wasRegionOnDeadServerByMeta(region, sn)) { > if (!regionStates.isRegionInTransition(region)) { > LOG.info("Updating the state to " + State.OFFLINE + " to allow to > be reassigned by SSH"); > regionStates.updateRegionState(region, State.OFFLINE); > } > LOG.info("Skip assigning " + region.getRegionNameAsString() > + ", it is on a dead but not processed yet server: " + sn); > return null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18014: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.0 Status: Resolved (was: Patch Available) > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Fix For: 1.4.0 > > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table; say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in {{RegionState}} that stores processed dead servers) > 5. Enable the table; the region of the table will then remain unassigned until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager checks whether those regions are on servers which are dead but not yet processed. Since RS1 has already been deleted from the 'processedServers' map, the AssignmentManager thinks this region is on a dead but not yet processed server, so it skips the assignment and lets the region be handled by SSH. > {code:java} > case OFFLINE: > if (useZKForAssignment > && regionStates.isServerDeadAndNotProcessed(sn) > && wasRegionOnDeadServerByMeta(region, sn)) { > if (!regionStates.isRegionInTransition(region)) { > LOG.info("Updating the state to " + State.OFFLINE + " to allow to > be reassigned by SSH"); > regionStates.updateRegionState(region, State.OFFLINE); > } > LOG.info("Skip assigning " + region.getRegionNameAsString() > + ", it is on a dead but not processed yet server: " + sn); > return null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18018) Support abort for all procedures by default
[ https://issues.apache.org/jira/browse/HBASE-18018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008978#comment-16008978 ] Hadoop QA commented on HBASE-18018:
---
(/) +1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 51s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 43s | master passed |
| +1 | compile | 0m 55s | master passed |
| +1 | checkstyle | 0m 23s | master passed |
| +1 | mvneclipse | 0m 25s | master passed |
| +1 | findbugs | 2m 25s | master passed |
| +1 | javadoc | 0m 42s | master passed |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 2s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| +1 | checkstyle | 0m 22s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 31m 50s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 43s | the patch passed |
| +1 | javadoc | 0m 43s | the patch passed |
| +1 | unit | 3m 0s | hbase-procedure in the patch passed. |
| +1 | unit | 122m 34s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 45s | The patch does not generate ASF License warnings. |
| | | 175m 1s | |
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867857/HBASE-18018.master.003.patch |
| JIRA Issue | HBASE-18018 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 578fd32699cd 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 305ffcb |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6774/testReport/ |
| modules | C: hbase-procedure hbase-server U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6774/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated. > Support abort for all procedures by
[jira] [Commented] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008967#comment-16008967 ] Hadoop QA commented on HBASE-18036:
---
(x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 39s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 11m 30s | branch-1.1 passed |
| +1 | compile | 1m 13s | branch-1.1 passed |
| +1 | checkstyle | 0m 48s | branch-1.1 passed |
| +1 | mvneclipse | 0m 32s | branch-1.1 passed |
| -1 | findbugs | 4m 3s | hbase-server in branch-1.1 has 80 extant Findbugs warnings. |
| -1 | javadoc | 1m 2s | hbase-server in branch-1.1 failed. |
| +1 | mvninstall | 1m 20s | the patch passed |
| +1 | compile | 1m 9s | the patch passed |
| +1 | javac | 1m 9s | the patch passed |
| +1 | checkstyle | 0m 41s | the patch passed |
| +1 | mvneclipse | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 25m 49s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 19s | the patch passed |
| +1 | findbugs | 3m 45s | the patch passed |
| -1 | javadoc | 0m 45s | hbase-server in the patch failed. |
| -1 | unit | 158m 27s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
| | | 213m 47s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.TestReplicationSmallTests |
| | hadoop.hbase.replication.TestReplicationEndpoint |
| | hadoop.hbase.client.TestMultiParallel |
| | hadoop.hbase.master.TestAssignmentManager |
| Timed out junit tests | org.apache.hadoop.hbase.mapreduce.TestRowCounter |
| | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:de9b245 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867851/HBASE-18036.v1-branch-1.1.patch |
| JIRA Issue | HBASE-18036 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 101aa0c262d2 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1.1 / 7d820db |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/6773/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/6773/artifact/patchprocess/branch-javadoc-hbase-server.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/6773/artifact/patchprocess/patch-javadoc-hbase-server.txt |
| unit |
[jira] [Commented] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008944#comment-16008944 ] Hadoop QA commented on HBASE-18044: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 38m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 25m 23s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 18s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867876/HBASE-18044.master.001.patch | | JIRA Issue | HBASE-18044 | | Optional Tests | asflicense | | uname | Linux 929dcace3764 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 305ffcb | | modules | C: . U: . 
| | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6777/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008936#comment-16008936 ] Lars Hofhansl commented on HBASE-18043: --- Looks good. Should we add a test with Put size of 9K, or 9.5K, to make sure the size estimation is not completely off? > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
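The server-side check proposed in HBASE-18043 above can be sketched as follows. This is an illustrative Python model (HBase itself is Java); the names and the 10 MB values are assumptions for the sketch, not the actual configuration keys or defaults the patch introduces:

```python
# Sketch: the client may request its own cell-size limit, but the server's
# hard cap always wins -- the effective limit is the smaller of the two.

HARD_MAX_CELL_SIZE = 10 * 1024 * 1024  # hypothetical server-side cap (bytes)

def effective_cell_limit(client_limit, hard_limit=HARD_MAX_CELL_SIZE):
    """Limit actually enforced: never larger than the server's hard cap."""
    return min(client_limit, hard_limit)

def check_cell(cell_bytes, client_limit, hard_limit=HARD_MAX_CELL_SIZE):
    """Reject a cell that exceeds the effective limit."""
    limit = effective_cell_limit(client_limit, hard_limit)
    if len(cell_bytes) > limit:
        raise ValueError(
            "Cell size %d exceeds limit %d" % (len(cell_bytes), limit))
```

Keeping the cheaper client-side check in addition to this, as the description notes, avoids paying for an RPC that will only be rejected after it has already arrived.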
[jira] [Created] (HBASE-18045) Add ' -o ConnectTimeout=10' to the ssh command we use in ITBLL chaos monkeys
stack created HBASE-18045: - Summary: Add ' -o ConnectTimeout=10' to the ssh command we use in ITBLL chaos monkeys Key: HBASE-18045 URL: https://issues.apache.org/jira/browse/HBASE-18045 Project: HBase Issue Type: Improvement Components: integration tests Reporter: stack Priority: Trivial Monkeys hang on me in long running tests. I've not spent too much time on it since it's rare enough, but I just went through a spate of them. When a monkey's kill ssh hangs, all killing stops, which can give a false sense of victory when you wake up in the morning and your job 'passed'. I also see monkeys kill all servers in a cluster and fail to bring them back, which causes the job to fail as no one is serving data. The latter may actually be another issue, but for the former, I've had some success adding -o ConnectTimeout=10 as an option on ssh. You can do it easily enough via config, but this issue is to suggest that we add it in code. Here is how you add it via config if interested:
{code:xml}
<property>
  <name>hbase.it.clustermanager.ssh.opts</name>
  <value>-o ConnectTimeout=10</value>
</property>
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18042: Affects Version/s: 1.3.1 > Client Compatibility breaks between versions 1.2 and 1.3 > > > Key: HBASE-18042 > URL: https://issues.apache.org/jira/browse/HBASE-18042 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1 >Reporter: Karan Mehta >Assignee: Karan Mehta > > OpenTSDB uses AsyncHBase as its client, rather than using the traditional > HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been > changed. Newer fields were added to the {{ScanResponse}} proto. > A typical Scan request in 1.2 requires the caller to make an > OpenScanner Request, GetNextRows Requests and a CloseScanner Request, based on > the {{more_rows}} boolean field in the {{ScanResponse}} proto. > However, in 1.3 a new parameter {{more_results_in_region}} was added, which > limits the results per region. Therefore the client now has to manage sending > all the requests for each region. Furthermore, if the results are exhausted > for a particular region, the {{ScanResponse}} will set > {{more_results_in_region}} to false, but {{more_results}} can still be true. > Whenever the former is set to false, the {{RegionScanner}} will also be > closed. > OpenTSDB makes an OpenScanner Request and receives all its results in the > first {{ScanResponse}} itself, thus creating the condition described in > the paragraph above. Since {{more_rows}} is true, it will proceed to send the next > request, at which point {{RSRpcServices}} will throw an > {{UnknownScannerException}}. The protobuf client compatibility is maintained, > but the expected behavior is modified. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
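The interplay of the two flags described above can be reduced to a small decision table. The following is an illustrative Python sketch (the actual clients are Java/AsyncHBase); the helper name and return values are made up for the sketch, but the flag semantics follow the description:

```python
# Sketch of how a 1.3-aware client must interpret the two ScanResponse flags.
# Field names follow the proto fields named in the issue description.

def next_action(more_results, more_results_in_region):
    """Decide what the client should do after receiving a ScanResponse."""
    if more_results_in_region:
        # More rows available in the current region: keep calling next()
        # on the same scanner.
        return "CONTINUE_SAME_SCANNER"
    if more_results:
        # Region exhausted: the server has already closed this RegionScanner,
        # so the client must open a new scanner on the next region. Sending
        # another next() on the old scanner id is what triggers
        # UnknownScannerException for a 1.2-style client.
        return "OPEN_SCANNER_NEXT_REGION"
    return "SCAN_DONE"
```

A 1.2-style client effectively ignores `more_results_in_region` and keeps reusing the old scanner, which is the compatibility break described above.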
[jira] [Commented] (HBASE-18014) A case of Region remain unassigned when table enabled
[ https://issues.apache.org/jira/browse/HBASE-18014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008903#comment-16008903 ] Andrew Purtell commented on HBASE-18014: lgtm Let me do some local checks and then commit > A case of Region remain unassigned when table enabled > - > > Key: HBASE-18014 > URL: https://issues.apache.org/jira/browse/HBASE-18014 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Attachments: HBASE-18014-branch-1.patch, HBASE-18014-branch-1.v2.patch > > > Reproduce procedure: > 1. Create a table; say the regions of this table are opened on RS1 > 2. Disable this table > 3. Abort RS1 and wait for SSH to complete > 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in > {{RegionState}} to store processed dead servers) > 5. Enable the table; the region of the table will then remain unassigned > until the master restarts. > Why? > When assigning regions after the table is enabled, the AssignmentManager checks > whether those regions are on servers which are dead but not processed. Since RS1 > has already been deleted from the map of 'processedServers', the > AssignmentManager thinks this region is on a dead but not-yet-processed server, so > it skips the assignment and leaves the region to be handled by SSH. > {code:java} > case OFFLINE: > if (useZKForAssignment > && regionStates.isServerDeadAndNotProcessed(sn) > && wasRegionOnDeadServerByMeta(region, sn)) { > if (!regionStates.isRegionInTransition(region)) { > LOG.info("Updating the state to " + State.OFFLINE + " to allow to > be reassigned by SSH"); > regionStates.updateRegionState(region, State.OFFLINE); > } > LOG.info("Skip assigning " + region.getRegionNameAsString() > + ", it is on a dead but not processed yet server: " + sn); > return null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
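The reproduce procedure above boils down to a stale-state check. The following Python model is purely illustrative (the real code is the Java AssignmentManager/RegionStates shown in the quote); all names here are invented for the sketch:

```python
# Model of the race: SSH finishes for RS1, the entry is later evicted from
# processedServers, and the enable-table assignment then wrongly treats RS1
# as "dead but not processed" and skips assignment indefinitely.

class RegionStatesModel:
    def __init__(self):
        self.dead_servers = set()
        self.processed_servers = set()

    def server_died(self, sn):
        self.dead_servers.add(sn)

    def server_processed(self, sn):
        self.processed_servers.add(sn)

    def expire_processed(self, sn):
        # step 4 of the reproduce procedure: entry aged out of the map
        self.processed_servers.discard(sn)

    def is_server_dead_and_not_processed(self, sn):
        return sn in self.dead_servers and sn not in self.processed_servers

def should_skip_assign(states, last_server):
    """Mirrors the OFFLINE-case guard: skip assign, defer to SSH."""
    return states.is_server_dead_and_not_processed(last_server)

states = RegionStatesModel()
states.server_died("RS1")        # step 3: RS1 aborted
states.server_processed("RS1")   # SSH completed
states.expire_processed("RS1")   # step 4: RS1 evicted after a while
# step 5: enabling the table now skips assignment, but SSH already ran,
# so nothing ever assigns the region until a master restart.
assert should_skip_assign(states, "RS1")
```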
[jira] [Commented] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008899#comment-16008899 ] Hadoop QA commented on HBASE-18044: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 10s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 29m 10s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867876/HBASE-18044.master.001.patch | | JIRA Issue | HBASE-18044 | | Optional Tests | asflicense | | uname | Linux aa30c5a1e081 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 305ffcb | | modules | C: . U: . 
| | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6775/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008896#comment-16008896 ] Andrew Purtell edited comment on HBASE-18029 at 5/12/17 11:21 PM: -- How can we do this refactor without breaking coprocessors on branch-1 ? I guess by this: {quote} - Create dummy classes. For eg. class Reader extends StoreFileReader - Return StoreFile.Reader from fns in StoreFile (keeps compat) {quote} Is this binary compatible? Not saying it's an absolute necessity. was (Author: apurtell): How can we do this refactor without breaking coprocessors on branch-1 ? > Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > Attachments: HBASE-18029.branch-1.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008896#comment-16008896 ] Andrew Purtell commented on HBASE-18029: How can we do this refactor without breaking coprocessors on branch-1 ? > Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > Attachments: HBASE-18029.branch-1.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008893#comment-16008893 ] Andrew Purtell commented on HBASE-18043: [~lhofhansl] > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Status: Patch Available (was: Open) > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18042: --- Assignee: Karan Mehta > Client Compatibility breaks between versions 1.2 and 1.3 > > > Key: HBASE-18042 > URL: https://issues.apache.org/jira/browse/HBASE-18042 > Project: HBase > Issue Type: Bug >Reporter: Karan Mehta >Assignee: Karan Mehta > > OpenTSDB uses AsyncHBase as its client, rather than using the traditional > HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been > changed. Newer fields were added to the {{ScanResponse}} proto. > A typical Scan request in 1.2 requires the caller to make an > OpenScanner Request, GetNextRows Requests and a CloseScanner Request, based on > the {{more_rows}} boolean field in the {{ScanResponse}} proto. > However, in 1.3 a new parameter {{more_results_in_region}} was added, which > limits the results per region. Therefore the client now has to manage sending > all the requests for each region. Furthermore, if the results are exhausted > for a particular region, the {{ScanResponse}} will set > {{more_results_in_region}} to false, but {{more_results}} can still be true. > Whenever the former is set to false, the {{RegionScanner}} will also be > closed. > OpenTSDB makes an OpenScanner Request and receives all its results in the > first {{ScanResponse}} itself, thus creating the condition described in > the paragraph above. Since {{more_rows}} is true, it will proceed to send the next > request, at which point {{RSRpcServices}} will throw an > {{UnknownScannerException}}. The protobuf client compatibility is maintained, > but the expected behavior is modified. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-15930) Make IntegrationTestReplication's waitForReplication() smarter
[ https://issues.apache.org/jira/browse/HBASE-15930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak reassigned HBASE-15930: --- Assignee: Mike Drob (was: Dima Spivak) Please go ahead. Sorry, I've fallen off the face of the HBase planet. :) > Make IntegrationTestReplication's waitForReplication() smarter > -- > > Key: HBASE-15930 > URL: https://issues.apache.org/jira/browse/HBASE-15930 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Dima Spivak >Assignee: Mike Drob > Fix For: 2.0.0 > > > {{IntegrationTestReplication}} is a great test, but can be improved by changing > how we handle waiting between generation of the linked list on the source > cluster and verifying the linked list on the destination cluster. [Even the > code suggests this should be > done|https://github.com/apache/hbase/blob/master/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestReplication.java#L251-252], > so I'd like to take it on. [~mbertozzi] and [~busbey] have both suggested a > simple solution wherein we write a row into each region on the source cluster > after the linked list generation and then assume replication has gone through > once these rows are detected on the destination cluster. > Since you lads at Facebook are some of the heaviest users, [~eclark], would > you prefer I maintain the API and add a new command line option (say {{\-c | > \-\-check-replication}}) that would run before any {{--generateVerifyGap}} > sleep is carried out as it is now? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Attachment: HBASE-18043.patch > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator
[ https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008862#comment-16008862 ] Enis Soztutar commented on HBASE-17707: --- [~kahliloppenheimer] any update? > New More Accurate Table Skew cost function/generator > > > Key: HBASE-17707 > URL: https://issues.apache.org/jira/browse/HBASE-17707 > Project: HBase > Issue Type: New Feature > Components: Balancer >Affects Versions: 1.2.0 > Environment: CentOS Derivative with a derivative of the 3.18.43 > kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches. >Reporter: Kahlil Oppenheimer >Assignee: Kahlil Oppenheimer >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, > HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, > HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, > HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, > HBASE-17707-11.patch, HBASE-17707-12.patch, test-balancer2-13617.out > > > This patch includes a new version of the TableSkewCostFunction and a new > TableSkewCandidateGenerator. > The new TableSkewCostFunction computes table skew by counting the minimal > number of region moves required for a given table to be perfectly balanced > across the cluster (i.e. as if the regions from that table had been > round-robin-ed across the cluster). This number of moves is computed for each > table, then normalized to a score between 0 and 1 by dividing by the number of > moves required in the absolute worst case (i.e. the entire table is stored on > one server), and stored in an array. The cost function then takes a weighted > average of the average and maximum value across all tables. The weights in > this average are configurable, allowing certain users to more strongly > penalize situations where one table is heavily skewed versus where every table is a > little bit skewed. 
To better spread this value more evenly across the range > 0-1, we take the square root of the weighted average to get the final value. > The new TableSkewCandidateGenerator generates region moves/swaps to optimize > the above TableSkewCostFunction. It first simply tries to move regions until > each server has the right number of regions, then it swaps regions around > such that each region swap improves table skew across the cluster. > We tested the cost function and generator in our production clusters with > 100s of TBs of data and 100s of tables across dozens of servers and found > both to be very performant and accurate. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
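The cost computation described above can be sketched in a few lines. This is a Python illustration of the math only (the real implementation is Java inside the StochasticLoadBalancer); the equal 0.5/0.5 weights are an assumption, since the description only says they are configurable:

```python
import math

def table_moves(counts_per_server, num_servers):
    """Minimal region moves to round-robin one table across num_servers.

    counts_per_server: regions of this table currently held by each server.
    """
    total = sum(counts_per_server)
    floor_share = total // num_servers
    remainder = total % num_servers
    # In a round-robin spread, servers hold floor_share or floor_share + 1
    # regions. Granting the +1 allowance to the fullest servers minimizes
    # moves; every region above a server's cap must move.
    moves = 0
    for rank, c in enumerate(sorted(counts_per_server, reverse=True)):
        cap = floor_share + (1 if rank < remainder else 0)
        if c > cap:
            moves += c - cap
    return moves

def table_skew_cost(tables, num_servers, w_avg=0.5, w_max=0.5):
    """Weighted avg of mean and max per-table scores, then sqrt to spread 0-1."""
    scores = []
    for counts in tables:
        total = sum(counts)
        # Worst case: whole table on one server.
        worst = total - math.ceil(total / num_servers)
        scores.append(table_moves(counts, num_servers) / worst if worst else 0.0)
    avg, mx = sum(scores) / len(scores), max(scores)
    return math.sqrt((w_avg * avg + w_max * mx) / (w_avg + w_max))
```

For example, a table spread as [2, 2, 2] over three servers costs 0.0, while [6, 0, 0] costs 1.0.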
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Attachment: (was: HBASE-18043.patch) > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17786) Create LoadBalancer perf-tests (test balancer algorithm decoupled from workload)
[ https://issues.apache.org/jira/browse/HBASE-17786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008866#comment-16008866 ] Umesh Agashe commented on HBASE-17786: -- Thanks [~busbey] for reviewing and committing the changes. Thanks [~stack] for reviewing the changes. > Create LoadBalancer perf-tests (test balancer algorithm decoupled from > workload) > > > Key: HBASE-17786 > URL: https://issues.apache.org/jira/browse/HBASE-17786 > Project: HBase > Issue Type: Sub-task > Components: Balancer, proc-v2 >Reporter: stack >Assignee: Umesh Agashe > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-17786.001.patch, HBASE-17786.002.patch, > HBASE-17786.002.patch > > > (Below is a quote from [~mbertozzi] taken from an internal issue that I'm > moving out here) > Add perf tools and keep monitored balancer performance (a BalancerPE-type > thing). > Most of the balancers should be instantiable without requiring a > mini-cluster, and it is easy to create tons of RegionInfo and ServerNames with a > for loop. > The balancer is just creating a map RegionInfo:ServerName. > There are two methods to test, roundRobinAssignment() and retainAssignment(): > {code} > Map<ServerName, List<HRegionInfo>> roundRobinAssignment( > List<HRegionInfo> regions, > List<ServerName> servers > ) throws HBaseIOException; > Map<ServerName, List<HRegionInfo>> retainAssignment( > Map<HRegionInfo, ServerName> regions, > List<ServerName> servers > ) throws HBaseIOException; > {code} > There are a bunch of obvious optimizations that everyone can see just by > looking at the code (like replacing array with set when we do > contains/remove operations). It will be nice to have a baseline and start > improving from there. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
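The decoupled micro-benchmark idea above fits in a few lines. This Python sketch is only a stand-in (HBase's balancers are Java, and this round-robin is not HBase's implementation): synthesize region and server names with a for loop, then time the assignment call directly, no mini-cluster required:

```python
import time

def round_robin_assignment(regions, servers):
    """Toy balancer: deal regions out to servers like a deck of cards."""
    assignment = {s: [] for s in servers}
    for i, r in enumerate(regions):
        assignment[servers[i % len(servers)]].append(r)
    return assignment

# Tons of synthetic inputs, as the quote suggests -- no cluster needed.
regions = ["region-%05d" % i for i in range(10000)]
servers = ["server-%03d" % i for i in range(50)]

start = time.monotonic()
plan = round_robin_assignment(regions, servers)
elapsed = time.monotonic() - start  # the baseline number to track over time
```

Recording `elapsed` across code changes gives exactly the baseline the quote asks for, and swapping in a real balancer implementation is a one-line change.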
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008860#comment-16008860 ] Dima Spivak commented on HBASE-18041: - Are we disabling a bunch of Pylint checks because the current code sucks? That seems like a bad idea. We should let it raise all kinds of errors and then open separate JIRAs to address them, I'd think. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Work started] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-18044 started by Appy. > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Work stopped] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-18044 stopped by Appy. > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Attachment: HBASE-18043.patch > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch, HBASE-18043.patch > > > For sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client side check, because it's expensive to reject a RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-18044: - Status: Patch Available (was: Open) > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18044) Bug fix in flaky-dashboard-template
[ https://issues.apache.org/jira/browse/HBASE-18044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-18044: - Attachment: HBASE-18044.master.001.patch > Bug fix in flaky-dashboard-template > --- > > Key: HBASE-18044 > URL: https://issues.apache.org/jira/browse/HBASE-18044 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-18044.master.001.patch > > > Due to some scoping issues, counters don't work in jinja templates like we > expect them to (in java/c++,etc). > (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). > Due to this, clicking show/hide button for tests which fail in multiple urls > will not work sometime. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Leblang updated HBASE-18041: - Status: Patch Available (was: In Progress) > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18044) Bug fix in flaky-dashboard-template
Appy created HBASE-18044: Summary: Bug fix in flaky-dashboard-template Key: HBASE-18044 URL: https://issues.apache.org/jira/browse/HBASE-18044 Project: HBase Issue Type: Bug Reporter: Appy Assignee: Appy Priority: Minor Due to some scoping issues, counters don't work in Jinja templates like we expect them to (in Java/C++, etc.). (http://stackoverflow.com/questions/7537439/how-to-increment-a-variable-on-a-for-loop-in-jinja-template). Due to this, clicking the show/hide button for tests which fail in multiple URLs will sometimes not work. Fixing it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Work started] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-18041 started by Alex Leblang. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Leblang updated HBASE-18041: - Attachment: HBASE-18041.branch-1.2.001.patch > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > Attachments: HBASE-18041.branch-1.2.001.patch > > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
[ https://issues.apache.org/jira/browse/HBASE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18043: --- Attachment: HBASE-18043-branch-1.patch > Institute a hard limit for individual cell size that cannot be overridden by > clients > > > Key: HBASE-18043 > URL: https://issues.apache.org/jira/browse/HBASE-18043 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, regionserver >Affects Versions: 2.0.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18043-branch-1.patch > > > For the sake of service protection we should not give absolute trust to clients > regarding resource limits that can impact stability, like cell size limits. > We should add a server-side configuration that sets a hard limit for > individual cell size that cannot be overridden by the client. We can keep the > client-side check, because it's expensive to reject an RPC that has already > come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008848#comment-16008848 ] Appy edited comment on HBASE-18029 at 5/12/17 10:46 PM: I haven't thought this completely through, but throwing it out for discussion: - Refactor out current nested classes to StoreFileReader / StoreFileWriter. - Create dummy classes, e.g. {{class Reader extends StoreFileReader}} - Return StoreFile.Reader from fns in StoreFile (keeps compat) - return StoreFileReader from CP functions - caveat being doing explicit casting in non-public parts of code. In this case, {{private Reader open(...)}}. But if we do so, we should explicitly comment dummy classes saying what they are and why. (uploading a quick patch - 001) +1 for interfaces, but that's orthogonal to the refactoring change, right? (unless I am missing something) [~Apache9]. Took me time to get back since I was on leave for the majority of the week. was (Author: appy): I haven't thought this completely through, but throwing it out for discussion: - Refactor out current nested classes to StoreFileReader / StoreFileWriter. - Create dummy classes, e.g. {{class Reader extends StoreFileReader}} - Return StoreFile.Reader from fns in StoreFile (keeps compat) - return StoreFileReader from CP functions - caveat being doing explicit casting in non-public parts of code. In this case, {{private Reader open(...)}}. But if we do so, we should explicitly comment dummy classes saying what they are and why. (uploading a quick patch) +1 for interfaces, but that's orthogonal, right? (unless I am missing something) [~Apache9]. Took me time to get back since I was on leave for the majority of the week. 
> Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > Attachments: HBASE-18029.branch-1.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18043) Institute a hard limit for individual cell size that cannot be overridden by clients
Andrew Purtell created HBASE-18043: -- Summary: Institute a hard limit for individual cell size that cannot be overridden by clients Key: HBASE-18043 URL: https://issues.apache.org/jira/browse/HBASE-18043 Project: HBase Issue Type: Improvement Components: IPC/RPC, regionserver Affects Versions: 2.0.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.4.0 For the sake of service protection we should not give absolute trust to clients regarding resource limits that can impact stability, like cell size limits. We should add a server-side configuration that sets a hard limit for individual cell size that cannot be overridden by the client. We can keep the client-side check, because it's expensive to reject an RPC that has already come in. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-18029: - Attachment: HBASE-18029.branch-1.001.patch > Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > Attachments: HBASE-18029.branch-1.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18029) Backport HBASE-15296 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-18029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008848#comment-16008848 ] Appy commented on HBASE-18029: -- I haven't thought this completely through, but throwing it out for discussion: - Refactor out current nested classes to StoreFileReader / StoreFileWriter. - Create dummy classes, e.g. {{class Reader extends StoreFileReader}} - Return StoreFile.Reader from fns in StoreFile (keeps compat) - return StoreFileReader from CP functions - caveat being doing explicit casting in non-public parts of code. In this case, {{private Reader open(...)}}. But if we do so, we should explicitly comment dummy classes saying what they are and why. (uploading a quick patch) +1 for interfaces, but that's orthogonal, right? (unless I am missing something) [~Apache9]. Took me time to get back since I was on leave for the majority of the week. > Backport HBASE-15296 to branch-1 > > > Key: HBASE-18029 > URL: https://issues.apache.org/jira/browse/HBASE-18029 > Project: HBase > Issue Type: Improvement > Components: regionserver >Reporter: Duo Zhang > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-15930) Make IntegrationTestReplication's waitForReplication() smarter
[ https://issues.apache.org/jira/browse/HBASE-15930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008827#comment-16008827 ] Mike Drob commented on HBASE-15930: --- Hey [~dimaspivak] - I recently ran into issues with this test and was looking at making some improvements. Do you mind if I take this on while I'm in the area? > Make IntegrationTestReplication's waitForReplication() smarter > -- > > Key: HBASE-15930 > URL: https://issues.apache.org/jira/browse/HBASE-15930 > Project: HBase > Issue Type: Improvement > Components: integration tests >Reporter: Dima Spivak >Assignee: Dima Spivak > Fix For: 2.0.0 > > > {{IntegrationTestReplication}} is a great test, but can be improved by changing > how we handle waiting between generation of the linked list on the source > cluster and verifying the linked list on the destination cluster. [Even the > code suggests this should be > done|https://github.com/apache/hbase/blob/master/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestReplication.java#L251-252], > so I'd like to take it on. [~mbertozzi] and [~busbey] have both suggested a > simple solution wherein we write a row into each region on the source cluster > after the linked list generation and then assume replication has gone through > once these rows are detected on the destination cluster. > Since you lads at Facebook are some of the heaviest users, [~eclark], would > you prefer I maintain the API and add a new command line option (say {{\-c | > \-\-check-replication}}) that would run before any {{--generateVerifyGap}} > sleep is carried out as it is now? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
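The suggested approach above — write a sentinel row into each region after list generation, then poll the destination until every sentinel appears — can be sketched against toy in-memory "clusters". The helper names and dict-based clusters are stand-ins for illustration; the real test would go through the HBase client API:

```python
import time

def check_replication(source, destination, region_keys, timeout=60.0, interval=0.1):
    """Write one sentinel row per region on the source, then poll the
    destination until every sentinel has been replicated (or time out)."""
    sentinels = {key: b"__replication_sentinel__" for key in region_keys}
    source.update(sentinels)                      # stand-in for Puts, one per region
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(destination.get(k) == v for k, v in sentinels.items()):
            return True                           # replication has caught up
        time.sleep(interval)
    return False

# Toy clusters: a dict per cluster; "replication" is a manual copy.
src, dst = {}, {}
region_keys = ["region-%02d" % i for i in range(3)]
ok_before = check_replication(src, dst, region_keys, timeout=0.2)
dst.update(src)                                   # simulate replication completing
ok_after = check_replication(src, dst, region_keys, timeout=0.2)
print(ok_before, ok_after)                        # → False True
```

This removes the need for a fixed `--generateVerifyGap` sleep: the wait ends as soon as the last region's sentinel is visible on the destination.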
[jira] [Commented] (HBASE-18018) Support abort for all procedures by default
[ https://issues.apache.org/jira/browse/HBASE-18018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008736#comment-16008736 ] Hadoop QA commented on HBASE-18018: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 56s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 33s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 48s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 58s {color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 113m 44s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 161m 46s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867827/HBASE-18018.master.002.patch | | JIRA Issue | HBASE-18018 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux a28158f9d67e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 305ffcb | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/6772/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6772/testReport/ | | modules | C: hbase-procedure hbase-server U: . | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6772/console | | Powered by | Apache Yetus 0.3.0
[jira] [Updated] (HBASE-18018) Support abort for all procedures by default
[ https://issues.apache.org/jira/browse/HBASE-18018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-18018: - Attachment: HBASE-18018.master.003.patch Retained the current behavior for DeleteTableProcedure and added a TODO. > Support abort for all procedures by default > --- > > Key: HBASE-18018 > URL: https://issues.apache.org/jira/browse/HBASE-18018 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Fix For: 2.0.0 > > Attachments: HBASE-18018.001.patch, HBASE-18018.master.001.patch, > HBASE-18018.master.002.patch, HBASE-18018.master.003.patch > > > Changes the default behavior of StateMachineProcedure to support aborting all > procedures even if rollback is not supported. On abort, the procedure is treated > as failed and rollback is called, but for procedures which cannot be rolled > back, abort is currently ignored. This sometimes causes a procedure to get stuck > in a waiting state forever. The user should have an option to abort any stuck > procedure and clean up manually. Please refer to HBASE-18016 and the discussion > there. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
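The behavioral change described above — abort takes effect for every procedure, with non-rollbackable procedures failing instead of silently ignoring the abort — can be modeled with a toy class. The names below are illustrative, not the real proc-v2 API:

```python
class Procedure:
    """Toy model of the abort semantics described in HBASE-18018."""
    def __init__(self, supports_rollback=True):
        self.supports_rollback = supports_rollback
        self.state = "RUNNABLE"
        self.aborted = False

    def abort(self):
        self.aborted = True

    def step(self):
        # Old behavior: procedures without rollback ignored abort and could
        # wait forever. New default: treat abort as failure in either case.
        if self.aborted:
            self.state = "ROLLEDBACK" if self.supports_rollback else "FAILED"
            return
        self.state = "WAITING"   # e.g. blocked on a stuck resource

p = Procedure(supports_rollback=False)
p.abort()
p.step()
print(p.state)   # → FAILED (instead of being stuck in WAITING)
```

The point of the change is the second branch: a procedure that cannot roll back still leaves the executor in a terminal state so the operator can clean up manually.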
[jira] [Comment Edited] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008684#comment-16008684 ] Stephen Yuan Jiang edited comment on HBASE-18036 at 5/12/17 9:01 PM: - The V1 patch has a minor change based on [~elserj]'s feedback. Also added some logging to make the change clear. The V1 change was tested in a small cluster. I used Ambari to restart the cluster and saw the new code path get hit, with regions assigned back to their original region servers and locality preserved. Next up: I will use the same logic in branch-1 and other child branches. Based on [~devaraj]'s offline feedback, I will remove the newly introduced "hbase.master.retain.assignment" config in branch-1; but keep the config in other branches (this config is just in case of regression, so the user has a way to revert back to the original round-robin behavior, as patch releases usually don't have full testing) was (Author: syuanjiang): The V1 patch has a minor change based on [~elserj]'s feedback. Also added some logging to make the change clear. Next up: I will use the same logic in branch-1 and other child branches. Based on [~devaraj]'s offline feedback, I will remove the newly introduced "hbase.master.retain.assignment" config in branch-1; but keep the config in other branches (this config is just in case of regression, so the user has a way to revert back to the original round-robin behavior, as patch releases usually don't have full testing) > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. 
However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH > and SSH uses roundRobinAssignment() in LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
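The difference between the two balancer calls quoted above can be illustrated with a toy model: retainAssignment puts each region back on its last-known host when that host is still live, while roundRobinAssignment ignores history entirely. This is a simplified sketch, not the LoadBalancer API:

```python
from itertools import cycle

def retain_assignment(last_hosts, live_servers):
    """Put each region back on its previous server when it is still live
    (roughly what the clean-startup path does); round-robin the rest."""
    rr = cycle(live_servers)
    return {region: (host if host in live_servers else next(rr))
            for region, host in last_hosts.items()}

def round_robin_assignment(regions, live_servers):
    """What the SSH path effectively does: ignore assignment history."""
    rr = cycle(live_servers)
    return {region: next(rr) for region in regions}

last = {"r1": "serverA", "r2": "serverB", "r3": "serverA"}
live = ["serverB", "serverA"]
print(retain_assignment(last, live))       # locality preserved: same as `last`
print(round_robin_assignment(last, live))  # locality lost for some regions
```

Because every server restarts in the clean-startup case, retain-style assignment keeps regions on the hosts that already hold their HFile blocks, which is exactly the locality the failover/SSH path throws away.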
[jira] [Commented] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008684#comment-16008684 ] Stephen Yuan Jiang commented on HBASE-18036: The V1 patch has a minor change based on [~elserj]'s feedback. Also added some logging to make the change clear. Next up: I will use the same logic in branch-1 and other child branches. Based on [~devaraj]'s offline feedback, I will remove the newly introduced "hbase.master.retain.assignment" config in branch-1; but keep the config in other branches (this config is just in case of regression, so the user has a way to revert back to the original round-robin behavior, as patch releases usually don't have full testing) > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. 
> // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH > and SSH uses roundRobinAssignment() in LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008681#comment-16008681 ] Stephen Yuan Jiang commented on HBASE-18036: [~stack], thanks for the review. For master, I am not going to make any change, as the proc-v2 change would overwrite it anyway. I plan to make the same change in ServerCrashProcedure in branch-1 and other child branches. > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH > and SSH uses roundRobinAssignment() in LoadBalancer. 
That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-18036: --- Status: Patch Available (was: Open) > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.1.10, 1.2.5, 1.3.1, 1.4.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH > and SSH uses roundRobinAssignment() in LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-18036: --- Attachment: HBASE-18036.v1-branch-1.1.patch > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch, > HBASE-18036.v1-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH > and SSH uses roundRobinAssignment() in LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18035) Meta replica does not give any primaryOperationTimeout to primary meta region
[ https://issues.apache.org/jira/browse/HBASE-18035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-18035: --- Summary: Meta replica does not give any primaryOperationTimeout to primary meta region (was: Meta replica does not give any primaryOperationTimeout to primary mete region) > Meta replica does not give any primaryOperationTimeout to primary meta region > - > > Key: HBASE-18035 > URL: https://issues.apache.org/jira/browse/HBASE-18035 > Project: HBase > Issue Type: Bug >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Critical > > I was working on my unit test and it failed with TableNotFoundException. I > debugged a bit and found out that for a meta scan, it does not give any > primaryOperationTimeout to the primary meta region. This will be an issue, as the > meta replica may contain stale data and it is possible that the meta replica > will respond before the primary. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L823 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008644#comment-16008644 ] Dima Spivak commented on HBASE-18041: - Yep, no problem. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak reassigned HBASE-18041: --- Assignee: Alex Leblang > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang >Assignee: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008623#comment-16008623 ] Sean Busbey commented on HBASE-18041: - well, as much as I'd like to have consistent spacing across the project (which would be 2), I like us following an established language default even better. [~dimaspivak] you mind handling this particular review? > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008619#comment-16008619 ] Alex Leblang commented on HBASE-18041: -- Sure, I'll make a first attempt at this. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008616#comment-16008616 ] Dima Spivak commented on HBASE-18041: - Default is 4 spaces (that's a PEP 8 thing). > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
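For concreteness, a pylintrc along the lines being discussed might start like the fragment below. This is a hypothetical sketch, not the file that was eventually committed; the section and option names are standard pylint settings:

```ini
# Hypothetical starting point for an HBase pylintrc -- not the committed file.
[FORMAT]
# PEP 8 / pylint default: 4-space indents.
indent-string='    '
max-line-length=100

[MESSAGES CONTROL]
# Example: silence docstring nagging while the codebase catches up.
disable=missing-docstring
```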
[jira] [Updated] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-18041: Description: Yetus runs all commits with python files through a linter. I think that the HBase community should add a pylintrc file to actively choose the project's python style instead of just relying on yetus defaults. As an argument for this, the yetus project itself doesn't even use the default python linter for its own commits. was: Yetis runs all commits with python files through a linter. I think that the HBase community should add a pylintrc file to actively choose the project's python style instead of just relying on yetis defaults. As an argument for this, the yetis project itself doesn't even use the default python linter for its own commits. > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetus runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetus defaults. > As an argument for this, the yetus project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008611#comment-16008611 ] Sean Busbey commented on HBASE-18041: - [~awleblang] you game for putting up a first-cut? [~dimaspivak] I thought the default in pylint was 4 spaces? Anyone else have Opinions about Python? > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetis runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetis defaults. > As an argument for this, the yetis project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-18041: Component/s: community > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement > Components: community >Reporter: Alex Leblang > > Yetis runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetis defaults. > As an argument for this, the yetis project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-11013) Clone Snapshots on Secure Cluster Should provide option to apply Retained User Permissions
[ https://issues.apache.org/jira/browse/HBASE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008575#comment-16008575 ] Hadoop QA commented on HBASE-11013: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 1s {color} | {color:blue} rubocop was not available. {color} | | {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 1s {color} | {color:blue} Ruby-lint was not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 16 new or modified test files. 
{color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 36s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 18m 19s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 58s {color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 5s {color} | {color:red} hbase-protocol-shaded in master has 24 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 19m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | 
{color:red} The patch has 76 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 53s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s {color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 4s {color} | {color:red} hbase-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 108m 46s {color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 7s {color} | {color:green} hbase-rsgroup in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 54s {color} | {color:green} hbase-shell in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 26s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 215m 31s {color} | {color:black} {color} | \\ \\ || Reason || Tests || |
[jira] [Commented] (HBASE-18026) ProtobufUtil seems to do extra array copying
[ https://issues.apache.org/jira/browse/HBASE-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008564#comment-16008564 ] Vincent Poon commented on HBASE-18026: -- [~anoop.hbase] Could you clarify your concern? Before, toByteArray() was copying from offset 0 up to the entire length of the proto ByteString array. So how do we "loose the offset and length of this row bytes" ? One concern I think you might be raising is that Get constructor, unlike KeyValue/Put/Delete/Increment constructors, doesn't do a copy of the passed in byte[] but just assigns it to a field. I think that might be a legitimate concern, although then the question becomes, if the row in Get can be modified (and I don't know why we'd be doing an in-place modification?), why don't we do a protective copy in the constructor instead like in Put, etc? For the others though, it seems the code was copying twice. > ProtobufUtil seems to do extra array copying > > > Key: HBASE-18026 > URL: https://issues.apache.org/jira/browse/HBASE-18026 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.2 >Reporter: Vincent Poon >Assignee: Vincent Poon >Priority: Minor > Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.1.11 > > Attachments: HBASE-18026.branch-1.v1.patch, > HBASE-18026.master.v1.patch > > > In ProtobufUtil, the protobuf fields are copied into an array using > toByteArray(). These are then passed into the KeyValue constructor which > does another copy. > It seems like we can avoid a copy here by using > HBaseZeroCopyByteString#zeroCopyGetBytes() ? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
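The extra copy is easy to see with a toy model. `FakeByteString` and `copyingCtor` below are invented stand-ins for protobuf's ByteString and the copying KeyValue/Put/Delete constructors — an illustration of the pattern, not HBase code:

```java
import java.util.Arrays;

// Toy model of the double copy: toByteArray() copies once (protobuf's
// defensive copy), and a constructor that copies its input copies again.
public class CopyCountDemo {
    static int copies = 0; // counts array copies performed

    // Stand-in for a protobuf ByteString holding the cell bytes.
    static class FakeByteString {
        private final byte[] backing;
        FakeByteString(byte[] b) { backing = b; }
        byte[] toByteArray() { copies++; return Arrays.copyOf(backing, backing.length); }
        byte[] zeroCopyGetBytes() { return backing; } // hands out the backing array
    }

    // Stand-in for the KeyValue/Put/Delete constructors, which copy their input.
    static byte[] copyingCtor(byte[] b) { copies++; return Arrays.copyOf(b, b.length); }

    public static void main(String[] args) {
        FakeByteString bs = new FakeByteString(new byte[] {1, 2, 3});

        copies = 0;
        copyingCtor(bs.toByteArray());      // copy in toByteArray + copy in ctor
        int viaToByteArray = copies;        // 2

        copies = 0;
        copyingCtor(bs.zeroCopyGetBytes()); // only the ctor copies
        int viaZeroCopy = copies;           // 1

        System.out.println(viaToByteArray + " vs " + viaZeroCopy); // 2 vs 1
    }
}
```

It also shows why the Get case in the discussion is different: a constructor that merely assigns the array (no second copy) is only safe if the caller never mutates the handed-out bytes.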
[jira] [Commented] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008562#comment-16008562 ] stack commented on HBASE-18036: --- [~syuanjiang] +1 on patch. It is an improvement. HBASE-17791 is a description of the more general case. We need to fix it too. For master and versions of hbase newer than what you were looking at, what are you thinking? Thanks. > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in the LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH, > and SSH uses roundRobinAssignment() in the LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. 
> Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18018) Support abort for all procedures by default
[ https://issues.apache.org/jira/browse/HBASE-18018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-18018: - Attachment: HBASE-18018.master.002.patch Fixed unit test master.procedure.TestProcedureAdmin.testAbortProcedureFailure by overriding abort() in DeleteTableProcedure. > Support abort for all procedures by default > --- > > Key: HBASE-18018 > URL: https://issues.apache.org/jira/browse/HBASE-18018 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Fix For: 2.0.0 > > Attachments: HBASE-18018.001.patch, HBASE-18018.master.001.patch, > HBASE-18018.master.002.patch > > > Changes the default behavior of StateMachineProcedure to support aborting all > procedures even if rollback is not supported. On abort, the procedure is treated > as failed and rollback is called, but for procedures which cannot be rolled > back, abort is currently ignored. This sometimes causes a procedure to get stuck > in a waiting state forever. The user should have an option to abort any stuck > procedure and clean up manually. Please refer to HBASE-18016 and the discussion > there. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18036) Data locality is not maintained after cluster restart or SSH
[ https://issues.apache.org/jira/browse/HBASE-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008538#comment-16008538 ] Stephen Yuan Jiang commented on HBASE-18036: [~stack], could you help review the change? > Data locality is not maintained after cluster restart or SSH > > > Key: HBASE-18036 > URL: https://issues.apache.org/jira/browse/HBASE-18036 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 1.4.0, 1.3.1, 1.2.5, 1.1.10 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Attachments: HBASE-18036.v0-branch-1.1.patch > > > After HBASE-2896 / HBASE-4402, we think data locality is maintained after > cluster restart. However, we have seen some complaints about data locality > loss when clusters restart (e.g. HBASE-17963). > Examining the AssignmentManager#processDeadServersAndRegionsInTransition() > code, for cluster start, I expected to hit the following code path: > {code} > if (!failover) { > // Fresh cluster startup. > LOG.info("Clean cluster startup. Assigning user regions"); > assignAllUserRegions(allRegions); > } > {code} > where assignAllUserRegions would use the retainAssignment() call in the LoadBalancer; > however, from the master log, we usually hit the failover code path: > {code} > // If we found user regions out on cluster, its a failover. > if (failover) { > LOG.info("Found regions out on cluster or in RIT; presuming failover"); > // Process list of dead servers and regions in RIT. > // See HBASE-4580 for more information. > processDeadServersAndRecoverLostRegions(deadServers); > } > {code} > where processDeadServersAndRecoverLostRegions() would put dead servers in SSH, > and SSH uses roundRobinAssignment() in the LoadBalancer. That is why we would > lose locality more often than retain it during cluster restart. > Note: the code I was looking at is close to branch-1 and branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17786) Create LoadBalancer perf-tests (test balancer algorithm decoupled from workload)
[ https://issues.apache.org/jira/browse/HBASE-17786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008529#comment-16008529 ] Hudson commented on HBASE-17786: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #2998 (See [https://builds.apache.org/job/HBase-Trunk_matrix/2998/]) HBASE-17786 Create LoadBalancer perf-test tool to benchmark Load (busbey: rev da68537ae63ffbfc784de4dd6159d5d24f23f262) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/BalancerTestBase.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestFavoredStochasticLoadBalancer.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/LoadBalancerPerformanceEvaluation.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java > Create LoadBalancer perf-tests (test balancer algorithm decoupled from > workload) > > > Key: HBASE-17786 > URL: https://issues.apache.org/jira/browse/HBASE-17786 > Project: HBase > Issue Type: Sub-task > Components: Balancer, proc-v2 >Reporter: stack >Assignee: Umesh Agashe > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-17786.001.patch, HBASE-17786.002.patch, > HBASE-17786.002.patch > > > (Below is a quote from [~mbertozzi] taken from an internal issue that I'm > moving out here) > Add perf tools and keep monitored balancer performance (a BalancerPE-type > thing). > Most of the balancers should be instantiable without requiring a > mini-cluster, and it easy to create tons of RegionInfo and ServerNames with a > for loop. > The balancer is just creating a map RegionInfo:ServerName. 
> There are two methods to test: roundRobinAssignment() and retainAssignment() > {code} > Map<ServerName, List<HRegionInfo>> roundRobinAssignment( > List<HRegionInfo> regions, > List<ServerName> servers > ) throws HBaseIOException; > Map<ServerName, List<HRegionInfo>> retainAssignment( > Map<HRegionInfo, ServerName> regions, > List<ServerName> servers > ) throws HBaseIOException; > {code} > There are a bunch of obvious optimizations that everyone can see just by > looking at the code (like replacing arrays with sets where we do > contains/remove operations). It will be nice to have a baseline and start > improving from there. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
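[~mbertozzi]'s point above — that the balancers can be exercised with tons of synthetic regions and servers generated in a for loop, no mini-cluster required — can be sketched with plain strings standing in for HRegionInfo and ServerName. The real LoadBalancerPerformanceEvaluation uses the actual classes and balancers; this toy only demonstrates the generate-assign-verify shape:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy round-robin assignment benchmark: generate N synthetic regions and
// M servers with a for loop, assign, and check the plan is balanced.
public class BalancerSketch {
    static Map<String, List<String>> roundRobinAssignment(List<String> regions,
                                                          List<String> servers) {
        Map<String, List<String>> plan = new HashMap<>();
        for (String s : servers) plan.put(s, new ArrayList<>());
        for (int i = 0; i < regions.size(); i++) {
            // Deal regions out like cards, one per server in turn.
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        List<String> servers = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) regions.add("region-" + i);
        for (int i = 0; i < 100; i++) servers.add("server-" + i);

        long start = System.nanoTime();
        Map<String, List<String>> plan = roundRobinAssignment(regions, servers);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Every server gets exactly regions/servers = 1000 regions.
        for (List<String> assigned : plan.values()) {
            if (assigned.size() != 1000) throw new AssertionError("unbalanced");
        }
        System.out.println("assigned 100k regions in " + elapsedMs + " ms");
    }
}
```

Swapping the assignment function for a real balancer implementation and timing the same loop is the essence of the perf tool.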
[jira] [Commented] (HBASE-17938) General fault - tolerance framework for backup/restore operations
[ https://issues.apache.org/jira/browse/HBASE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008530#comment-16008530 ] Hudson commented on HBASE-17938: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #2998 (See [https://builds.apache.org/job/HBase-Trunk_matrix/2998/]) HBASE-17938 General fault - tolerance framework for backup/restore (tedyu: rev 305ffcb04025ea6f7880e9961120d309f55bf8ba) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java * (add) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/BackupClientFactory.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/BackupRestoreConstants.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/BackupDriver.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupAdminImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/backup/TestFullBackupWithFailures.java > General fault - tolerance framework for backup/restore operations > - > > Key: HBASE-17938 > URL: https://issues.apache.org/jira/browse/HBASE-17938 > Project: HBase > Issue Type: Sub-task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > Attachments: HBASE-17938-v1.patch, HBASE-17938-v2.patch, > HBASE-17938-v3.patch, HBASE-17938-v4.patch, HBASE-17938-v5.patch, > HBASE-17938-v6.patch, HBASE-17938-v7.patch, HBASE-17938-v8.patch > > > The framework must take care of all general types of failures 
during backup/ > restore and restore system to the original state in case of a failure. > That won't solve all the possible issues but we have a separate JIRAs for > them as a sub-tasks of HBASE-15277 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008497#comment-16008497 ] Karan Mehta commented on HBASE-18042: - A simple solution for this is to not close the scanner even if there are no more results in the region. I am not sure about other implications of this, though. {{RSRpcServices.java}} {code} addResults(builder, results, (PayloadCarryingRpcController) controller, RegionReplicaUtil.isDefaultReplica(region.getRegionInfo())); - if (!moreResults || !moreResultsInRegion || closeScanner) { + if (!moreResults || closeScanner) { scannerClosed = true; closeScanner(region, scanner, scannerName, context); {code} > Client Compatibility breaks between versions 1.2 and 1.3 > > > Key: HBASE-18042 > URL: https://issues.apache.org/jira/browse/HBASE-18042 > Project: HBase > Issue Type: Bug >Reporter: Karan Mehta > > OpenTSDB uses AsyncHBase as its client, rather than using the traditional > HBase client. From version 1.2 to 1.3, the {{ClientProtos}} have been > changed. Newer fields were added to the {{ScanResponse}} proto. > A typical Scan request in 1.2 would require the caller to make an > OpenScanner request, GetNextRows requests, and a CloseScanner request, based on > the {{more_rows}} boolean field in the {{ScanResponse}} proto. > However, in 1.3 a new parameter, {{more_results_in_region}}, was added, which > limits the results per region. Therefore the client now has to manage sending > all the requests for each region. Furthermore, if the results are exhausted > from a particular region, the {{ScanResponse}} will set > {{more_results_in_region}} to false, but {{more_results}} can still be true. > Whenever the former is set to false, the {{RegionScanner}} will also be > closed. > OpenTSDB makes an OpenScanner request and receives all its results in the > first {{ScanResponse}} itself, thus creating the condition described in > the paragraph above. 
Since {{more_rows}} is true, it will proceed to send the next > request, at which point the {{RSRpcServices}} will throw > {{UnknownScannerException}}. Protobuf-level client compatibility is maintained, > but the expected behavior is modified. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
Karan Mehta created HBASE-18042: --- Summary: Client Compatibility breaks between versions 1.2 and 1.3 Key: HBASE-18042 URL: https://issues.apache.org/jira/browse/HBASE-18042 Project: HBase Issue Type: Bug Reporter: Karan Mehta OpenTSDB uses AsyncHBase as its client, rather than using the traditional HBase client. From version 1.2 to 1.3, the {{ClientProtos}} have been changed. Newer fields were added to the {{ScanResponse}} proto. A typical Scan request in 1.2 would require the caller to make an OpenScanner request, GetNextRows requests, and a CloseScanner request, based on the {{more_rows}} boolean field in the {{ScanResponse}} proto. However, in 1.3 a new parameter, {{more_results_in_region}}, was added, which limits the results per region. Therefore the client now has to manage sending all the requests for each region. Furthermore, if the results are exhausted from a particular region, the {{ScanResponse}} will set {{more_results_in_region}} to false, but {{more_results}} can still be true. Whenever the former is set to false, the {{RegionScanner}} will also be closed. OpenTSDB makes an OpenScanner request and receives all its results in the first {{ScanResponse}} itself, thus creating the condition described in the paragraph above. Since {{more_rows}} is true, it will proceed to send the next request, at which point the {{RSRpcServices}} will throw {{UnknownScannerException}}. Protobuf-level client compatibility is maintained, but the expected behavior is modified. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
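The protocol mismatch described above can be condensed into a toy model: a 1.3-style server that closes the region scanner as soon as the region is exhausted, and a 1.2-era client that keeps calling while `more_results` is true. All class and method names below are invented for the illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the 1.2 -> 1.3 scan protocol change: the 1.3 server closes
// the region scanner when moreResultsInRegion is false, but a 1.2-era
// client keeps calling as long as moreResults is true.
public class ScanCompatSketch {
    static class UnknownScannerException extends RuntimeException {}

    static class Server13 {
        private final Map<Long, Integer> scanners = new HashMap<>(); // id -> rows left
        private long nextId = 1;

        long openScanner(int rowsInRegion) {
            scanners.put(nextId, rowsInRegion);
            return nextId++;
        }

        // Returns {moreResults, moreResultsInRegion}.
        boolean[] next(long scannerId) {
            if (!scanners.containsKey(scannerId)) throw new UnknownScannerException();
            // Hand back every remaining row at once: the region is exhausted...
            boolean moreResultsInRegion = false;
            boolean moreResults = true; // ...but other regions may still have rows.
            // 1.3 behavior: the server closes the region scanner right here.
            scanners.remove(scannerId);
            return new boolean[] {moreResults, moreResultsInRegion};
        }
    }

    public static void main(String[] args) {
        Server13 server = new Server13();
        long id = server.openScanner(10);
        boolean[] flags = server.next(id);
        // 1.2-era client logic: moreResults is still true, so call again...
        try {
            if (flags[0]) server.next(id);
        } catch (UnknownScannerException e) {
            System.out.println("UnknownScannerException, as described above");
        }
    }
}
```

A 1.3-aware client would consult `flags[1]` and reopen a scanner on the next region instead of reusing the closed id.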
[jira] [Commented] (HBASE-18041) Add pylintrc file to HBase
[ https://issues.apache.org/jira/browse/HBASE-18041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008467#comment-16008467 ] Alex Leblang commented on HBASE-18041: -- This was prompted by some discussion in https://issues.apache.org/jira/browse/HBASE-18020. One issue that came up there was 2 versus 4 space indents in python. HBase uses both currently, though there aren't that many python files. Clusterdock uses 4 spaces, hadoop uses 2 spaces, and yetis uses two spaces. [~dimaspivak] expressed a desire to use 4 spaces > Add pylintrc file to HBase > -- > > Key: HBASE-18041 > URL: https://issues.apache.org/jira/browse/HBASE-18041 > Project: HBase > Issue Type: Improvement >Reporter: Alex Leblang > > Yetis runs all commits with python files through a linter. I think that the > HBase community should add a pylintrc file to actively choose the project's > python style instead of just relying on yetis defaults. > As an argument for this, the yetis project itself doesn't even use the > default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18041) Add pylintrc file to HBase
Alex Leblang created HBASE-18041: Summary: Add pylintrc file to HBase Key: HBASE-18041 URL: https://issues.apache.org/jira/browse/HBASE-18041 Project: HBase Issue Type: Improvement Reporter: Alex Leblang Yetis runs all commits with python files through a linter. I think that the HBase community should add a pylintrc file to actively choose the project's python style instead of just relying on yetis defaults. As an argument for this, the yetis project itself doesn't even use the default python linter for its own commits. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008416#comment-16008416 ] Chia-Ping Tsai commented on HBASE-17887: I will check it asap. Thanks for the reminder. [~yuzhih...@gmail.com] > Row-level consistency is broken for read > > > Key: HBASE-17887 > URL: https://issues.apache.org/jira/browse/HBASE-17887 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.3.0 >Reporter: Umesh Agashe >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-17887.branch-1.v0.patch, > HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch, > HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch, > HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch, > HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch, > HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch, > HBASE-17887.v5.patch, HBASE-17887.v5.patch > > > The scanner of the latest memstore may be lost if we make quick flushes. The > following steps may help explain this issue. > # put data_A (seq id = 10, active stores data_A and the snapshot is empty) > # snapshot of 1st flush (active is empty and snapshot stores data_A) > # put data_B (seq id = 11, active stores data_B and snapshot stores data_A) > # create user scanner (read point = 11, so it should see data_B) > # commit of 1st flush > #* clear snapshot (hfile_A has data_A, active stores data_B, and snapshot is > empty) > #* update the reader (the user scanner receives the hfile_A) > # snapshot of 2nd flush (active is empty and snapshot stores data_B) > # commit of 2nd flush > #* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, > and snapshot is empty) – this is the critical piece. 
> #* -update the reader- (hasn't happened) > # the user scanner updates the kv scanners (it creates a scanner for hfile_A but > none for the memstore) > # the user sees the older data_A – wrong result -- This message was sent by Atlassian JIRA (v6.3.15#6346)
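The numbered steps above can be replayed with a small simulation: a store whose scanner caches its file list and is refreshed only when notified, so a flush commit that skips the notification loses data_B from the scanner's view. All names below are invented for the sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Toy reproduction of the race described above: the scanner's file list is
// only refreshed when the store notifies it, so a flush commit that skips
// the notification leaves the scanner with hfile_A and an empty memstore --
// and data_B silently disappears from its view.
public class FlushRaceSketch {
    List<String> active = new ArrayList<>();
    List<String> snapshot = new ArrayList<>();
    List<String> hfiles = new ArrayList<>();

    class UserScanner {
        List<String> knownFiles = new ArrayList<>(); // refreshed only on notify
        void notifyReaderChanged() { knownFiles = new ArrayList<>(hfiles); }
        // Rebuilding the kv scanners uses the known files plus the live memstore.
        List<String> visibleCells() {
            List<String> cells = new ArrayList<>(knownFiles);
            cells.addAll(snapshot);
            cells.addAll(active);
            return cells;
        }
    }

    void snapshotFlush() { snapshot = active; active = new ArrayList<>(); }
    void commitFlush(UserScanner s, boolean notify) {
        hfiles.addAll(snapshot);             // snapshot becomes an hfile
        snapshot = new ArrayList<>();
        if (notify) s.notifyReaderChanged(); // the bug: sometimes skipped
    }

    public static void main(String[] args) {
        FlushRaceSketch store = new FlushRaceSketch();
        store.active.add("data_A");                    // 1: put data_A
        store.snapshotFlush();                         // 2: snapshot of 1st flush
        store.active.add("data_B");                    // 3: put data_B
        UserScanner scanner = store.new UserScanner(); // 4: should see data_B
        store.commitFlush(scanner, true);              // 5: commit, reader updated
        store.snapshotFlush();                         // 6: snapshot of 2nd flush
        store.commitFlush(scanner, false);             // 7-8: reader NOT updated
        System.out.println(scanner.visibleCells());    // 9-10: only data_A survives
    }
}
```

Passing `true` for the second commit (i.e. always notifying the changed-readers observer) makes data_B visible again, which mirrors the direction of the fix.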
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008412#comment-16008412 ] Ted Yu commented on HBASE-17887: https://builds.apache.org/job/HBASE-Flaky-Tests/16032/testReport/junit/org.apache.hadoop.hbase/TestAcidGuarantees/testMixedAtomicity_1_/ : {code} java.lang.RuntimeException: Deferred at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:79) at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.waitFor(MultithreadedTestUtil.java:72) at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:421) at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:357) at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:348) at org.apache.hadoop.hbase.TestAcidGuarantees.testMixedAtomicity(TestAcidGuarantees.java:461) Caused by: java.lang.ArrayIndexOutOfBoundsException: 7609 at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:1208) at org.apache.hadoop.hbase.KeyValue.toString(KeyValue.java:1153) at org.apache.hadoop.hbase.TestAcidGuarantees$AtomicScanReader.gotFailure(TestAcidGuarantees.java:334) at org.apache.hadoop.hbase.TestAcidGuarantees$AtomicScanReader.doAnAction(TestAcidGuarantees.java:309) at org.apache.hadoop.hbase.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:149) at org.apache.hadoop.hbase.MultithreadedTestUtil$TestThread.run(MultithreadedTestUtil.java:124) {code} The above failure was with commit b34ab5980ea7a21fd750537476027f9a8665eacc > Row-level consistency is broken for read > > > Key: HBASE-17887 > URL: https://issues.apache.org/jira/browse/HBASE-17887 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.3.0 >Reporter: Umesh Agashe >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-17887.branch-1.v0.patch, > 
HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch, > HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch, > HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch, > HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch, > HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch, > HBASE-17887.v5.patch, HBASE-17887.v5.patch > > > The scanner of the latest memstore may be lost if we make quick flushes. The > following steps help explain this issue. > # put data_A (seq id = 10, active stores data_A and snapshot is empty) > # snapshot of 1st flush (active is empty and snapshot stores data_A) > # put data_B (seq id = 11, active stores data_B and snapshot stores data_A) > # create user scanner (read point = 11, so it should see data_B) > # commit of 1st flush > #* clear snapshot (hfile_A has data_A, active stores data_B, and snapshot is > empty) > #* update the reader (the user scanner receives hfile_A) > # snapshot of 2nd flush (active is empty and snapshot stores data_B) > # commit of 2nd flush > #* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, > and snapshot is empty) – this is the critical piece. > #* -update the reader- (hasn't happened) > # user scanner updates the kv scanners (it creates a scanner for hfile_A but > nothing for the memstore) > # user sees the older data_A – wrong result -- This message was sent by Atlassian JIRA (v6.3.15#6346)
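The interleaving above can be reproduced in miniature. The sketch below is a toy model (the `ToyStore`/`ToyScanner` classes are hypothetical stand-ins, not HBase APIs): the scanner caches its hfile list and is only told to refresh it on the first flush commit, so after the second commit it rebuilds its view from a stale hfile list plus an empty memstore and loses data_B.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the race described above (hypothetical classes, not HBase code).
class ToyStore {
    List<String> active = new ArrayList<>();
    List<String> snapshot = new ArrayList<>();
    List<String> hfiles = new ArrayList<>();

    void put(String v) { active.add(v); }

    // "snapshot of flush": move active cells into the snapshot segment.
    void snapshotForFlush() { snapshot.addAll(active); active.clear(); }

    // "commit of flush": snapshot becomes an hfile; optionally notify the
    // scanner (the second flush in the bug never does).
    void commitFlush(ToyScanner scannerToNotify) {
        hfiles.addAll(snapshot);
        snapshot.clear();
        if (scannerToNotify != null) scannerToNotify.updateReaders();
    }
}

class ToyScanner {
    private final ToyStore store;
    private List<String> knownHfiles;   // cached at updateReaders() time

    ToyScanner(ToyStore store) { this.store = store; updateReaders(); }

    void updateReaders() { knownHfiles = new ArrayList<>(store.hfiles); }

    // Rebuild kv scanners: cached hfile list plus live memstore segments.
    List<String> read() {
        List<String> view = new ArrayList<>(knownHfiles);
        view.addAll(store.snapshot);
        view.addAll(store.active);
        return view;
    }
}

public class QuickFlushRace {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("data_A");                 // seq id 10
        store.snapshotForFlush();            // 1st flush: snapshot holds data_A
        store.put("data_B");                 // seq id 11
        ToyScanner scanner = new ToyScanner(store);
        System.out.println(scanner.read()); // sees data_A and data_B
        store.commitFlush(scanner);          // commit 1st flush: reader updated
        store.snapshotForFlush();            // 2nd flush: snapshot holds data_B
        store.commitFlush(null);             // commit 2nd flush: reader NOT updated
        System.out.println(scanner.read()); // only data_A -- data_B is lost
    }
}
```

In this model, the fix corresponds to always notifying the scanner when the snapshot is cleared at commit time; the real patch touches ChangedReadersObserver, HStore, and StoreScanner, per the commit file list above.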
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008343#comment-16008343 ] Hudson commented on HBASE-17887: SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2997 (See [https://builds.apache.org/job/HBase-Trunk_matrix/2997/]) HBASE-17887 Row-level consistency is broken for read (chia7712: rev b34ab5980ea7a21fd750537476027f9a8665eacc) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactingMemStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17938) General fault-tolerance framework for backup/restore operations
[ https://issues.apache.org/jira/browse/HBASE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-17938: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the patch, Vlad. > General fault-tolerance framework for backup/restore operations > - > > Key: HBASE-17938 > URL: https://issues.apache.org/jira/browse/HBASE-17938 > Project: HBase > Issue Type: Sub-task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > Attachments: HBASE-17938-v1.patch, HBASE-17938-v2.patch, > HBASE-17938-v3.patch, HBASE-17938-v4.patch, HBASE-17938-v5.patch, > HBASE-17938-v6.patch, HBASE-17938-v7.patch, HBASE-17938-v8.patch > > > The framework must take care of all general types of failures during backup/ > restore and restore the system to its original state in case of a failure. > That won't solve all possible issues, but we have separate JIRAs for > them as sub-tasks of HBASE-15277 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17786) Create LoadBalancer perf-tests (test balancer algorithm decoupled from workload)
[ https://issues.apache.org/jira/browse/HBASE-17786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-17786: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for this [~uagashe] (thanks for the additional review [~stack]) > Create LoadBalancer perf-tests (test balancer algorithm decoupled from > workload) > > > Key: HBASE-17786 > URL: https://issues.apache.org/jira/browse/HBASE-17786 > Project: HBase > Issue Type: Sub-task > Components: Balancer, proc-v2 >Reporter: stack >Assignee: Umesh Agashe > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-17786.001.patch, HBASE-17786.002.patch, > HBASE-17786.002.patch > > > (Below is a quote from [~mbertozzi] taken from an internal issue that I'm > moving out here) > Add perf tools and keep monitoring balancer performance (a BalancerPE-type > thing). > Most of the balancers should be instantiable without requiring a > mini-cluster, and it is easy to create tons of RegionInfo and ServerNames with a > for loop. > The balancer is just creating a map of RegionInfo to ServerName. > There are two methods to test, roundRobinAssignment() and retainAssignment(): > {code} > Map<ServerName, List<HRegionInfo>> roundRobinAssignment( > List<HRegionInfo> regions, > List<ServerName> servers > ) throws HBaseIOException; > > Map<ServerName, List<HRegionInfo>> retainAssignment( > Map<HRegionInfo, ServerName> regions, > List<ServerName> servers > ) throws HBaseIOException; > {code} > There are a bunch of obvious optimizations that everyone can see just by > looking at the code (like replacing arrays with sets when we do > contains/remove operations). It will be nice to have a baseline and start > improving from there. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
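A sketch of what such a BalancerPE-type harness could look like. Plain strings stand in for HRegionInfo/ServerName so no mini-cluster or HBase classes are needed, and the round-robin pass below is an illustrative re-implementation for timing purposes, not the balancer's actual code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Micro-benchmark skeleton: generate regions/servers with a for loop,
// time an assignment pass, and sanity-check the resulting plan.
public class BalancerPerfSketch {

    // Illustrative round-robin assignment over stand-in string names.
    static Map<String, List<String>> roundRobinAssignment(
            List<String> regions, List<String> servers) {
        Map<String, List<String>> plan = new HashMap<>();
        for (String server : servers) plan.put(server, new ArrayList<>());
        for (int i = 0; i < regions.size(); i++) {
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        List<String> servers = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) regions.add("region-" + i);
        for (int i = 0; i < 50; i++) servers.add("server-" + i);

        long start = System.nanoTime();
        Map<String, List<String>> plan = roundRobinAssignment(regions, servers);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("assigned " + regions.size() + " regions to "
            + servers.size() + " servers in " + ms + " ms");

        // Round-robin gives every server exactly regions/servers regions here.
        for (List<String> assigned : plan.values()) {
            if (assigned.size() != regions.size() / servers.size()) {
                throw new AssertionError("uneven plan: " + assigned.size());
            }
        }
    }
}
```

A real baseline would replace the stand-in method with a `LoadBalancer` instance and repeat the measurement across balancer implementations and cluster sizes.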
[jira] [Updated] (HBASE-11013) Clone Snapshots on Secure Cluster Should provide option to apply Retained User Permissions
[ https://issues.apache.org/jira/browse/HBASE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Hu updated HBASE-11013: - Attachment: HBASE-11013.v3.patch > Clone Snapshots on Secure Cluster Should provide option to apply Retained > User Permissions > -- > > Key: HBASE-11013 > URL: https://issues.apache.org/jira/browse/HBASE-11013 > Project: HBase > Issue Type: Improvement > Components: snapshots >Reporter: Ted Yu >Assignee: Zheng Hu > Fix For: 2.0.0 > > Attachments: HBASE-11013.master.addendum.patch, HBASE-11013.v1.patch, > HBASE-11013.v2.patch, HBASE-11013.v3.patch, HBASE-11013.v3.patch > > > Currently, > {code} > sudo su - test_user > create 't1', 'f1' > sudo su - hbase > snapshot 't1', 'snap_one' > clone_snapshot 'snap_one', 't2' > {code} > In this scenario the user test_user would not have permissions for the > cloned table t2. > We need to add an improvement such that the permissions of the original > table are recorded in the snapshot metadata and an option is provided for > applying them to the new table as part of the clone process. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18040) TestBlockEvictionFromClient fails in master branch
[ https://issues.apache.org/jira/browse/HBASE-18040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008280#comment-16008280 ] Ted Yu commented on HBASE-18040: The test passes based on commit c5cc81d8e31ba76833adf25b6c357205745c23ad, the one prior to 0ae0edcd630aa1dcb6c47ea11fa4367ae0a5baa8 > TestBlockEvictionFromClient fails in master branch > -- > > Key: HBASE-18040 > URL: https://issues.apache.org/jira/browse/HBASE-18040 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu > > According to > https://builds.apache.org/job/HBASE-Flaky-Tests/16000/#showFailuresLink , the > test first failed with commit 0ae0edcd630aa1dcb6c47ea11fa4367ae0a5baa8 > On Linux, I got the following failures: > {code} > testParallelGetsAndScans(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) > Time elapsed: 3.016 sec <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<6> > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.checkForBlockEviction(TestBlockEvictionFromClient.java:1308) > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testParallelGetsAndScans(TestBlockEvictionFromClient.java:293) > testBlockRefCountAfterSplits(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) > Time elapsed: 7.786 sec <<< FAILURE! > java.lang.AssertionError: expected:<0> but was:<1> > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.iterateBlockCache(TestBlockEvictionFromClient.java:1215) > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testBlockRefCountAfterSplits(TestBlockEvictionFromClient.java:607) > testParallelGetsAndScanWithWrappedRegionScanner(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) > Time elapsed: 2.631 sec <<< FAILURE! 
> java.lang.AssertionError: expected:<3> but was:<6> > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.checkForBlockEviction(TestBlockEvictionFromClient.java:1322) > at > org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testParallelGetsAndScanWithWrappedRegionScanner(TestBlockEvictionFromClient.java:839) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008261#comment-16008261 ] Hudson commented on HBASE-17887: SUCCESS: Integrated in Jenkins build HBase-1.4 #733 (See [https://builds.apache.org/job/HBase-1.4/733/]) HBASE-17887 Row-level consistency is broken for read (chia7712: rev f81486445c072096022cca77eb0a53f1594ff204) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultMemStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreChunkPool.java -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18032) Hbase MOB
[ https://issues.apache.org/jira/browse/HBASE-18032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008262#comment-16008262 ] Jean-Marc Spaggiari commented on HBASE-18032: - It went through this time. We can drop the conversation here. > Hbase MOB > - > > Key: HBASE-18032 > URL: https://issues.apache.org/jira/browse/HBASE-18032 > Project: HBase > Issue Type: Task > Components: mob >Affects Versions: hbase-11339 > Environment: debian >Reporter: Fred T. > > Hi all, > I spent a lot of time trying to use MOB in HBase (1.2.3). I read everywhere > that the patch HBASE-11339 can fix it, but I can't find help on how to install > this patch. Also, the official Apache web site refers to an HBASE 2.0.0 version, > which I can't find anywhere. So please help me to be able to use MOB. > Thanks for your help -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008256#comment-16008256 ] Hudson commented on HBASE-17887: SUCCESS: Integrated in Jenkins build HBase-1.3-JDK8 #176 (See [https://builds.apache.org/job/HBase-1.3-JDK8/176/]) HBASE-17887 Row-level consistency is broken for read (chia7712: rev 72edf521c1effe3afe6ce6b39aaf843b8651a4a6) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreChunkPool.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultMemStore.java -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18032) Hbase MOB
[ https://issues.apache.org/jira/browse/HBASE-18032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008253#comment-16008253 ] Fred T. commented on HBASE-18032: - Hi, I just posted it again. Fred -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18040) TestBlockEvictionFromClient fails in master branch
Ted Yu created HBASE-18040: -- Summary: TestBlockEvictionFromClient fails in master branch Key: HBASE-18040 URL: https://issues.apache.org/jira/browse/HBASE-18040 Project: HBase Issue Type: Bug Reporter: Ted Yu According to https://builds.apache.org/job/HBASE-Flaky-Tests/16000/#showFailuresLink , the test first failed with commit 0ae0edcd630aa1dcb6c47ea11fa4367ae0a5baa8 On Linux, I got the following failures: {code} testParallelGetsAndScans(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) Time elapsed: 3.016 sec <<< FAILURE! java.lang.AssertionError: expected:<3> but was:<6> at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.checkForBlockEviction(TestBlockEvictionFromClient.java:1308) at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testParallelGetsAndScans(TestBlockEvictionFromClient.java:293) testBlockRefCountAfterSplits(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) Time elapsed: 7.786 sec <<< FAILURE! java.lang.AssertionError: expected:<0> but was:<1> at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.iterateBlockCache(TestBlockEvictionFromClient.java:1215) at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testBlockRefCountAfterSplits(TestBlockEvictionFromClient.java:607) testParallelGetsAndScanWithWrappedRegionScanner(org.apache.hadoop.hbase.client.TestBlockEvictionFromClient) Time elapsed: 2.631 sec <<< FAILURE! java.lang.AssertionError: expected:<3> but was:<6> at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.checkForBlockEviction(TestBlockEvictionFromClient.java:1322) at org.apache.hadoop.hbase.client.TestBlockEvictionFromClient.testParallelGetsAndScanWithWrappedRegionScanner(TestBlockEvictionFromClient.java:839) {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17707) New More Accurate Table Skew cost function/generator
[ https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-17707: --- Status: Open (was: Patch Available) > New More Accurate Table Skew cost function/generator > > > Key: HBASE-17707 > URL: https://issues.apache.org/jira/browse/HBASE-17707 > Project: HBase > Issue Type: New Feature > Components: Balancer >Affects Versions: 1.2.0 > Environment: CentOS Derivative with a derivative of the 3.18.43 > kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches. >Reporter: Kahlil Oppenheimer >Assignee: Kahlil Oppenheimer >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, > HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, > HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, > HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, > HBASE-17707-11.patch, HBASE-17707-12.patch, test-balancer2-13617.out > > > This patch includes a new version of the TableSkewCostFunction and a new > TableSkewCandidateGenerator. > The new TableSkewCostFunction computes table skew by counting the minimal > number of region moves required for a given table to perfectly balance the > table across the cluster (i.e. as if the regions from that table had been > round-robin-ed across the cluster). This number of moves is computed for each > table, then normalized to a score between 0 and 1 by dividing by the number of > moves required in the absolute worst case (i.e. the entire table is stored on > one server), and stored in an array. The cost function then takes a weighted > average of the average and maximum value across all tables. The weights in > this average are configurable to allow for certain users to more strongly > penalize situations where one table is skewed versus where every table is a > little bit skewed. 
To spread this value more evenly across the range > 0-1, we take the square root of the weighted average to get the final value. > The new TableSkewCandidateGenerator generates region moves/swaps to optimize > the above TableSkewCostFunction. It first simply tries to move regions until > each server has the right number of regions, then it swaps regions around > such that each region swap improves table skew across the cluster. > We tested the cost function and generator in our production clusters with > 100s of TBs of data and 100s of tables across dozens of servers and found > both to be very performant and accurate. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
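The scoring described above can be sketched in a few lines. This is an illustration of the normalize / weighted-average / square-root shape only; the method name, signature, and weights are assumptions, not the patch's actual code:

```java
// Sketch of the table-skew scoring described above: per-table minimal
// moves are normalized by the worst case, the per-table scores are
// combined as a weighted average of their mean and max, and the square
// root spreads the result across 0..1. Names/weights are illustrative.
public class TableSkewScoreSketch {

    static double skewScore(int[] movesPerTable, int[] worstCaseMoves,
                            double avgWeight, double maxWeight) {
        double sum = 0.0, max = 0.0;
        for (int i = 0; i < movesPerTable.length; i++) {
            // 0 = table already balanced, 1 = whole table on one server.
            double norm = worstCaseMoves[i] == 0
                ? 0.0 : (double) movesPerTable[i] / worstCaseMoves[i];
            sum += norm;
            max = Math.max(max, norm);
        }
        double avg = sum / movesPerTable.length;
        double weighted = (avgWeight * avg + maxWeight * max)
                        / (avgWeight + maxWeight);
        return Math.sqrt(weighted);
    }

    public static void main(String[] args) {
        // Two balanced tables score 0; raising maxWeight penalizes the
        // "one badly skewed table" case more than uniform mild skew.
        System.out.println(skewScore(new int[]{0, 0}, new int[]{8, 8}, 1, 1));
        System.out.println(skewScore(new int[]{0, 8}, new int[]{8, 8}, 1, 1));
        System.out.println(skewScore(new int[]{0, 8}, new int[]{8, 8}, 1, 3));
    }
}
```

With weights (1, 1) the skewed case scores sqrt(0.75) ≈ 0.87; shifting weight to the max term raises it toward 1, matching the configurable-penalty behavior described in the patch.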