[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841012#comment-16841012
 ] 

Hudson commented on HBASE-20305:


Results for branch branch-2
[build #1894 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1894/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1894//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1894//console].


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1894//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1894//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 2.1.5, 2.2.1
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840084#comment-16840084
 ] 

Hudson commented on HBASE-20305:


Results for branch branch-2.2
[build #260 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/260/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/260//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/260//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/260//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/260//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 2.1.5, 2.2.1
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840048#comment-16840048
 ] 

Hudson commented on HBASE-20305:


Results for branch branch-2.1
[build #1143 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1143/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1143//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1143//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1143//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 2.1.5, 2.2.1
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839987#comment-16839987
 ] 

Hudson commented on HBASE-20305:


Results for branch branch-1.4
[build #796 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/796/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/796//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/796//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/796//console].




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 2.1.5, 2.2.1
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839960#comment-16839960
 ] 

Hudson commented on HBASE-20305:


Results for branch branch-1
[build #824 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/824/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/824//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/824//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/824//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 2.1.5, 2.2.1
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-14 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839829#comment-16839829
 ] 

Andrew Purtell commented on HBASE-20305:


Thanks, committing now

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-13 Thread HBase QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838620#comment-16838620
 ] 

HBase QA commented on HBASE-20305:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} branch-1 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} branch-1 passed with JDK v1.7.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
45s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-1 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-1 passed with JDK v1.7.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed with JDK v1.7.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
22s{color} | {color:red} hbase-server: The patch generated 2 new + 10 unchanged 
- 4 fixed = 12 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
50s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.7.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}127m  7s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.coprocessor.TestMetaTableMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HBASE-Build/296/artifact/patchprocess/Dockerfile
 |
| JIRA Issue | HBASE-20305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968547/HBASE-20305.branch-1.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  fi

[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-05-13 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838512#comment-16838512
 ] 

Wellington Chevreuil commented on HBASE-20305:
--

Hi [~apurtell], sorry for the late reply. Found a surge in demand for this 
feature on earlier release, then realised this only went to master branch. 
Attached patches for branch-2 and branch-1, respectively.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.branch-1.001.patch, HBASE-20305.branch-2.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2019-04-11 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815617#comment-16815617
 ] 

Andrew Purtell commented on HBASE-20305:


Any progress here? Or unschedule it? Or close it?

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542248#comment-16542248
 ] 

Andrew Purtell commented on HBASE-20305:


No concerns here

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-07-12 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541892#comment-16541892
 ] 

Sean Busbey commented on HBASE-20305:
-

tentatively adding to scope of next minor releases. If I don't hear an 
objection I'll backport this later this week.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430073#comment-16430073
 ] 

Hudson commented on HBASE-20305:


Results for branch HBASE-19064
[build #90 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426890#comment-16426890
 ] 

Hudson commented on HBASE-20305:


Results for branch master
[build #284 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/284/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/284//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/284//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/284//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-05 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426639#comment-16426639
 ] 

Wellington Chevreuil commented on HBASE-20305:
--

Thanks for reviewing it [~yuzhih...@gmail.com] [~davelatham] !

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425932#comment-16425932
 ] 

Ted Yu commented on HBASE-20305:


{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.2:install (default-install) 
on project hbase-thrift: Failed to install metadata 
org.apache.hbase:hbase-thrift:3.0.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hbase/hbase-thrift/3.0.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got / (position: END_TAG 
seen ...\n/... @25:2)  -> [Help 1]
{code}
The above occurred in other QA runs too - not related to the patch.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-04 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425900#comment-16425900
 ] 

Dave Latham commented on HBASE-20305:
-

+1 assuming the hadoopcheck errors are worked out.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424548#comment-16424548
 ] 

Ted Yu commented on HBASE-20305:


lgtm

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424450#comment-16424450
 ] 

Hadoop QA commented on HBASE-20305:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} hbase-mapreduce: The patch generated 0 new + 7 
unchanged - 1 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  7m  
2s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.5. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  9m 
11s{color} | {color:red} The patch causes 10 errors with Hadoop v2.7.4. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 11m 
35s{color} | {color:red} The patch causes 10 errors with Hadoop v3.0.0. {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
20s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917413/HBASE-20305.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 129c4a6a4fc9 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / bf29a1fee9 |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| hadoopcheck | 
https://builds.apache.org/j

[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-03 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424382#comment-16424382
 ] 

Wellington Chevreuil commented on HBASE-20305:
--

Added new patch with the suggested the changes and additional tests for doPuts. 

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: 0001-HBASE-20305.master.001.patch, 
> HBASE-20305.master.002.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-04-03 Thread Wellington Chevreuil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424058#comment-16424058
 ] 

Wellington Chevreuil commented on HBASE-20305:
--

Thanks for the review and suggestions [~davelatham]! Indeed, *doDeletes* sounds 
more intuitive. I'm gonna work on this and the additional *doPuts,* together 
with reviews for checkstyle violations.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: 0001-HBASE-20305.master.001.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-03-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420735#comment-16420735
 ] 

Hadoop QA commented on HBASE-20305:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} hbase-mapreduce: The patch generated 1 new + 7 
unchanged - 1 fixed = 8 total (was 8) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
22m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
30s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20305 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916692/0001-HBASE-20305.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 50812b01c7f5 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 
14:24:03 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / d57001ee2d |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12245/artifact/patchprocess/diff-checkstyle-hbase-mapreduce.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12245/testReport/ |
| Max. process+thread count | 4987 (vs. ulimit of 1) |
| modules | C: hbase-mapreduce U: hbase-mapreduce |
|

[jira] [Commented] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster

2018-03-30 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420663#comment-16420663
 ] 

Dave Latham commented on HBASE-20305:
-

Thanks for the patch, Wellington.

Looks like it also fixes a nasty bug where dryRun doesn't actually appear to 
work.   I hope that did not bite you in using the tool.

I think I'd prefer calling the option something like doDeletes (default true) 
rather than insertsOnly (default false).  Could also have a similar option for 
doPuts to allow people to do deletes but not puts if they preferred.

+0.9 as is.

I'm going to hit the Submit Patch button to try to get Hadoop QA to take a pass.

> Add option to SyncTable that skip deletes on target cluster
> ---
>
> Key: HBASE-20305
> URL: https://issues.apache.org/jira/browse/HBASE-20305
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0-alpha-4
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: 0001-HBASE-20305.master.001.patch
>
>
> We had a situation where two clusters with active-active replication got out 
> of sync, but both had data that should be kept. The tables in question never 
> have data deleted, but ingestion had happened on the two different clusters, 
> some rows had been even updated.
> In this scenario, a cell that is present in one of the table clusters should 
> not be deleted, but replayed on the other. Also, for cells with same 
> identifier but different values, the most recent value should be kept. 
> Current version of SyncTable would not be applicable here, because it would 
> simply copy the whole state from source to target, then losing any additional 
> rows that might be only in target, as well as cell values that got most 
> recent update. This could be solved by adding an option to skip deletes for 
> SyncTable. This way, the additional cells not present on source would still 
> be kept. For cells with same identifier but different values, it would just 
> perform a Put for the cell version from source, but client scans would still 
> fetch the most recent timestamp.
> I'm attaching a patch with this additional option shortly. Please share your 
> thoughts.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)