[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-30 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.21.txt

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.
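> For illustration, a minimal sketch of such an interface (only getName is 
> described here; the javadoc wording is an assumption):
> {code}
> public interface WALIdentity {
>   /**
>    * @return the file name when the WAL lives on a distributed filesystem,
>    *         or the stream name when the WAL is backed by a log stream
>    */
>   String getName();
> }
> {code}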



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-30 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668830#comment-16668830
 ] 

Ted Yu commented on HBASE-21246:


Thanks for the detailed comments, Josh.

bq. This still banks on the assumption that recovered.edits are always on the 
storefile FS?

To be more precise, the assumption is that recovered.edits are always on the 
FileSystem specified by the hbase.wal.dir config.
See HBASE-20734.

bq. Is there a reason to keep the static "helper" methods around?

The number of static methods for creating a Reader has been reduced to 1.
Without it, many classes would have to create a WALFactory, get hold of the 
WALProvider, and call createReader on the provider.
I think having one static method is equivalent to, and simpler than, the above.
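For illustration, a rough sketch of the two call patterns (variable names are 
placeholders, and the provider-level createReader signature is an assumption 
based on this patch, not the current API):
{code}
// without the helper, every caller repeats this sequence
WALFactory factory = new WALFactory(conf, callerId);
WALProvider provider = factory.getWALProvider();
WAL.Reader reader = provider.createReader(walId);

// with the single static helper, the sequence collapses to one call
WAL.Reader reader2 = WALFactory.createReader(conf, walId);
{code}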

bq. You couldn't possibly know that this is a SyncReplicationWALProvider?

The getFullPath and getWalFromArchivePath methods are only defined by 
SyncReplicationWALProvider.
Let me think more about how to abstract these methods. Probably once the 
replication part of the design doc is refreshed, we will have a different 
perspective.

bq. Missed push-down into WALProvider

Looking at the init method of FSHLogProvider.Writer:
{code}
  public void init(FileSystem fs, Path path, Configuration conf,
      boolean overwritable, long blocksize)
      throws IOException, StreamLacksCapabilityException {
{code}
There are several parameters here that are specific to a distributed FileSystem (such as blocksize).

StreamLacksCapabilityException extends Exception. It could probably extend 
IOException instead, so that the init method can declare that it throws 
IOException rather than an exception tied to the distributed FileSystem.
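A minimal sketch of that change (the constructor set shown is an assumption):
{code}
// subclassing IOException lets init declare a single checked exception
public class StreamLacksCapabilityException extends IOException {
  public StreamLacksCapabilityException(String message) {
    super(message);
  }
}
{code}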

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (RATIS-377) Tolerate partially written log header

2018-10-29 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/RATIS-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667813#comment-16667813
 ] 

Ted Yu commented on RATIS-377:
--

In LogSegment#loadSegment:
{code}
if (entryCount == 0) {
  // The segment does not have any entries, delete the file.
  FileUtils.deleteFile(file);
{code}
The call to deleteFile may throw an IOException.
However, I think that for loadSegment, such an IOException can be caught and 
logged without surfacing to the caller.
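A minimal sketch of what that could look like (assuming a LOG field is 
available in LogSegment):
{code}
if (entryCount == 0) {
  // The segment does not have any entries; try to delete the file, but
  // don't let a deletion failure fail the whole segment load.
  try {
    FileUtils.deleteFile(file);
  } catch (IOException ioe) {
    LOG.warn("Failed to delete empty segment file " + file, ioe);
  }
}
{code}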

> Tolerate partially written log header
> -
>
> Key: RATIS-377
> URL: https://issues.apache.org/jira/browse/RATIS-377
> Project: Ratis
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: Nilotpal Nandi
>Assignee: Tsz Wo Nicholas Sze
>Priority: Blocker
> Fix For: 0.3.0
>
> Attachments: r377_20181028c.patch
>
>
> steps taken:
> --
>  # wrote 5GB files through ozonefs
>  # stopped datanodes, scm , om.
>  # started all services.
>  # Tried to read the file.
> One of the datanodes failed to start, throwing 
> "java.lang.IllegalStateException: Corrupted log header".
>  
> {noformat}
> 2018-10-26 10:26:01,317 ERROR org.apache.ratis.server.storage.LogInputStream: 
> caught exception initializing log_inprogress_293
> java.lang.IllegalStateException: Corrupted log header: ^@^@^@^@^@^@^@^@
>  at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
>  at 
> org.apache.ratis.server.storage.LogInputStream.init(LogInputStream.java:93)
>  at 
> org.apache.ratis.server.storage.LogInputStream.nextEntry(LogInputStream.java:120)
>  at 
> org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:111)
>  at 
> org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:133)
>  at 
> org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:110)
>  at 
> org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:151)
>  at 
> org.apache.ratis.server.storage.SegmentedRaftLog.open(SegmentedRaftLog.java:120)
>  at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:191)
>  at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:114)
>  at 
> org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:106)
>  at 
> org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:196)
>  at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>  at 
> java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
>  at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>  at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>  at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>  at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> 2018-10-26 10:26:03,671 INFO 
> org.apache.hadoop.ozone.web.netty.ObjectStoreRestHttpServer: Listening HDDS 
> REST traffic on /0.0.0.0:9880
> 2018-10-26 10:26:03,672 INFO org.apache.hadoop.ozone.HddsDatanodeService: 
> Started plug-in org.apache.hadoop.ozone.web.OzoneHddsDatanodeService@1e411d81
> 2018-10-26 10:26:03,676 INFO 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer: Attempting to 
> start container services.
> 2018-10-26 10:26:03,676 INFO 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
>  Starting XceiverServerRatis 0d7f5327-df16-40fe-ac88-7ed06e76a20f at port 9858
> 2018-10-26 10:26:03,702 ERROR 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine: 
> Unable to start the DatanodeState Machine
> java.io.IOException: java.lang.IllegalStateException: Corrupted log header: 
> ^@^@^@^@^@^@^@^@
>  at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:51)
>  at 
> org.apache.ratis.server.storage.LogInputStream.nextEntry(LogInputStream.java:123)
>  at 
> org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:111)
>  at 
> org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:133)
>  at 
> org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:110)
>  at 
> org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:151)
>  at 
> org.apache.ratis.server.storage.SegmentedRaftLog.open(SegmentedRaftLog.java:120)
>  at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:191)
>  at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java

[jira] [Created] (RATIS-380) RaftGroup#hashCode should take peers into account

2018-10-29 Thread Ted Yu (JIRA)
Ted Yu created RATIS-380:


 Summary: RaftGroup#hashCode should take peers into account
 Key: RATIS-380
 URL: https://issues.apache.org/jira/browse/RATIS-380
 Project: Ratis
  Issue Type: Improvement
Reporter: Ted Yu


Currently RaftGroup#hashCode produces a hash code based only on the groupId.

This is asymmetric with {{equals}}, where the peers are also considered.
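A minimal sketch of the suggested change (field names assumed from RaftGroup):
{code}
@Override
public int hashCode() {
  // include peers so that hashCode stays consistent with equals
  return java.util.Objects.hash(groupId, peers);
}
{code}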



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-29 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667748#comment-16667748
 ] 

Ted Yu commented on HBASE-21247:


Gentle ping [~busbey]

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to HBase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".
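> A rough sketch of the idea (not the committed patch; the clazz field on the 
> enum is an assumption):
> {code}
> Class<? extends WALProvider> clazz;
> try {
>   // first try the configured value as one of the known enum constants
>   clazz = Providers.valueOf(conf.get("hbase.wal.provider")).clazz;
> } catch (IllegalArgumentException notAnEnumConstant) {
>   // otherwise interpret the value as a WALProvider class name
>   clazz = conf.getClass("hbase.wal.provider", null, WALProvider.class);
> }
> {code}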



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-21180) findbugs incurs DataflowAnalysisException for hbase-server module

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-21180.

Resolution: Cannot Reproduce

> findbugs incurs DataflowAnalysisException for hbase-server module
> -
>
> Key: HBASE-21180
> URL: https://issues.apache.org/jira/browse/HBASE-21180
> Project: HBase
>  Issue Type: Task
>    Reporter: Ted Yu
>Priority: Minor
>
> Running findbugs, I noticed the following in the hbase-server module:
> {code}
> [INFO] --- findbugs-maven-plugin:3.0.4:findbugs (default-cli) @ hbase-server 
> ---
> [INFO] Fork Value is true
>  [java] The following errors occurred during analysis:
>  [java]   Error generating derefs for 
> org.apache.hadoop.hbase.generated.master.table_jsp._jspService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
>  [java] edu.umd.cs.findbugs.ba.DataflowAnalysisException: can't get 
> position -1 of stack
>  [java]   At 
> edu.umd.cs.findbugs.ba.Frame.getStackValue(Frame.java:250)
>  [java]   At 
> edu.umd.cs.findbugs.ba.Hierarchy.resolveMethodCallTargets(Hierarchy.java:743)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.DerefFinder.getAnalysis(DerefFinder.java:141)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:50)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.engine.bcel.UsagesRequiringNonNullValuesFactory.analyze(UsagesRequiringNonNullValuesFactory.java:31)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.impl.AnalysisCache.analyzeMethod(AnalysisCache.java:369)
>  [java]   At 
> edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getMethodAnalysis(AnalysisCache.java:322)
>  [java]   At 
> edu.umd.cs.findbugs.ba.ClassContext.getMethodAnalysis(ClassContext.java:1005)
>  [java]   At 
> edu.umd.cs.findbugs.ba.ClassContext.getUsagesRequiringNonNullValues(ClassContext.java:325)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.foundGuaranteedNullDeref(FindNullDeref.java:1510)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.reportBugs(NullDerefAndRedundantComparisonFinder.java:361)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.examineNullValues(NullDerefAndRedundantComparisonFinder.java:266)
>  [java]   At 
> edu.umd.cs.findbugs.ba.npe.NullDerefAndRedundantComparisonFinder.execute(NullDerefAndRedundantComparisonFinder.java:164)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.analyzeMethod(FindNullDeref.java:278)
>  [java]   At 
> edu.umd.cs.findbugs.detect.FindNullDeref.visitClassContext(FindNullDeref.java:209)
>  [java]   At 
> edu.umd.cs.findbugs.DetectorToDetector2Adapter.visitClass(DetectorToDetector2Adapter.java:76)
>  [java]   At 
> edu.umd.cs.findbugs.FindBugs2.analyzeApplication(FindBugs2.java:1089)
>  [java]   At edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:283)
>  [java]   At edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:393)
>  [java]   At edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1200)
>  [java] The following classes needed for analysis were missing:
>  [java]   accept
>  [java]   apply
>  [java]   run
>  [java]   test
>  [java]   call
>  [java]   exec
>  [java]   getAsInt
>  [java]   applyAsLong
>  [java]   storeFile
>  [java]   get
>  [java]   visit
>  [java]   compare
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.20.txt

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: (was: 21246.20.txt)

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-29 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667447#comment-16667447
 ] 

Ted Yu commented on HBASE-21246:


Patch v20 is intended to show how WALIdentity is used in various WAL-related 
classes, for both the read and write paths.
Changes to test code were not included so that the patch is easier for 
reviewers to review.

Note: WALFactory no longer imports FileSystem or Path.

I can put v20 on reviewboard if needed.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Status: Open  (was: Patch Available)

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-29 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.20.txt

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment, or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-29 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667300#comment-16667300
 ] 

Ted Yu commented on HBASE-21379:


Was ROW_INDEX_V1 used in any of the related table(s)?

Thanks

> RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication 
> enabled
> ---
>
> Key: HBASE-21379
> URL: https://issues.apache.org/jira/browse/HBASE-21379
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: justice
>Assignee: Zheng Hu
>Priority: Major
> Attachments: hbase-wal-p.tgz, log.tgz, wal-20181026.tgz, wal.tgz
>
>
>  log as follow:
> {code:java}
> // code placeholder
> 2018-10-24 09:22:42,381 INFO  [regionserver/11-3-19-10:16020] 
> wal.AbstractFSWAL: New WAL 
> /hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> 2018-10-24 09:23:05,151 ERROR 
> [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2]
>  regionserver.ReplicationSource: Unexpected exception in 
> regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2 
> currentPath=hdfs://11-3-18-67.JD.LOCAL:9000/hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:114)
> at 
> org.apache.hadoop.hbase.replication.ScopeWALEntryFilter.filterCell(ScopeWALEntryFilter.java:54)
> at 
> org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filterCells(ChainWALEntryFilter.java:90)
> at 
> org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filter(ChainWALEntryFilter.java:77)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.filterEntry(ReplicationSourceWALReader.java:234)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:170)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:133)
> 2018-10-24 09:23:05,153 INFO 
> [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2]
>  regionserver.HRegionServer: * STOPPING region server 
> '11-3-19-10.jd.local,16020,1540344155469' *
> {code}
> hbase wal -p output
> {code:java}
> // code placeholder
> writer Classes: ProtobufLogWriter AsyncProtobufLogWriter
> Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
> Sequence=15 , region=fee7a9465ced6ce9e319d37e9d71c63c at write timestamp=Wed 
> Oct 24 09:22:49 CST 2018
> row=8000, column=METAFAMILY:HBASE::REGION_EVENT
> value: \x08\x00\x12\x1Cmlaas:ump_host_second_181029\x1A 
> fee7a9465ced6ce9e319d37e9d71c63c 
> \x0E*\x06\x0A\x01f\x12\x01f2\x1F\x0A\x1311-3-19-10.JD.LOCAL\x10\x94}\x18\xCD\x9A\xAA\x9D\xEA,:Umlaas:ump_host_second_181029,8000,1540271129253.fee7a9465ced6ce9e319d37e9d71c63c.
> Sequence=9 , region=ba6684888d826328a6373435124dc1cd at write timestamp=Wed 
> Oct 24 09:22:49 CST 2018
> row=9100, column=METAFAMILY:HBASE::REGION_EVENT
> ...
> row=34975#00, column=f:\x09,
> value: 
> {"tp50":1,"avg":2,"min":0,"tp90":1,"max":3,"count":13,"tp99":2,"tp999":2,"error":0}
> row=349824#00, column=f:\x08\xFA
> value: 
> {"tp50":2,"avg":2,"min":0,"tp90":2,"max":98,"count":957,"tp99":3,"tp999":34,"error":0}
> row=349824#00, column=f:\x08\xD2
> value: 
> {"tp50":2,"avg":2,"min":0,"tp90":2,"max":43,"count":1842,"tp99":2,"tp999":31,"error":0}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.toStringMap(WALPrettyPrinter.java:336)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:290)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:421)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.main(WALPrettyPrinter.java:356)
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: SIGBUS (0xa) when using DataFrameWriter.insertInto

2018-10-27 Thread Ted Yu
I don't seem to find the log.
Can you double check?
Thanks
 Original message 
From: alexzautke 
Date: 10/27/18 8:54 AM (GMT-08:00)
To: user@spark.apache.org
Subject: Re: SIGBUS (0xa) when using DataFrameWriter.insertInto
Please also find attached a complete error log.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21175:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Artem.

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Fix For: 3.0.0
>
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v07.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within 
> the test.
> When SnapshotHFileCleaner.init() is called, no master parameter is passed in 
> {{params}}.
> When the chore runs the cleaner during the test, an NPE comes out of this 
> line in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for the null master, or pass the master instance 
> properly when constructing the cleaner instance.
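> A minimal sketch of the null-check option (not necessarily the committed fix):
> {code}
> if (master == null) {
>   // without a master we cannot resolve snapshot references; claim every
>   // file is referenced so that nothing is deleted by mistake
>   return Collections.emptyList();
> }
> return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}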



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665402#comment-16665402
 ] 

Ted Yu commented on HBASE-21175:


Very close.
{code}
+   * @throws IOException throws an IOException if there's problem creating a table
{code}
Is the description accurate?
Is there another scenario where an IOE is thrown?
{code}
+// Avoid passing a null master to CleanerChore, see HBASE-21175
{code}
nit:
Please move the start of the comment two spaces to the left.



> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within 
> the test.
> When SnapshotHFileCleaner.init() is called, no master parameter is passed in 
> {{params}}.
> When the chore runs the cleaner during the test, an NPE comes out of this 
> line in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for the null master, or pass the master instance 
> properly when constructing the cleaner instance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664301#comment-16664301
 ] 

Ted Yu commented on HBASE-21175:


lgtm

Pending QA

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within 
> the test.
> When SnapshotHFileCleaner.init() is called, no master parameter is passed in 
> {{params}}.
> When the chore runs the cleaner during the test, an NPE comes out of this 
> line in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for the null master, or pass the master instance 
> properly when constructing the cleaner instance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: 21387.v1.txt

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are 
> done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its 
> cache lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data 
> loss.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: (was: 21387.v1.txt)

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
>
> In a recent report from a customer, ExportSnapshot failed:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are 
> done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its 
> cache lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data 
> loss.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky

2018-10-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21216:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Dup of HBASE-21387

> TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
> --
>
> Key: HBASE-21216
> URL: https://issues.apache.org/jira/browse/HBASE-21216
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21216.v1.txt
>
>
> From 
> https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/
>  :
> {code}
> java.lang.AssertionError: Archived hfiles [] and table hfiles 
> [9ca09392705f425f9c916beedc10d63c] is missing snapshot 
> file:6739a09747e54189a4112a6d8f37e894
>   at 
> org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370)
> {code}
> The file appeared in the archive dir before the hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-archive/
> 2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |data/
> 2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |---default/
> 2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |--test/
> 2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-1237d57b63a7bdf067a930441a02514a/
> 2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |recovered.edits/
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---4.seqid
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-29e1700e09b51223ad2f5811105a4d51/
> 2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |fam/
> 2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---2c66a18f6c1a4074b84ffbb3245268c4
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---45bb396c6a5e49629e45a4d56f1e9b14
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---6739a09747e54189a4112a6d8f37e894
> {code}
> However, the archive dir became empty after the hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-archive/
> 2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-corrupt/
> {code}
> This led to the assertion failure.
> This test is one of the top flaky tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Status: Patch Available  (was: Open)

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are 
> done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its 
> cache lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data 
> loss.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: 21387.v1.txt

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are 
> done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its 
> cache lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data 
> loss.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21387:
--

 Summary: Race condition in snapshot cache refreshing leads to loss 
of snapshot files
 Key: HBASE-21387
 URL: https://issues.apache.org/jira/browse/HBASE-21387
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


In a recent report from a customer, ExportSnapshot failed:
{code}
2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
snapshot.SnapshotReferenceUtil: Can't find hfile: 
44f6c3c646e84de6a63fe30da4fcb3aa in the real 
(hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 or archive 
(hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 directory for the primary table. 
{code}
We found the following in the log:
{code}
2018-10-09 18:54:23,675 DEBUG 
[00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
cleaner.HFileCleaner: Removing: 
hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
from archive
{code}
The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
There are two callers of refreshCache: one from RefreshCacheTask#run and the 
other from SnapshotHFileCleaner.
Let's look at the code of refreshCache:
{code}
// if the snapshot directory wasn't modified since we last check, we are 
done
if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;

// 1. update the modified time
this.lastModifiedTime = dirStatus.getModificationTime();

// 2.clear the cache
this.cache.clear();
{code}
Suppose the RefreshCacheTask runs past the if check and sets 
this.lastModifiedTime.
The cleaner then executes refreshCache and returns immediately, since 
this.lastModifiedTime matches the modification time of the directory.
Now RefreshCacheTask clears the cache. By the time the cleaner performs its 
cache lookup, the cache is empty.
Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.
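One way to close the race (a sketch, not necessarily the attached patch; fs 
and snapshotDir are assumed field names) is to make the check-clear-repopulate 
sequence atomic and to publish lastModifiedTime only after the cache has been 
rebuilt:
{code}
private synchronized void refreshCache() throws IOException {
  FileStatus dirStatus = fs.getFileStatus(snapshotDir);
  if (dirStatus.getModificationTime() <= this.lastModifiedTime) {
    return; // nothing changed since the last refresh
  }
  this.cache.clear();
  // ... repopulate this.cache from the snapshot directory ...
  // publish the new timestamp last, so a concurrent caller can never see
  // an up-to-date timestamp together with an empty cache
  this.lastModifiedTime = dirStatus.getModificationTime();
}
{code}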



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20966) RestoreTool#getTableInfoPath should look for completed snapshot only

2018-10-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663904#comment-16663904
 ] 

Ted Yu commented on HBASE-20966:


RestoreTool is in the hbase-backup module, which exists in master only.

> RestoreTool#getTableInfoPath should look for completed snapshot only
> 
>
> Key: HBASE-20966
> URL: https://issues.apache.org/jira/browse/HBASE-20966
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 20966.v1.txt
>
>
> [~gubjanos] reported seeing the following error when running the backup / 
> restore test on Azure:
> {code}
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
> run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException:
>  Couldn't read snapshot info 
> from:wasb://hbase3-m30wub1711kond-115...@humbtesting8wua.blob.core.windows.net/user/hbase/backup_loc/backup_1532538064246/default/table_fnfawii1za/.hbase-snapshot/.tmp/.
> snapshotinfo
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
> run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
> org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:328)
> 2018-07-25 17:03:56,661|INFO|MainThread|machine.py:167 - 
> run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
> org.apache.hadoop.hbase.backup.util.RestoreServerUtil.getTableDesc(RestoreServerUtil.java:237)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - 
> run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
> org.apache.hadoop.hbase.backup.util.RestoreServerUtil.restoreTableAndCreate(RestoreServerUtil.java:351)
> 2018-07-25 17:03:56,662|INFO|MainThread|machine.py:167 - 
> run()||GUID=e7de7672-ebfd-402d-8f1f-68e7e8444cb1|at 
> org.apache.hadoop.hbase.backup.util.RestoreServerUtil.fullRestoreTable(RestoreServerUtil.java:186)
> {code}
> Here is the related code in the master branch:
> {code}
>   Path getTableInfoPath(TableName tableName) throws IOException {
>     Path tableSnapShotPath = getTableSnapshotPath(backupRootPath, tableName,
>         backupId);
>     Path tableInfoPath = null;
>     // can't build the path directly as the timestamp values are different
>     FileStatus[] snapshots = fs.listStatus(tableSnapShotPath);
> {code}
> In the above code, we don't exclude incomplete snapshots, leading to an 
> exception later when reading the snapshot info.
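> A minimal sketch of the fix idea (assuming completed snapshots live directly 
> under tableSnapShotPath while in-progress ones sit in the .tmp subdirectory, 
> as the error above suggests):
> {code}
> // skip the in-progress snapshot directory (.tmp) when listing
> FileStatus[] snapshots = fs.listStatus(tableSnapShotPath,
>     path -> !path.getName().equals(".tmp"));
> {code}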



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21318:
---
   Resolution: Fixed
Fix Version/s: 1.5.0
   Status: Resolved  (was: Patch Available)

> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21318.branch-1.001.patch, 
> HBASE-21318.master.001.patch, HBASE-21318.master.002.patch, 
> HBASE-21318.master.003.patch, HBASE-21318.master.004.patch
>
>
> Besides the case where a user enables hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, a user can also run this client as a ToolRunner 
> class/CLI and trigger the HFiles refresh directly.
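> For illustration, a sketch of what the CLI entry point could look like 
> (assuming RefreshHFilesClient implements org.apache.hadoop.util.Tool and 
> gains a no-arg constructor; both are assumptions):
> {code}
> public static void main(String[] args) throws Exception {
>   // ToolRunner injects the configuration and parses generic options
>   int exitCode = ToolRunner.run(HBaseConfiguration.create(),
>       new RefreshHFilesClient(), args);
>   System.exit(exitCode);
> }
> {code}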



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-13468) hbase.zookeeper.quorum supports ipv6 address

2018-10-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662816#comment-16662816
 ] 

Ted Yu commented on HBASE-13468:


lgtm

[~mdrob]:
Mind taking another look?

> hbase.zookeeper.quorum supports ipv6 address
> 
>
> Key: HBASE-13468
> URL: https://issues.apache.org/jira/browse/HBASE-13468
> Project: HBase
>  Issue Type: Bug
>Reporter: Mingtao Zhang
>Assignee: maoling
>Priority: Major
> Attachments: HBASE-13468.master.001.patch, 
> HBASE-13468.master.002.patch, HBASE-13468.master.003.patch, 
> HBASE-13468.master.004.patch
>
>
> I put an IPv6 address in hbase.zookeeper.quorum; by the time this string 
> reached the zookeeper code, the address was messed up, i.e. only '[1234' was 
> left.
> I was using pseudo-distributed mode with embedded zk = true.
> I downloaded 1.0.0; not sure which affected version should be listed here.
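> For illustration, a bracketed IPv6 quorum entry would look like the following 
> (addresses and port are made up); naively splitting on ':' truncates the 
> value right after the first group, which matches the '[1234' symptom:
> {code}
> hbase.zookeeper.quorum = [1234:5678::9abc]:2181,[1234:5678::9abd]:2181
> {code}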



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21318:
---
Fix Version/s: 2.2.0

> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, 
> HBASE-21318.master.004.patch
>
>
> Besides the case where a user enables hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, a user can also run this client as a ToolRunner 
> class/CLI and trigger the HFiles refresh directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662591#comment-16662591
 ] 

Ted Yu commented on HBASE-21318:


You can attach the branch-1 patch here for a QA run.

> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, 
> HBASE-21318.master.004.patch
>
>
> Besides the case where a user enables hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, a user can also run this client as a ToolRunner 
> class/CLI and trigger the HFiles refresh directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21318:


> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, 
> HBASE-21318.master.004.patch
>
>
> Other than enabling hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, the user can also run this client as a ToolRunner 
> class/CLI and call the HFile refresh directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] 2.1.0 RC0

2018-10-24 Thread Ted Yu
+1

InternalTopicIntegrationTest failed during the test suite run but passed on
rerun.

On Wed, Oct 24, 2018 at 3:48 AM Andras Beni 
wrote:

> +1 (non-binding)
>
> Verified signatures and checksums of release artifacts
> Performed quickstart steps on rc artifacts (both scala 2.11 and 2.12) and
> one built from tag 2.1.0-rc0
>
> Andras
>
> On Wed, Oct 24, 2018 at 10:17 AM Dong Lin  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for feature release of Apache Kafka 2.1.0.
> >
> > This is a major version release of Apache Kafka. It includes 28 new KIPs and
> > critical bug fixes. Please see the Kafka 2.1.0 release plan for more
> > details:
> >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=91554044
> >
> > Here are a few notable highlights:
> >
> > - Java 11 support
> > - Support for Zstandard, which achieves compression comparable to gzip with
> > higher compression and especially decompression speeds (KIP-110)
> > - Avoid expiring committed offsets for active consumer group (KIP-211)
> > - Provide Intuitive User Timeouts in The Producer (KIP-91)
> > - Kafka's replication protocol now supports improved fencing of zombies.
> > Previously, under certain rare conditions, if a broker became partitioned
> > from Zookeeper but not the rest of the cluster, then the logs of
> replicated
> > partitions could diverge and cause data loss in the worst case (KIP-320)
> > - Streams API improvements (KIP-319, KIP-321, KIP-330, KIP-353, KIP-356)
> > - Admin script and admin client API improvements to simplify admin
> > operation (KIP-231, KIP-308, KIP-322, KIP-324, KIP-338, KIP-340)
> > - DNS handling improvements (KIP-235, KIP-302)
> >
> > Release notes for the 2.1.0 release:
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/RELEASE_NOTES.html
> >
> > *** Please download, test and vote ***
> >
> > * Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * Javadoc:
> > http://home.apache.org/~lindong/kafka-2.1.0-rc0/javadoc/
> >
> > * Tag to be voted upon (off 2.1 branch) is the 2.1.0-rc0 tag:
> > https://github.com/apache/kafka/tree/2.1.0-rc0
> >
> > * Documentation:
> > http://kafka.apache.org/21/documentation.html
> >
> > * Protocol:
> > http://kafka.apache.org/21/protocol.html
> >
> > * Successful Jenkins builds for the 2.1 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-2.1-jdk8/38/
> >
> > Please test and verify the release artifacts and submit a vote for this
> RC,
> > or report any issues so we can fix them and get a new RC out ASAP.
> Although
> > this release vote requires PMC votes to pass, testing, votes, and bug
> > reports are valuable and appreciated from everyone.
> >
> > Cheers,
> > Dong
> >
>


[jira] [Updated] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21318:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch.

> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch, 
> HBASE-21318.master.004.patch
>
>
> Other than enabling hbase.coprocessor.region.classes with 
> RefreshHFilesEndPoint, the user can also run this client as a ToolRunner 
> class/CLI and call the HFile refresh directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-10-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662514#comment-16662514
 ] 

Ted Yu commented on HBASE-21149:


The test would pass for the hadoop 2 build, whose hadoop version is:
{code}
2.7.7
{code}
I think we can keep the test.
Once HADOOP-15850 is in a released 3.x version, we can upgrade the 
hadoop-three.version to catch up.

> TestIncrementalBackupWithBulkLoad may fail due to file copy failure
> ---
>
> Key: HBASE-21149
> URL: https://issues.apache.org/jira/browse/HBASE-21149
> Project: HBase
>  Issue Type: Test
>  Components: backuprestore
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 21149.v2.txt, HBASE-21149-v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
>  :
> {code}
> 2018-09-03 11:54:30,526 ERROR [Time-limited test] 
> impl.TableBackupClient(235): Unexpected Exception : Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
> java.io.IOException: Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
>   at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
>   at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> {code}
> However, some part of the test output was lost:
> {code}
> 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
> ...[truncated 398396 chars]...
> 8)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-10-24 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-21149.

   Resolution: Duplicate
Fix Version/s: (was: 3.0.0)

> TestIncrementalBackupWithBulkLoad may fail due to file copy failure
> ---
>
> Key: HBASE-21149
> URL: https://issues.apache.org/jira/browse/HBASE-21149
> Project: HBase
>  Issue Type: Test
>  Components: backuprestore
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 21149.v2.txt, HBASE-21149-v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
>  :
> {code}
> 2018-09-03 11:54:30,526 ERROR [Time-limited test] 
> impl.TableBackupClient(235): Unexpected Exception : Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
> java.io.IOException: Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
>   at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
>   at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> {code}
> However, some part of the test output was lost:
> {code}
> 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
> ...[truncated 398396 chars]...
> 8)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-10-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662451#comment-16662451
 ] 

Ted Yu commented on HBASE-21149:


Logged HBASE-21381 for refguide work.

> TestIncrementalBackupWithBulkLoad may fail due to file copy failure
> ---
>
> Key: HBASE-21149
> URL: https://issues.apache.org/jira/browse/HBASE-21149
> Project: HBase
>  Issue Type: Test
>  Components: backuprestore
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 21149.v2.txt, HBASE-21149-v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
>  :
> {code}
> 2018-09-03 11:54:30,526 ERROR [Time-limited test] 
> impl.TableBackupClient(235): Unexpected Exception : Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
> java.io.IOException: Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
>   at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
>   at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> {code}
> However, some part of the test output was lost:
> {code}
> 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
> ...[truncated 398396 chars]...
> 8)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21381) Document the hadoop versions using which backup and restore feature works

2018-10-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21381:
--

 Summary: Document the hadoop versions using which backup and 
restore feature works
 Key: HBASE-21381
 URL: https://issues.apache.org/jira/browse/HBASE-21381
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


HADOOP-15850 fixes a bug where CopyCommitter#concatFileChunks unconditionally 
tried to concatenate the files being DistCp'ed to the target cluster (even 
though the files are independent).

Following is the log snippet of the failed concatenation attempt:
{code}
2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
job_local1795473782_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
   
160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
 length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
   
2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
 length = 5142 aclEntries = null, xAttrs = null}
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
{code}
Backup and Restore uses DistCp to transfer files between clusters.
Without the fix from HADOOP-15850, the transfer would fail.

This issue is to document the hadoop versions which contain HADOOP-15850 so 
that users of the Backup and Restore feature know which hadoop versions they 
can use.
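For context, the fix amounts to concatenating listing entries only when they 
really are chunks of a single split source file; an illustrative guard (not the 
actual HADOOP-15850 patch; the method and its parameters are made up):
{code}
// Illustrative only: concatenate two consecutive copy-listing entries iff they
// are contiguous chunks of the same source file, never for independent files.
public class ChunkGuard {
  static boolean isNextChunkOfSameFile(String priorPath, long priorOffset,
      long priorLength, String currentPath, long currentOffset) {
    return priorPath.equals(currentPath)
        && priorOffset + priorLength == currentOffset;
  }

  public static void main(String[] args) {
    // Two independent HFiles (different paths) must not be concatenated:
    System.out.println(isNextChunkOfSameFile("/f/a_SeqId_205_", 0, 5100,
        "/f/b_SeqId_205_", 0)); // false
    // Contiguous chunks of one large file may be:
    System.out.println(isNextChunkOfSameFile("/big", 0, 1048576,
        "/big", 1048576)); // true
  }
}
{code}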



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662407#comment-16662407
 ] 

Ted Yu commented on HBASE-21247:


[~busbey]:
Mind taking another look ?

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HADOOP-15876) Use keySet().removeAll() to remove multiple keys from Map in AzureBlobFileSystemStore

2018-10-23 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15876:
---

 Summary: Use keySet().removeAll() to remove multiple keys from Map 
in AzureBlobFileSystemStore
 Key: HADOOP-15876
 URL: https://issues.apache.org/jira/browse/HADOOP-15876
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Looking at 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
 , {{removeDefaultAcl}} in particular:
{code}
for (Map.Entry<String, String> defaultAclEntry : 
defaultAclEntries.entrySet()) {
  aclEntries.remove(defaultAclEntry.getKey());
}
{code}
The above operation can be written this way:
{code}
aclEntries.keySet().removeAll(defaultAclEntries.keySet());
{code}
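The shorter form works because keySet() is a view backed by the map: removing 
keys through the view removes the mappings themselves. A self-contained 
illustration with made-up entries:
{code}
import java.util.HashMap;
import java.util.Map;

public class RemoveAllDemo {
  public static void main(String[] args) {
    Map<String, String> aclEntries = new HashMap<>();
    aclEntries.put("user:alice", "rwx");
    aclEntries.put("default:user:alice", "r-x");
    aclEntries.put("default:mask", "rwx");

    Map<String, String> defaultAclEntries = new HashMap<>();
    defaultAclEntries.put("default:user:alice", "r-x");
    defaultAclEntries.put("default:mask", "rwx");

    // keySet() is a live view, so this drops both default entries from the map.
    aclEntries.keySet().removeAll(defaultAclEntries.keySet());

    System.out.println(aclEntries); // {user:alice=rwx}
  }
}
{code}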



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15876) Use keySet().removeAll() to remove multiple keys from Map in AzureBlobFileSystemStore

2018-10-23 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15876:
---

 Summary: Use keySet().removeAll() to remove multiple keys from Map 
in AzureBlobFileSystemStore
 Key: HADOOP-15876
 URL: https://issues.apache.org/jira/browse/HADOOP-15876
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Looking at 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
 , {{removeDefaultAcl}} in particular:
{code}
for (Map.Entry<String, String> defaultAclEntry : 
defaultAclEntries.entrySet()) {
  aclEntries.remove(defaultAclEntry.getKey());
}
{code}
The above operation can be written this way:
{code}
aclEntries.keySet().removeAll(defaultAclEntries.keySet());
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-23 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661256#comment-16661256
 ] 

Ted Yu commented on HBASE-21342:


Mike:
Please go ahead.

Thanks

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, HBASE-21342.002.patch, 
> HBASE-21342.003.patch, HBASE-21342.004.patch, HBASE-21342.005.patch, 
> HBASE-21342.006.patch, HBASE-21342.007.patch, race.patch
>
>
> As mentioned in HBASE-15291, there is a race condition. If two secure bulk load 
> calls from the same UGI go into two different regions and one region finishes 
> earlier, it will close the bulk load fs, and the other region will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), it can cause a deadlock.
>  
> I have written a UT for this and fixed it using a reference counter.
>  
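A reference counter fits here because the shared FileSystem should only be 
closed by the last bulk load still using it; a minimal sketch of the idea 
(assumed names, not the actual HBASE-21342 patch):
{code}
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch: share one closeable resource (e.g. a FileSystem) across
// concurrent bulk loads; only the last releaser actually closes it.
class RefCountedCloseable<T extends AutoCloseable> {
  private final T resource;
  private final AtomicInteger refs = new AtomicInteger();

  RefCountedCloseable(T resource) {
    this.resource = resource;
  }

  T acquire() {
    refs.incrementAndGet();
    return resource;
  }

  void release() throws Exception {
    // A finished bulk load merely decrements; the close happens only when no
    // other region is still writing through the same shared handle.
    if (refs.decrementAndGet() == 0) {
      resource.close();
    }
  }
}
{code}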



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-23 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661038#comment-16661038
 ] 

Ted Yu commented on HBASE-21246:


Modified previous comment with more background information.

bq. concern about using a WALIdentity was having NoSuchLogException (or 
similar) littered all over the place.

NoSuchLogException can be thought of as analogous to FileNotFoundException in 
the FileSystem world. Through NoSuchLogException, we would consolidate WAL 
system error handling behind an abstraction suitable for multiple WAL backends 
(Kafka, Distributed Log, Ratis and distributed FileSystem).
The handling of NoSuchLogException would occur in no more places than where the 
current codebase handles FileNotFoundException.
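In code terms, the consolidation could look like the sketch below; 
NoSuchLogException and the reader interface are assumed names from the 
proposal, not committed HBase API:
{code}
import java.io.IOException;

// Hypothetical sketch of backend-agnostic WAL error handling; every type here
// is an assumed name from the design discussion.
class NoSuchLogException extends IOException {
}

interface WALReader extends AutoCloseable {
  boolean next() throws IOException;
  @Override
  void close() throws IOException;
}

class ReplicationSketch {
  static void drain(WALReader reader) throws IOException {
    try {
      while (reader.next()) {
        // ship the entry to the peer cluster ...
      }
    } catch (NoSuchLogException e) {
      // The same recovery path a FileSystem-backed WAL takes on
      // FileNotFoundException: the log was archived or removed, so skip it
      // and continue with the next item in the replication queue.
    } finally {
      reader.close();
    }
  }
}
{code}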

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-23 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659867#comment-16659867
 ] 

Ted Yu edited comment on HBASE-21246 at 10/23/18 5:03 PM:
--

bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in {{refreshSources}} method of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
The {{wal}} variable above is the String representation of the WAL.

In handling failed replication queue(s), for each queue Id there is a Set of 
WALs represented as Strings (the form persisted on zookeeper).
In order to handle the String WAL name (from zookeeper), we need 
{{walProvider.createWALIdentity}}, which accepts a String parameter.
So the {{walProvider.createWALIdentity}} call is the deserialization from the 
String form of the WAL name to a WALIdentity.
Here is corresponding code from current master branch:
{code}
  walsByGroup.forEach(wal -> src.enqueueLog(new Path(wal)));
{code}
createWALIdentity() ends up calling the FSWALIdentity ctor accepting a String 
when the WAL is backed by hdfs. For other WAL providers, a different 
WALIdentity implementation would be created.
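For illustration, the hdfs-backed identity could look roughly like this (an 
assumed shape; the authoritative version is in the attached patch):
{code}
import org.apache.hadoop.fs.Path;

// Rough sketch of an hdfs-backed WALIdentity; treat the field and method
// names as assumptions rather than the committed code.
interface WALIdentity {
  String getName();
}

class FSWALIdentity implements WALIdentity {
  private final Path path;

  // Deserializes the String form persisted on zookeeper back into an identity.
  FSWALIdentity(String walName) {
    this.path = new Path(walName);
  }

  @Override
  public String getName() {
    return path.getName();
  }

  Path getPath() {
    return path;
  }
}
{code}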



was (Author: yuzhih...@gmail.com):
bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in {{refreshSources}} method of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
The {{wal}} variable above is the String representation of WAL.

In handling failed replication queue(s), for each queue Id, there are Set of 
WALs represented using String (persisted form on zookeeper).
So {{walProvider.createWALIdentity}} call is deserialization from String form 
of WAL name to WALIdentity.
Here is corresponding code from current master branch:
{code}
  walsByGroup.forEach(wal -> src.enqueueLog(new Path(wal)));
{code}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String when 
WAL is backed by hdfs. For other WAL provider, different WALIdentity instance 
would be created.


> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659867#comment-16659867
 ] 

Ted Yu edited comment on HBASE-21246 at 10/23/18 3:33 AM:
--

bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in {{refreshSources}} method of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
The {{wal}} variable above is the String representation of WAL.

In handling failed replication queue(s), for each queue Id, there are Set of 
WALs represented using String (persisted form on zookeeper).
So {{walProvider.createWALIdentity}} call is deserialization from String form 
of WAL name to WALIdentity.
Here is corresponding code from current master branch:
{code}
  walsByGroup.forEach(wal -> src.enqueueLog(new Path(wal)));
{code}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String when 
WAL is backed by hdfs. For other WAL provider, different WALIdentity instance 
would be created.



was (Author: yuzhih...@gmail.com):
bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in refreshSources of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659978#comment-16659978
 ] 

Ted Yu commented on HBASE-21246:


I ran the failed tests with patch v8 and they passed.
{code}
[ERROR] Crashed tests:
[ERROR] org.apache.hadoop.hbase.master.TestSplitLogManager
[ERROR] ExecutionException The forked VM terminated without properly saying 
goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /testptch/hbase/hbase-server && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -enableassertions 
-Dhbase.build.id=2018-10-22T21:39:24Z -Xmx2800m 
-Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true 
-Djava.awt.headless=true -jar 
/testptch/hbase/hbase-server/target/surefire/surefirebooter6128037881631299296.jar
 /testptch/hbase/hbase-server/target/surefire 2018-10-22T21-40-02_474-jvmRun4 
surefire7294898340571802335tmp surefire_16957884523968105888244tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
{code}
It seems the Jenkins environment had an issue.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659867#comment-16659867
 ] 

Ted Yu commented on HBASE-21246:


bq. Why do we have a String constructor anyways?

One example usage for the ctor taking String is in refreshSources of 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
 :
{code}
for (SortedSet<String> walsByGroup : 
walsByIdRecoveredQueues.get(queueId).values()) {
  walsByGroup.forEach(wal -> 
src.enqueueLog(this.walProvider.createWALIdentity(wal)));
{code}
createWALIdentity() ends up calling FSWALIdentity ctor accepting String (when 
WAL is backed by hdfs).

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659842#comment-16659842
 ] 

Ted Yu commented on HBASE-21246:


bq. is the 'design' the thread attached to the document or do we read the body 
of the doc to find the design?

The design is outlined starting from the second paragraph of section 3, in the 
'Write Path' subsection.

The thread alongside the document provides background on how the design 
decision was reached.

bq. Has it been integrated

The current form of the WALIdentity design has integrated both the comments 
from the design doc and the comments on this JIRA.

bq. is the design ongoing?

I would say that if there is no objection to the current form, we can consider 
it the consensus.
I left the thread on the design doc open to give reviewers more time for 
evaluation.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659691#comment-16659691
 ] 

Ted Yu commented on HBASE-21246:


I have updated the link to point to the actual thread on the design doc.

bq. Is there a 'design' for WI?
Quoting my response from the discussion (for people who didn't have time to 
read the full thread):
{code}
From my experiments over HBASE-21246, I think it is preferable to have 
WALIdentity (please see the patch attached on HBASE-21246).
In hbase, we have the following related pairs:
HTable <-> TableName
HRegion <-> HRegionInfo

The pair of WAL instance and WALIdentity is similar.

* WALIdentity is lightweight - the classes referencing a WAL by WALIdentity 
wouldn't hold onto much resource
* WALProvider handles deserialization from the String form of the WAL name to 
WALIdentity
{code}
bq. Where is the 'outline'?

There is a code snippet in the paragraph around which the discussion thread 
was centered.
Please also see the subsequent paragraph below the code snippet.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659666#comment-16659666
 ] 

Ted Yu edited comment on HBASE-21246 at 10/22/18 9:03 PM:
--

Stack's questions seem to be at a higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit?disco=COODUIA

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.


was (Author: yuzhih...@gmail.com):
Stack's questions seem to be at higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659666#comment-16659666
 ] 

Ted Yu commented on HBASE-21246:


Stack's questions seem to be at a higher level. So allow me to respond to them 
first.
bq. Where is WALIdentity in the design doc?

See this thread on the doc:
https://docs.google.com/document/d/1o552MkKq9wF3BXY2nVcsCXBAImUH6r132Cxv9WHL3D8/edit

bq. Designed or are we trying variations? Or a particular variation?

What is presented in the patch on this JIRA follows what is outlined in the 
design doc.
w.r.t. the concrete fields / methods of WALIdentity (and implementing classes), 
we're open to discussion and will respect the community's consensus.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659601#comment-16659601
 ] 

Ted Yu commented on HBASE-21246:


Addressed the above comment in patch v8.

Waiting for more review.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.008.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch, 
> 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Attachment: 21247.v11.txt

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Status: Patch Available  (was: Open)

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v11.txt, 
> 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 
> 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Attachment: 21247.v10.txt

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v2.txt, 21247.v3.txt, 
> 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 
> 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659176#comment-16659176
 ] 

Ted Yu commented on HBASE-21247:


Patch v10 would show the exception for testCustomProvider.

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v10.txt, 21247.v2.txt, 21247.v3.txt, 
> 21247.v4.tst, 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 
> 21247.v8.txt, 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Status: Open  (was: Patch Available)

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 
> 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 
> 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This prevents additional WAL Providers from being supplied by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659175#comment-16659175
 ] 

Ted Yu commented on HBASE-21247:


If we go the above route, there is another issue we need to solve.
{code}
testCustomProvider(org.apache.hadoop.hbase.wal.TestWALFactory)  Time elapsed: 
0.275 sec  <<< ERROR!
java.io.IOException: couldn't set up WALProvider
at 
org.apache.hadoop.hbase.wal.TestWALFactory.testCustomProvider(TestWALFactory.java:754)
Caused by: java.lang.NoSuchMethodException: 
org.apache.hadoop.hbase.wal.SyncReplicationWALProvider.<init>()
at 
org.apache.hadoop.hbase.wal.TestWALFactory.testCustomProvider(TestWALFactory.java:754)
{code}
SyncReplicationWALProvider must wrap an existing provider.
When {{getMetaProvider}} calls {{createProvider}}, the no-arg constructor would 
be attempted on the provider class, resulting in the above exception.
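To make the failure mode concrete, here is a minimal, self-contained sketch 
(the real WALFactory#createProvider differs; the class and method names below 
are stand-ins): reflectively instantiating a provider through its no-arg 
constructor fails for wrapper providers such as SyncReplicationWALProvider, 
which only expose a constructor taking the wrapped provider.
{code}
import java.io.IOException;

public class ProviderCtorSketch {
  // stand-in for a wrapper provider: it has no no-arg constructor
  static class WrappingProvider {
    WrappingProvider(Object wrapped) {
    }
  }

  static Object createProvider(Class<?> clazz) throws IOException {
    try {
      // throws NoSuchMethodException for classes lacking a no-arg constructor
      return clazz.getDeclaredConstructor().newInstance();
    } catch (Exception e) {
      throw new IOException("couldn't set up WALProvider", e);
    }
  }

  public static void main(String[] args) throws IOException {
    // fails the same way testCustomProvider does above
    createProvider(WrappingProvider.class);
  }
}
{code}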

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 
> 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 
> 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This prevents additional WAL Providers from being supplied by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659071#comment-16659071
 ] 

Ted Yu commented on HBASE-21247:


Since the added logic is for the meta WAL Provider, how about moving that logic 
to {{getMetaProvider}}?
Attached patch v9.

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 
> 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 
> 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This prevents additional WAL Providers from being supplied by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21247) Custom WAL Provider cannot be specified by configuration whose value is outside the enums in Providers

2018-10-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21247:
---
Attachment: 21247.v9.txt

> Custom WAL Provider cannot be specified by configuration whose value is 
> outside the enums in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.tst, 
> 21247.v4.txt, 21247.v5.txt, 21247.v6.txt, 21247.v7.txt, 21247.v8.txt, 
> 21247.v9.txt
>
>
> Currently all the WAL Providers acceptable to hbase are specified in the 
> Providers enum of WALFactory.
> This prevents additional WAL Providers from being supplied by class name.
> This issue fixes the bug by allowing a new WAL Provider class name to be 
> specified via the config "hbase.wal.provider".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21318) Make RefreshHFilesClient runnable

2018-10-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658404#comment-16658404
 ] 

Ted Yu commented on HBASE-21318:


We're close.
{code}
HBase clusters are sharing 
{code}
Remove 'are'.

Looks good otherwise.

> Make RefreshHFilesClient runnable
> -
>
> Key: HBASE-21318
> URL: https://issues.apache.org/jira/browse/HBASE-21318
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0, 1.5.0, 2.1.2
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Attachments: HBASE-21318.master.001.patch, 
> HBASE-21318.master.002.patch, HBASE-21318.master.003.patch
>
>
> Besides enabling hbase.coprocessor.region.classes with RefreshHFilesEndPoint, 
> the user can also run this client as a ToolRunner class/CLI and call refresh 
> HFiles directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658392#comment-16658392
 ] 

Ted Yu commented on HBASE-21342:


[~mdrob]:
Can you review the patch one more time?

Thanks

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, HBASE-21342.002.patch, 
> HBASE-21342.003.patch, HBASE-21342.004.patch, HBASE-21342.005.patch, 
> race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If two 
> secure bulkload calls from the same UGI go into two different regions and one 
> region finishes earlier, it will close the bulk load fs, and the other region 
> will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in SecureBulkLoadListener.closeSrcFs), 
> a deadlock can occur.
>  
> I have written a UT for this and fixed it using a reference counter.
>  
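The reference counter mentioned above can be sketched as follows (a minimal, 
self-contained illustration; the field and method names are stand-ins for 
whatever the patch actually uses):
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class UgiRefCountSketch {
  private final ConcurrentHashMap<String, AtomicInteger> ugiReferenceCounts =
      new ConcurrentHashMap<>();

  void incrementUgiReference(String ugi) {
    ugiReferenceCounts.computeIfAbsent(ugi, k -> new AtomicInteger()).incrementAndGet();
  }

  /** Returns true only when the last concurrent bulk load for this UGI finished. */
  boolean decrementUgiReference(String ugi) {
    AtomicInteger count = ugiReferenceCounts.get(ugi);
    return count != null && count.decrementAndGet() == 0;
  }

  public static void main(String[] args) {
    UgiRefCountSketch mgr = new UgiRefCountSketch();
    mgr.incrementUgiReference("user1");  // first bulk load call
    mgr.incrementUgiReference("user1");  // overlapping second call
    System.out.println(mgr.decrementUgiReference("user1")); // false: fs still in use
    System.out.println(mgr.decrementUgiReference("user1")); // true: safe to closeAllForUGI
  }
}
{code}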



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658283#comment-16658283
 ] 

Ted Yu commented on HBASE-21342:


bq. It is in finally block.

Perhaps I was not very clear in the previous comment.

postBulkLoadHFile() throws IOE. When this happens, Java will not run the rest of 
the code in the finally block (decrementUgiReference in this case).
You can either move the decrementUgiReference call to the beginning of the 
finally block (since it doesn't throw IOE) or use a nested finally inside the 
current finally block.
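A runnable sketch of that point; postBulkLoadHFile() and decrementUgiReference() 
below are stand-ins for the methods in the patch under review:
{code}
import java.io.IOException;

public class FinallyOrderSketch {
  static void postBulkLoadHFile() throws IOException {
    throw new IOException("coprocessor hook failed");
  }

  static void decrementUgiReference() {
    System.out.println("ugi reference decremented");
  }

  public static void main(String[] args) {
    try {
      // ... bulk load work ...
    } finally {
      try {
        postBulkLoadHFile();          // may throw IOE
      } catch (IOException e) {
        System.out.println("postBulkLoadHFile threw: " + e.getMessage());
      } finally {
        decrementUgiReference();      // still runs even when the call above throws
      }
    }
  }
}
{code}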



> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, HBASE-21342.002.patch, 
> HBASE-21342.003.patch, HBASE-21342.004.patch, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If two 
> secure bulkload calls from the same UGI go into two different regions and one 
> region finishes earlier, it will close the bulk load fs, and the other region 
> will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in SecureBulkLoadListener.closeSrcFs), 
> a deadlock can occur.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-13468) hbase.zookeeper.quorum supports ipv6 address

2018-10-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658267#comment-16658267
 ] 

Ted Yu commented on HBASE-13468:


We should be able to use commons validator.

[~mdrob] may have more insight on how to avoid the errors seen at the end of 
https://builds.apache.org/job/PreCommit-HBASE-Build/14701/artifact/patchprocess/patch-javac-3.0.0.txt

Thanks
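A minimal sketch of that suggestion, assuming commons-validator is on the 
classpath (the quorum host below is an illustrative IPv6 literal):
{code}
import org.apache.commons.validator.routines.InetAddressValidator;

public class QuorumIpv6Check {
  public static void main(String[] args) {
    InetAddressValidator validator = InetAddressValidator.getInstance();
    String host = "2001:db8::1";  // illustrative entry from hbase.zookeeper.quorum
    System.out.println(host + " -> valid ipv6: " + validator.isValidInet6Address(host));
  }
}
{code}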

> hbase.zookeeper.quorum supports ipv6 address
> 
>
> Key: HBASE-13468
> URL: https://issues.apache.org/jira/browse/HBASE-13468
> Project: HBase
>  Issue Type: Bug
>Reporter: Mingtao Zhang
>Assignee: maoling
>Priority: Major
> Attachments: HBASE-13468.master.001.patch, 
> HBASE-13468.master.002.patch, HBASE-13468.master.003.patch
>
>
> I put an ipv6 address in hbase.zookeeper.quorum; by the time this string 
> reached the zookeeper code, the address was messed up, i.e. only '[1234' was 
> left.
> I started using pseudo mode with embedded zk = true.
> I downloaded 1.0.0; not sure which affected version should be here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21353) TestHBCKCommandLineParsing#testCommandWithOptions hangs on call to HBCK2#checkHBCKSupport

2018-10-20 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21353:
--

 Summary: TestHBCKCommandLineParsing#testCommandWithOptions hangs 
on call to HBCK2#checkHBCKSupport
 Key: HBASE-21353
 URL: https://issues.apache.org/jira/browse/HBASE-21353
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


I noticed the following when running 
TestHBCKCommandLineParsing#testCommandWithOptions :
{code}
"main" #1 prio=5 os_prio=31 tid=0x7f851c80 nid=0x1703 waiting on 
condition [0x70216000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00076d3055d8> (a 
java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
at 
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at 
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at 
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:564)
at 
org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:297)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:229)
at 
org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$11/502838712.run(Unknown
 Source)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:347)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:227)
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:127)
at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:93)
at org.apache.hbase.HBCK2.run(HBCK2.java:352)
at 
org.apache.hbase.TestHBCKCommandLineParsing.testCommandWithOptions(TestHBCKCommandLineParsing.java:62)
{code}
The test doesn't spin up an hbase cluster, hence the call to check hbck support 
hangs.

In HBCK2#run, we can refactor the code such that argument parsing is done prior 
to calling HBCK2#checkHBCKSupport.
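A hedged sketch of that refactoring; parseArgs and checkHBCKSupport below are 
simplified stand-ins for the real HBCK2 methods, which differ. The point is the 
ordering: option parsing completes before any cluster connection is attempted, 
so a parsing-only test cannot hang.
{code}
public class Hbck2RunOrderSketch {
  static String[] parseArgs(String[] args) {
    // pure option parsing: no cluster connection is made here
    if (args.length == 0) {
      throw new IllegalArgumentException("usage: <command> [options]");
    }
    return args;
  }

  static void checkHBCKSupport() {
    // stand-in for opening a connection and checking the cluster version
    System.out.println("connecting to cluster for version check...");
  }

  public static void main(String[] args) {
    String[] parsed = parseArgs(args);  // bad options fail fast, no hang
    checkHBCKSupport();                 // cluster access happens only afterwards
    System.out.println("running: " + String.join(" ", parsed));
  }
}
{code}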



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException

2018-10-20 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657916#comment-16657916
 ] 

Ted Yu commented on HBASE-20690:


When I ran TestRSGroupsOfflineMode with patch v2, there were a lot of log lines 
in the following form:
{code}
2018-10-20 17:06:46,721 INFO  
[org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-cn012.l42scl.hortonworks.com,40283,1540055196293]
 rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker(827): RSGroup 
table=hbase:rsgroup isOnline=true, regionCount=0, assignCount=0, 
rootMetaFound=true
{code}
The test didn't seem to make progress.

> Moving table to target rsgroup needs to handle TableStateNotFoundException
> --
>
> Key: HBASE-20690
> URL: https://issues.apache.org/jira/browse/HBASE-20690
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-20690.master.001.patch, 
> HBASE-20690.master.002.patch
>
>
> This is related code:
> {code}
>   if (targetGroup != null) {
> for (TableName table: tables) {
>   if (master.getAssignmentManager().isTableDisabled(table)) {
> LOG.debug("Skipping move regions because the table" + table + " 
> is disabled.");
> continue;
>   }
> {code}
> In a stack trace [~rmani] showed me:
> {code}
> 2018-06-06 07:10:44,893 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] 
> master.TableStateManager: Unable to get table demo:tbl1 state
> org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: 
> demo:tbl1
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331)
> at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768)
> at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logic should take potential TableStateNotFoundException into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21281) Update bouncycastle dependency.

2018-10-20 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21281:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21194) Add tests in TestCopyTable which exercise MOB feature

2018-10-20 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21194:
---
Summary: Add tests in TestCopyTable which exercise MOB feature  (was: Add 
tests in TestCopyTable which exercises MOB feature)

> Add tests in TestCopyTable which exercise MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Fix For: 3.0.0
>
> Attachments: 21194.v08.patch, HBASE-21194.v01.patch, 
> HBASE-21194.v02.patch, HBASE-21194.v03.patch, HBASE-21194.v06.patch, 
> HBASE-21194.v07.patch, HBASE-21194.v08.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21194) Add tests in TestCopyTable which exercises MOB feature

2018-10-20 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21194:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failure was not related.

Thanks for the patch, Artem.

> Add tests in TestCopyTable which exercises MOB feature
> --
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Fix For: 3.0.0
>
> Attachments: 21194.v08.patch, HBASE-21194.v01.patch, 
> HBASE-21194.v02.patch, HBASE-21194.v03.patch, HBASE-21194.v06.patch, 
> HBASE-21194.v07.patch, HBASE-21194.v08.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21194) Add tests in TestCopyTable which exercises MOB feature

2018-10-20 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21194:
---
Summary: Add tests in TestCopyTable which exercises MOB feature  (was: Add 
TestCopyTable which exercises MOB feature)

> Add tests in TestCopyTable which exercises MOB feature
> --
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: 21194.v08.patch, HBASE-21194.v01.patch, 
> HBASE-21194.v02.patch, HBASE-21194.v03.patch, HBASE-21194.v06.patch, 
> HBASE-21194.v07.patch, HBASE-21194.v08.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Priority: Critical  (was: Major)

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.1.2, 3.2.1
>
> Attachments: HADOOP-15850.branch-3.0.patch, HADOOP-15850.v2.patch, 
> HADOOP-15850.v3.patch, HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> HADOOP-15850.v6.patch, testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> isSplit() returns false for both. Otherwise the following from toString would 
> have been logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From the hbase side, we could specify one bulk loaded hfile per job, but that 
> defeats the purpose of using DistCp.
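A hedged sketch of the check named in the title (the actual patch may differ; 
this assumes hadoop-distcp on the classpath): the chunk-concatenation pass is 
skipped entirely when the job was never split into chunks.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tools.DistCpOptionSwitch;

public class ConcatChunksGuard {
  static boolean shouldConcatChunks(Configuration conf) {
    int blocksPerChunk =
        conf.getInt(DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
    // 0 means no file was split into chunks, so there is nothing to concatenate
    return blocksPerChunk > 0;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    System.out.println("concat needed: " + shouldConcatChunks(conf));
  }
}
{code}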



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21194:
---
Attachment: 21194.v08.patch

> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: 21194.v08.patch, HBASE-21194.v01.patch, 
> HBASE-21194.v02.patch, HBASE-21194.v03.patch, HBASE-21194.v06.patch, 
> HBASE-21194.v07.patch, HBASE-21194.v08.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21149) TestIncrementalBackupWithBulkLoad may fail due to file copy failure

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657119#comment-16657119
 ] 

Ted Yu commented on HBASE-21149:


Once HADOOP-15850 is integrated, I will mark this as a dup of HADOOP-15850.

> TestIncrementalBackupWithBulkLoad may fail due to file copy failure
> ---
>
> Key: HBASE-21149
> URL: https://issues.apache.org/jira/browse/HBASE-21149
> Project: HBase
>  Issue Type: Test
>  Components: backuprestore
>    Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21149.v2.txt, HBASE-21149-v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/master/471/testReport/junit/org.apache.hadoop.hbase.backup/TestIncrementalBackupWithBulkLoad/TestIncBackupDeleteTable/
>  :
> {code}
> 2018-09-03 11:54:30,526 ERROR [Time-limited test] 
> impl.TableBackupClient(235): Unexpected Exception : Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
> java.io.IOException: Failed copy from 
> hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/0f626c66493649daaf84057b8dd71a30_SeqId_205_,hdfs://localhost:53075/user/jenkins/test-data/ecd40bd0-cb93-91e0-90b5-7bfd5bb2c566/data/default/test-1535975627781/773f5709b645b46bd3840f9cfb549c5a/f/ad8df6415bd9459d9b3df76c588d79df_SeqId_205_
>  to hdfs://localhost:53075/backupUT/backup_1535975655488
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:351)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.copyBulkLoadedFiles(IncrementalTableBackupClient.java:219)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:198)
>   at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:320)
>   at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:605)
>   at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable(TestIncrementalBackupWithBulkLoad.java:104)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> {code}
> However, some part of the test output was lost:
> {code}
> 2018-09-03 11:53:36,793 DEBUG [RS:0;765c9ca5ea28:36357] regions
> ...[truncated 398396 chars]...
> 8)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657115#comment-16657115
 ] 

Ted Yu commented on HBASE-21194:


{code}
+  assertTrue("if the assertion fails, the message should give the mob row 
count (0)",
{code}
to:
{code}
+  assertTrue("The mob row count is 0 but should be > 0",
{code}
Please check the indentation level in several places:
e.g.
{code}
+  g = new Get(row2);
+  r = t2.get(g);
+  assertEquals(0, r.size());
+
+  } finally {
+  TEST_UTIL.deleteTable(tableName1);
{code}
The body of the try should be indented by 2 spaces.
Same with the body of the finally.

We're very close.

> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>      Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: HBASE-21194.v01.patch, HBASE-21194.v02.patch, 
> HBASE-21194.v03.patch, HBASE-21194.v06.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21281:


> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21281:
---
Status: Patch Available  (was: Reopened)

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-21281:


> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657079#comment-16657079
 ] 

Ted Yu commented on HBASE-21281:


Turns out that one more module needs the addition of the test dependency:

https://builds.apache.org/job/HBase%20Nightly/job/master/555//testReport/junit/org.apache.hadoop.hbase.coprocessor/TestSecureExport/health_checks___yetus_jdk8_hadoop3_checks___org_apache_hadoop_hbase_coprocessor_TestSecureExport/
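For reference, the addendum boils down to adding the renamed bouncycastle test 
dependency to that module's pom; a hedged sketch (coordinates follow the 1.47+ 
artifact renaming, with the version assumed to be managed in the parent pom):
{code}
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk15on</artifactId>
  <scope>test</scope>
</dependency>
{code}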

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21281) Update bouncycastle dependency.

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21281:
---
Attachment: 21281.addendum2.patch

> Update bouncycastle dependency.
> ---
>
> Key: HBASE-21281
> URL: https://issues.apache.org/jira/browse/HBASE-21281
> Project: HBase
>  Issue Type: Task
>  Components: dependencies, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 21281.addendum.patch, 21281.addendum2.patch, 
> HBASE-21281.001.branch-2.0.patch
>
>
> Looks like we still depend on bcprov-jdk16 for some x509 certificate 
> generation in our tests. Bouncycastle has moved beyond this in 1.47, changing 
> the artifact names.
> [http://www.bouncycastle.org/wiki/display/JA1/Porting+from+earlier+BC+releases+to+1.47+and+later]
> There are some API changes too, but it looks like we don't use any of these.
> It seems like we also have vestiges in the POMs from when we were depending 
> on a specific BC version that came in from Hadoop. We now have a 
> KeyStoreTestUtil class in HBase, which makes me think we can also clean up 
> some dependencies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656980#comment-16656980
 ] 

Ted Yu commented on HBASE-21194:


Please also look at checkstyle warnings.

Thanks

> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: HBASE-21194.v01.patch, HBASE-21194.v02.patch, 
> HBASE-21194.v03.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656978#comment-16656978
 ] 

Ted Yu commented on HBASE-21194:


{code}
+  private final static HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
{code}
Do we need to create a testing util for the counting method?
I tried the following which compiles:
{code}
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
index e2e4aec..f99d666 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
@@ -2310,11 +2310,11 @@ public class HBaseTestingUtility extends 
HBaseZKTestingUtility {
   /**
* Return the number of rows in the given table.
*/
-  public int countRows(final Table table) throws IOException {
+  static public int countRows(final Table table) throws IOException {
 return countRows(table, new Scan());
   }

-  public int countRows(final Table table, final Scan scan) throws IOException {
+  static public int countRows(final Table table, final Scan scan) throws IOException {
 try (ResultScanner results = table.getScanner(scan)) {
   int count = 0;
   while (results.next() != null) {
{code}
{code}
+  assertTrue("mob row count > 0", MOB_TEST_UTIL.countMobRows(t2) > 0);
{code}
Please refine the message - if the assertion fails, the message should give the 
mob row count (0).

Please look at other messages and refine if needed.

As for the rest of the count() methods, it is up to you.
Normally such method relocation can be done when circumstances need it, such as 
in this case where two tests can share the same code.

> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: HBASE-21194.v01.patch, HBASE-21194.v02.patch, 
> HBASE-21194.v03.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656904#comment-16656904
 ] 

Ted Yu commented on HBASE-21342:


{code}
+  interface InternalObserverForTest {
{code}
Since the interface has the VisibleForTesting annotation, the interface name 
doesn't have to include ForTest.
How about naming the interface InternalObserverForFSOperation or something 
similar?
{code}
+  decrementUgiReference(ugi);
{code}
Should the above be enclosed in a finally block?
{code}
+  public void testForRaceCondition() throws Exception {
{code}
There may be more tests added to this class in the future. If including the race 
condition in the test name makes the method name too long, can you add a comment 
describing the race?
{code}
+  class MyExceptionToAvoidRetry extends DoNotRetryIOException {};
{code}
Put the right curly on a separate line.
{code}
+  e.printStackTrace();
{code}
Please replace with a rethrow of the exception.
{code}
+Thread.sleep(3000); /// sleep 3s so the fs will be closed by the time we wakeup
{code}
Is it possible to use some other condition so that the wait is not hardcoded?
Sometime down the road, what if 3 seconds is not long enough?
{code}
+  } catch (InterruptedException e) {}
{code}
Please log and rethrow InterruptedIOException.

thanks for the great work.
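On the hardcoded sleep: a self-contained sketch of one alternative, using a 
CountDownLatch so the waiting thread proceeds exactly when the other side has 
closed the fs, however long that takes (the two "regions" below are stand-in 
threads, not the test's actual code):
{code}
import java.util.concurrent.CountDownLatch;

public class LatchInsteadOfSleep {
  public static void main(String[] args) throws InterruptedException {
    CountDownLatch fsClosed = new CountDownLatch(1);

    Thread regionA = new Thread(() -> {
      System.out.println("region A: closing filesystem");
      fsClosed.countDown();               // signal the event instead of relying on timing
    });

    Thread regionB = new Thread(() -> {
      try {
        fsClosed.await();                 // wait for the event, not a fixed 3 seconds
        System.out.println("region B: fs observed closed, continuing");
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // log-and-rethrow pattern from the review
        throw new RuntimeException(e);
      }
    });

    regionB.start();
    regionA.start();
    regionA.join();
    regionB.join();
  }
}
{code}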

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, HBASE-21342.002.patch, 
> HBASE-21342.003.patch, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If two 
> secure bulkload calls from the same UGI go into two different regions and one 
> region finishes earlier, it will close the bulk load fs, and the other region 
> will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in SecureBulkLoadListener.closeSrcFs), 
> a deadlock can occur.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656886#comment-16656886
 ] 

Ted Yu commented on HBASE-21194:


For the mob row counting method, can you add it to 
hbase-server/src/test/java/org/apache/hadoop/hbase/mob/MobTestUtil.java so that 
both test classes can utilize it?

For the assertion, can you add one more where the mob row count is > 0?

As for the threshold, we shouldn't set it to 0 since that is not how users 
normally use the Mob feature.

thanks
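A sketch of such a shared helper, assuming the hbase test classpath; the 
raw-mob-scan attribute below is an assumption about how the counting should be 
done, and the real MobTestUtil method may differ:
{code}
import java.io.IOException;

import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mob.MobConstants;
import org.apache.hadoop.hbase.util.Bytes;

public final class MobTestUtilSketch {
  private MobTestUtilSketch() {
  }

  public static int countMobRows(final Table table) throws IOException {
    Scan scan = new Scan();
    // assumption: scan raw mob cells so mob content is counted as stored
    scan.setAttribute(MobConstants.MOB_SCAN_RAW, Bytes.toBytes(Boolean.TRUE));
    try (ResultScanner results = table.getScanner(scan)) {
      int count = 0;
      while (results.next() != null) {
        count++;
      }
      return count;
    }
  }
}
{code}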

> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: HBASE-21194.v01.patch, HBASE-21194.v02.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add a variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v6.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, HADOOP-15850.v6.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> isSplit() returns false for both. Otherwise the following from toString would 
> have been logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From the hbase side, we could specify one bulk loaded hfile per job, but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21246:
---
Attachment: 21246.HBASE-20952.007.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.HBASE-20952.001.patch, 
> 21246.HBASE-20952.002.patch, 21246.HBASE-20952.004.patch, 
> 21246.HBASE-20952.005.patch, 21246.HBASE-20952.007.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation 
> can be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent a 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656447#comment-16656447
 ] 

Ted Yu commented on HBASE-21342:


Can you change the name of the test table to reflect the nature of the test?

Thanks

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, HBASE-21342.002.patch, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If two 
> secure bulkload calls from the same UGI go into two different regions and one 
> region finishes earlier, it will close the bulk load fs, and the other region 
> will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in SecureBulkLoadListener.closeSrcFs), 
> a deadlock can occur.
>  
> I have written a UT for this and fixed it using a reference counter.
>  
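
To illustrate the reference-counter approach described above (class and method names here are illustrative, not from the actual patch):
{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.hadoop.fs.FileSystem;

// Illustrative only: each bulk load call acquires the FileSystem and
// releases it when done; only the last release actually closes it.
class RefCountedFileSystem {
  private final FileSystem fs;
  private final AtomicInteger refCount = new AtomicInteger();

  RefCountedFileSystem(FileSystem fs) {
    this.fs = fs;
  }

  FileSystem acquire() {
    refCount.incrementAndGet();
    return fs;
  }

  void release() throws IOException {
    if (refCount.decrementAndGet() == 0) {
      fs.close();
    }
  }
}
{code}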



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656262#comment-16656262
 ] 

Ted Yu commented on HBASE-21342:


Added ClassRule to TestSecureBulkloadManager.

Let's see what the QA bot says.

Zhenlin:
Please add the Apache license header to TestSecureBulkloadManager in your next 
patch.
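
For reference, the standard ASF license header used across HBase source files:
{code}
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
{code}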

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If 
> two secure bulk load calls from the same UGI go into two different regions and 
> one region finishes earlier, it will close the bulk load fs, and the other 
> region will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), this can cause a deadlock.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21342:
---
Attachment: 21342.v1.txt

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If 
> two secure bulk load calls from the same UGI go into two different regions and 
> one region finishes earlier, it will close the bulk load fs, and the other 
> region will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), this can cause a deadlock.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21342:
---
Status: Patch Available  (was: Open)

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.7, 2.0.1, 1.4.4, 2.1.0, 3.0.0, 1.5.0, 1.3.3
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: 21342.v1.txt, race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If 
> two secure bulk load calls from the same UGI go into two different regions and 
> one region finishes earlier, it will close the bulk load fs, and the other 
> region will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), this can cause a deadlock.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656221#comment-16656221
 ] 

Ted Yu commented on HBASE-21194:


Thanks for working on this, Artem.
{code}
+cfd.setMobThreshold(102400);
{code}
Isn't the threshold too high? Looking at existing tests, the threshold is typically very low.

Please choose a lower threshold so that the data written exceeds it (by size).

Please refer to TestMobCompactor#countMobRows and related method(s) so that 
more assertions can be written after the copy table session.
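
As an illustration only (not the patch itself), a MOB-enabled family whose threshold is easily exceeded by test data could be built as:
{code}
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative only: with a 10-byte threshold, any value longer than
// 10 bytes written by the test is stored as a MOB cell.
ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
    .newBuilder(Bytes.toBytes("f"))
    .setMobEnabled(true)
    .setMobThreshold(10L)
    .build();
{code}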


> Add TestCopyTable which exercises MOB feature
> -
>
> Key: HBASE-21194
> URL: https://issues.apache.org/jira/browse/HBASE-21194
> Project: HBase
>  Issue Type: Test
>    Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: mob
> Attachments: HBASE-21194.v01.patch
>
>
> Currently TestCopyTable doesn't cover table(s) with MOB feature enabled.
> We should add variant that enables MOB on the table being copied and verify 
> that MOB content is copied correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656199#comment-16656199
 ] 

Ted Yu commented on HADOOP-15850:
-

Thanks for the review; it looks like this bug could have been discovered sooner.

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating a test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1.
> hbase MapReduceBackupCopyJob$BackupDistCp would create a listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> isSplit() returns false for both. Otherwise the following from toString would 
> have been logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> On the hbase side, we could specify one bulk loaded hfile per job, but that 
> defeats the purpose of using DistCp.
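
To make the title's proposal concrete, here is a hedged sketch of such a guard at the top of CopyCommitter#concatFileChunks (reading the config via DistCpOptionSwitch is my assumption; the committed fix may differ):
{code}
// Hedged sketch, not the committed patch: bail out of chunk concatenation
// when -blocksperchunk was never specified, i.e. entries are whole files.
int blocksPerChunk = conf.getInt(
    DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
if (blocksPerChunk <= 0) {
  return; // nothing was split into chunks, so there is nothing to concatenate
}
{code}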



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v5.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating a test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1.
> hbase MapReduceBackupCopyJob$BackupDistCp would create a listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> isSplit() returns false for both. Otherwise the following from toString would 
> have been logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> On the hbase side, we could specify one bulk loaded hfile per job, but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (HBASE-21342) FileSystem in use may get closed by other bulk load call in secure bulkLoad

2018-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655710#comment-16655710
 ] 

Ted Yu commented on HBASE-21342:


{code}
+import org.apache.commons.collections.map.HashedMap;
{code}
Did you intend to import HashMap instead?
{code}
+ * Created by mazhenlin on 2018/10/18.
{code}
Please drop the author line.
{code}
+public class TestSecureBulkloadManager implements InternalObserverForTest {
{code}
The test category annotation is missing.
The test is also missing a @ClassRule, e.g.:
{code}
  @ClassRule
  public static final HBaseClassTestRule CLASS_RULE =
  HBaseClassTestRule.forClass(TestCopyTable.class);
{code}
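
Putting those together, the missing annotations might look like this (the category choice below is a guess on my part):
{code}
@Category({ RegionServerTests.class, MediumTests.class })
public class TestSecureBulkloadManager {
  @ClassRule
  public static final HBaseClassTestRule CLASS_RULE =
      HBaseClassTestRule.forClass(TestSecureBulkloadManager.class);
  // ...
}
{code}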

> FileSystem in use may get closed by other bulk load call  in secure bulkLoad
> 
>
> Key: HBASE-21342
> URL: https://issues.apache.org/jira/browse/HBASE-21342
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>Reporter: mazhenlin
>Assignee: mazhenlin
>Priority: Major
> Attachments: race.patch
>
>
> As mentioned in [HBASE-15291|#HBASE-15291], there is a race condition. If 
> two secure bulk load calls from the same UGI go into two different regions and 
> one region finishes earlier, it will close the bulk load fs, and the other 
> region will fail.
>  
> Another case would be more serious. The FileSystem.close() function needs two 
> synchronized variables: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), this can cause a deadlock.
>  
> I have written a UT for this and fixed it using a reference counter.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21341) DeadServer shouldn't import unshaded Preconditions

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21341:
---
Attachment: 21341.v1.txt

> DeadServer shouldn't import unshaded Preconditions
> --
>
> Key: HBASE-21341
> URL: https://issues.apache.org/jira/browse/HBASE-21341
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21341.v1.txt
>
>
> DeadServer currently imports the unshaded Preconditions:
> {code}
> import com.google.common.base.Preconditions;
> {code}
> We should import the shaded version of Preconditions.
> This is the only place where an unshaded class from com.google.common is imported.
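
For reference, the shaded import would look like the following (assuming the usual hbase-thirdparty relocation prefix):
{code}
// Shaded guava, relocated by hbase-thirdparty:
import org.apache.hbase.thirdparty.com.google.common.base.Preconditions;
{code}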



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21341) DeadServer shouldn't import unshaded Preconditions

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21341:
---
Status: Patch Available  (was: Open)

> DeadServer shouldn't import unshaded Preconditions
> --
>
> Key: HBASE-21341
> URL: https://issues.apache.org/jira/browse/HBASE-21341
> Project: HBase
>  Issue Type: Bug
>    Reporter: Ted Yu
>    Assignee: Ted Yu
>Priority: Major
> Attachments: 21341.v1.txt
>
>
> DeadServer currently imports the unshaded Preconditions:
> {code}
> import com.google.common.base.Preconditions;
> {code}
> We should import the shaded version of Preconditions.
> This is the only place where an unshaded class from com.google.common is imported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21341) DeadServer shouldn't import unshaded Preconditions

2018-10-18 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21341:
--

 Summary: DeadServer shouldn't import unshaded Preconditions
 Key: HBASE-21341
 URL: https://issues.apache.org/jira/browse/HBASE-21341
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


DeadServer currently imports the unshaded Preconditions:
{code}
import com.google.common.base.Preconditions;
{code}
We should import the shaded version of Preconditions.

This is the only place where an unshaded class from com.google.common is imported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



