[
https://issues.apache.org/jira/browse/HBASE-22607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995824#comment-16995824
]
Mingliang Liu commented on HBASE-22607:
---------------------------------------
[~AK2019] Thanks for providing more context. This is an interesting scenario. I
have never reproduced the UT failure. Your use case could be common, so I
dug a bit further.
{quote}
The error seemed to be in TestExportSnapshotNoCluster class. Correct me if I am
wrong.
{quote}
The test error comes from {{TestExportSnapshot}} because
{{TestExportSnapshotNoCluster}} uses the static helper test method
{{TestExportSnapshot::testExportFileSystemState}}. So the v0 addendum patch was
meant to fix {{TestExportSnapshot}}.
{quote}
I changed hbase-mapreduce/target/test-classes/hbase-site.xml by replacing
'hdfs://localhost:35345' to
'file:/hbase/hbase-mapreduce/target/test-data/35120b7a-8ae0-1738-09a2-497820fe4ff9/.hbase-snapshot/tableWithRefsV1'
that solved the error.
{quote}
This is very interesting. First, I don't see the default FS being set in the
{{hbase-mapreduce/src/test/resources/hbase-site.xml}} source file, so I'm not
sure what changes that file to the HDFS value. If the setting is missing, each
UT (mostly via {{HBaseTestingUtility}}) will set the default FS to the MiniDFS
cluster path. However, this UT expects no cluster, and I assume we don't need a
DFS cluster either. So perhaps
{{hbase-mapreduce/target/test-classes/hbase-site.xml}} contains a stale value
written by other test classes?
To solve that, you can:
- delete {{hdfs://localhost:35345}} from your
{{hbase-mapreduce/target/test-classes/hbase-site.xml}} file, or
- add {{conf.set(FileSystem.FS_DEFAULT_NAME_KEY, testDir.toString());}} to the
{{TestExportSnapshotNoCluster::setUpBaseConf}} method.
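To illustrate why a stale default FS value would produce {{Connection refused}}, here is a rough, standard-library-only sketch. It is a simplified model, not Hadoop's actual code: the class {{DefaultFsSketch}} and the method {{qualify}} are hypothetical names, and the sample path is made up. The assumption is that a path without a scheme inherits the scheme and authority of the configured default FS, so a leftover {{hdfs://localhost:35345}} would send a local test directory to a NameNode that is no longer running.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DefaultFsSketch {

    /**
     * Simplified model (an assumption, not Hadoop's implementation):
     * a path without a scheme takes the scheme and authority of the
     * default FS; a path that already has a scheme is left untouched.
     */
    static URI qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already fully qualified
        }
        URI base = URI.create(defaultFs);
        try {
            return new URI(base.getScheme(), base.getAuthority(), p.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical test directory, for illustration only.
        String testDir = "/tmp/test-data/.hbase-snapshot";

        // Stale value left in hbase-site.xml: the path resolves to HDFS.
        System.out.println(qualify("hdfs://localhost:35345", testDir));

        // Default FS pointed at the local file system: the path stays local.
        System.out.println(qualify("file:///", testDir));
    }
}
```

Under this model, either of the two steps above has the same effect: the test directory no longer resolves against the stale HDFS address.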
I have attached a new patch [^HBASE-22607.addendum.001.patch] which should fail
if we take neither of the above steps, and succeed if we take either one. Please
try it. Since I think this is a corner case where
{{hbase-mapreduce/target/test-classes/hbase-site.xml}} is somehow corrupted, I
guess we can push this one-line fix into our code? We would need to revert the
change to {{hbase-mapreduce/src/test/resources/hbase-site.xml}} in the above
patch. I can file a new JIRA if needed. [~stack]
> TestExportSnapshotNoCluster::testSnapshotWithRefsExportFileSystemState()
> fails intermittently
> ---------------------------------------------------------------------------------------------
>
> Key: HBASE-22607
> URL: https://issues.apache.org/jira/browse/HBASE-22607
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 3.0.0, 2.2.0, 2.0.6
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.9
>
> Attachments: HBASE-22607.000.patch, HBASE-22607.001.patch,
> HBASE-22607.002.patch, HBASE-22607.addendum.000.patch,
> HBASE-22607.addendum.001.patch
>
>
> In previous runs, test
> {{TestExportSnapshotNoCluster.testSnapshotWithRefsExportFileSystemState}}
> fails intermittently with {{java.net.ConnectException: Connection refused}}
> exception, see build
> [510|https://builds.apache.org/job/PreCommit-HBASE-Build/510/testReport/org.apache.hadoop.hbase.snapshot/TestExportSnapshotNoCluster/testSnapshotWithRefsExportFileSystemState/],
>
> [545|https://builds.apache.org/job/PreCommit-HBASE-Build/545/testReport/org.apache.hadoop.hbase.snapshot/TestExportSnapshotNoCluster/testSnapshotWithRefsExportFileSystemState/],
> and
> [556|https://builds.apache.org/job/PreCommit-HBASE-Build/556/testReport/org.apache.hadoop.hbase.snapshot/TestExportSnapshotNoCluster/testSnapshotWithRefsExportFileSystemState/].
> One sample exception looks like:
> {quote}
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
> at com.sun.proxy.$Proxy20.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1630)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1614)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:900)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:961)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1537)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1580)
> at
> org.apache.hadoop.hbase.util.CommonFSUtils.listStatus(CommonFSUtils.java:693)
> at
> org.apache.hadoop.hbase.util.FSTableDescriptors.getCurrentTableInfoStatus(FSTableDescriptors.java:448)
> at
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:429)
> at
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:410)
> at
> org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptorForTableDirectory(FSTableDescriptors.java:763)
> at
> org.apache.hadoop.hbase.snapshot.SnapshotTestingUtils$SnapshotMock.createTable(SnapshotTestingUtils.java:675)
> at
> org.apache.hadoop.hbase.snapshot.SnapshotTestingUtils$SnapshotMock.createSnapshot(SnapshotTestingUtils.java:653)
> at
> org.apache.hadoop.hbase.snapshot.SnapshotTestingUtils$SnapshotMock.createSnapshot(SnapshotTestingUtils.java:647)
> at
> org.apache.hadoop.hbase.snapshot.SnapshotTestingUtils$SnapshotMock.createSnapshotV2(SnapshotTestingUtils.java:637)
> at
> org.apache.hadoop.hbase.snapshot.TestExportSnapshotNoCluster.testSnapshotWithRefsExportFileSystemState(TestExportSnapshotNoCluster.java:80)
> {quote}
> It seems that somehow the rootdir filesystem is not LocalFileSystem but
> HDFS. I have not dug deeper into why this happens since it fails
> intermittently and I cannot reproduce it locally. Since this tests the
> export snapshot tool without a cluster, we can enforce the use of
> LocalFileSystem; this is not a breaking change.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)