[jira] [Created] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-30 Thread Jianfei Jiang (Jira)
Jianfei Jiang created HDFS-15251:


 Summary: Add new zookeeper event type case after zk updated to 
3.5.x
 Key: HDFS-15251
 URL: https://issues.apache.org/jira/browse/HDFS-15251
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.2.1
Reporter: Jianfei Jiang


In ZooKeeper 3.5.x, KeeperState adds a new value named Closed, so a Closed 
case should be added to the switch, as it is not an unexpected ZooKeeper watch 
event state.
{code:java}
/** @deprecated */
@Deprecated
Unknown(-1),
Disconnected(0),
/** @deprecated */
@Deprecated
NoSyncConnected(1),
SyncConnected(3),
AuthFailed(4),
ConnectedReadOnly(5),
SaslAuthenticated(6),
Expired(-112),
Closed(7);{code}
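For illustration, a minimal watcher sketch (hypothetical class, not the actual 
Hadoop patch) that handles the Closed state explicitly instead of letting it 
fall through to the unexpected-state branch:
{code:java}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

// Sketch only: handle Closed (new in ZooKeeper 3.5.x) explicitly so it
// is not treated as an unexpected state.
public class ClosedAwareWatcher implements Watcher {
  @Override
  public void process(WatchedEvent event) {
    switch (event.getState()) {
      case SyncConnected:
        // connection (re)established; resume normal operation
        break;
      case Disconnected:
      case Expired:
        // existing reconnect / re-election handling goes here
        break;
      case Closed:
        // New in 3.5.x: the client was closed deliberately via
        // ZooKeeper#close(); no reconnect should be attempted.
        break;
      default:
        // genuinely unexpected states still land here
        break;
    }
  }
}
{code}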






[jira] [Created] (HDFS-15250) Setting `dfs.client.use.datanode.hostname` to true can crash the system because of unhandled UnresolvedAddressException

2020-03-30 Thread Ctest (Jira)
Ctest created HDFS-15250:


 Summary: Setting `dfs.client.use.datanode.hostname` to true can 
crash the system because of unhandled UnresolvedAddressException
 Key: HDFS-15250
 URL: https://issues.apache.org/jira/browse/HDFS-15250
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ctest


*Problem:*

`dfs.client.use.datanode.hostname` is set to false by default, which means the 
client will use the IP address of the datanode to connect, rather than its 
hostname.

In `org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer`:

 
{code:java}
try {
  Peer peer = remotePeerFactory.newConnectedPeer(inetSocketAddress, token,
      datanode);
  LOG.trace("nextTcpPeer: created newConnectedPeer {}", peer);
  return new BlockReaderPeer(peer, false);
} catch (IOException e) {
  LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
      + "{}", datanode);
  throw e;
}
{code}
 

If `dfs.client.use.datanode.hostname` is false, the client tries to connect via 
the IP address. If the IP address is illegal and the connection fails, an 
IOException is thrown from `newConnectedPeer` and handled.

If `dfs.client.use.datanode.hostname` is true, the client tries to connect via 
the hostname. If the hostname cannot be resolved, an UnresolvedAddressException 
is thrown from `newConnectedPeer`. However, UnresolvedAddressException is not a 
subclass of IOException, so `nextTcpPeer` does not handle this exception at 
all. This unhandled exception could crash the system.

 

*Solution:*

Since the method already handles an illegal IP address, an unresolvable 
hostname should be handled as well. One solution is to add the handling logic 
in `nextTcpPeer`:
{code:java}
} catch (IOException e) {
  LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
      + "{}", datanode);
  throw e;
} catch (UnresolvedAddressException e) {
  ... // handling logic
}{code}
I am very happy to provide a patch to do this.
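For illustration, a self-contained sketch of one possible handling strategy 
(an assumption, not the actual patch; the class name is hypothetical): wrap the 
runtime exception in an IOException so the existing IOException-based retry 
path can deal with it.
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Sketch only (not the Hadoop patch): convert UnresolvedAddressException
// into an IOException so callers that already handle I/O failures can
// fall back to another datanode instead of crashing.
public final class ResolveSketch {
  public static SocketChannel connect(InetSocketAddress addr)
      throws IOException {
    try {
      return SocketChannel.open(addr);
    } catch (UnresolvedAddressException e) {
      // An unresolvable hostname now fails the same way an
      // unreachable IP address already does.
      throw new IOException("Cannot resolve " + addr.getHostString(), e);
    }
  }
}
{code}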






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2020-03-30 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/

[Mar 29, 2020 3:17:02 PM] (inigoiri) HDFS-15239. Add button to go to the parent 
directory in the explorer.
[Mar 29, 2020 5:54:25 PM] (brahma) Preparing for 3.4.0 development
[Mar 29, 2020 6:14:20 PM] (brahma) update the hadoop.version property in the 
root pom.xml and
[Mar 29, 2020 9:10:25 PM] (ayushsaxena) HDFS-15245. Improve JournalNode web UI. 
Contributed by Jianfei Jiang.




-1 overall


The following subsystems voted -1:
asflicense findbugs pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

FindBugs :

   module:hadoop-cloud-storage-project/hadoop-cos 
   Redundant nullcheck of dir, which is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String); redundant null check 
at BufferPool.java:[line 66] 
   org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may 
expose internal representation by returning CosNInputStream$ReadBuffer.buffer 
At CosNInputStream.java:[line 87] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, 
byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long): new String(byte[]) At 
CosNativeFileSystemStore.java:[line 178] 
   org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, 
String, String, int) may fail to clean up java.io.InputStream; the obligation 
to clean up the resource created at CosNativeFileSystemStore.java:[line 252] 
is not discharged 

Failed junit tests :

   hadoop.mapreduce.TestMapreduceConfigFields 
   hadoop.mapred.TestNetworkedJob 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-compile-cc-root.txt
  [8.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-compile-javac-root.txt
  [428K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-checkstyle-root.txt
  [16M]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-patch-shellcheck.txt
  [16K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/diff-patch-shelldocs.txt
  [44K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/whitespace-eol.txt
  [9.9M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/whitespace-tabs.txt
  [1.1M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1454/artifact/out/xml.txt
  [20K]

   findbugs:

   

Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-03-30 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.util.TestDiskChecker 
   hadoop.util.TestReadWriteDiskValidator 
   hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.registry.secure.TestSecureLogins 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [324K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-compile-cc-root-jdk1.8.0_242.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-compile-javac-root-jdk1.8.0_242.txt
  [304K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/pathlen.txt
  [12K]

   pylint:

   The source tree stderr: 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/patch-pylint-stderr.txt
  []

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_242.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [180K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [236K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/640/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [12K]
   

[jira] [Created] (HDFS-15249) ThrottledAsyncChecker is not thread-safe.

2020-03-30 Thread Toshihiro Suzuki (Jira)
Toshihiro Suzuki created HDFS-15249:
---

 Summary: ThrottledAsyncChecker is not thread-safe.
 Key: HDFS-15249
 URL: https://issues.apache.org/jira/browse/HDFS-15249
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki


ThrottledAsyncChecker should be thread-safe because it can be used by multiple 
threads when we have multiple namespaces.

*checksInProgress* and *completedChecks* are a HashMap and a WeakHashMap 
respectively, neither of which is thread-safe. So we need to put them in a 
synchronized block whenever we access them.
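A minimal sketch of the intended guarding (names mirror the fields above; the 
actual patch may differ):
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch only: serialize all access to the two non-thread-safe maps
// behind a single lock so concurrent block pool threads cannot observe
// them mid-update.
public class ThrottledCheckerSketch<K, V> {
  private final Object lock = new Object();
  private final Map<K, V> checksInProgress = new HashMap<>();
  private final Map<K, V> completedChecks = new WeakHashMap<>();

  /** Returns a cached or in-flight check if present, else registers check. */
  public V schedule(K target, V check) {
    synchronized (lock) {
      V completed = completedChecks.get(target);
      if (completed != null) {
        return completed;           // recently checked; reuse the result
      }
      V inFlight = checksInProgress.putIfAbsent(target, check);
      return inFlight != null ? inFlight : check;
    }
  }

  /** Moves a finished check from in-progress to completed. */
  public void complete(K target, V result) {
    synchronized (lock) {
      checksInProgress.remove(target);
      completedChecks.put(target, result);
    }
  }
}
{code}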






[jira] [Resolved] (HDFS-14503) ThrottledAsyncChecker throws NPE during block pool initialization

2020-03-30 Thread Toshihiro Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki resolved HDFS-14503.
-
Resolution: Duplicate

> ThrottledAsyncChecker throws NPE during block pool initialization 
> --
>
> Key: HDFS-14503
> URL: https://issues.apache.org/jira/browse/HDFS-14503
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Yiqun Lin
>Priority: Major
>
> ThrottledAsyncChecker throws NPE during block pool initialization. The error 
> leads to block pool registration failure.
> The exception
> {noformat}
> 2019-05-20 01:02:36,003 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected exception in block pool Block pool  (Datanode Uuid 
> x) service to xx.xx.xx.xx/xx.xx.xx.xx
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker$LastCheckResult.access$000(ThrottledAsyncChecker.java:211)
> at 
> org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker.schedule(ThrottledAsyncChecker.java:129)
> at 
> org.apache.hadoop.hdfs.server.datanode.checker.DatasetVolumeChecker.checkAllVolumes(DatasetVolumeChecker.java:209)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3387)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1508)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:272)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:768)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looks like this error occurs because the {{WeakHashMap}}-typed map 
> {{completedChecks}} has removed the target entry while we still try to get 
> it. Although we check for the entry before getting it, there is still a 
> chance that the get returns null. We hit a corner case for this: in 
> federation mode, with two block pools in the DN, {{ThrottledAsyncChecker}} 
> schedules two identical health checks for the same volume.
> {noformat}
> 2019-05-20 01:02:36,000 INFO 
> org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker: 
> Scheduling a check for /hadoop/2/hdfs/data/current
> 2019-05-20 01:02:36,000 INFO 
> org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker: 
> Scheduling a check for /hadoop/2/hdfs/data/current
> {noformat}
> {{completedChecks}} cleans up the entry for one successful check after 
> {{completedChecks#get}} is called. After this, the get for the other check 
> returns null.
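To make the race concrete, a small self-contained illustration (hypothetical 
code, not the Hadoop implementation) of how a {{WeakHashMap}} entry can vanish 
between a containsKey() and a later get():
{code:java}
import java.util.Map;
import java.util.WeakHashMap;

// Illustration only: a WeakHashMap entry can be reclaimed as soon as its
// key loses all strong references, so check-then-get is not reliable.
public class WeakMapRace {
  public static void main(String[] args) throws InterruptedException {
    Map<Object, String> completedChecks = new WeakHashMap<>();
    Object volume = new Object();
    completedChecks.put(volume, "HEALTHY");

    System.out.println(completedChecks.containsKey(volume)); // true
    volume = null;      // drop the last strong reference to the key
    System.gc();        // hint: the weak entry may now be cleared
    Thread.sleep(100);  // give reference processing time to run
    System.out.println(completedChecks.size()); // often 0 by now
    // In ThrottledAsyncChecker the second block pool's check lands in
    // this window: get() returns null and dereferencing it produces
    // the NullPointerException in the stack trace above.
  }
}
{code}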


