[jira] [Created] (YARN-6589) Recover all resources when NM restart
Yang Wang created YARN-6589:

Summary: Recover all resources when NM restart
Key: YARN-6589
URL: https://issues.apache.org/jira/browse/YARN-6589
Project: Hadoop YARN
Issue Type: Bug
Reporter: Yang Wang

When the NM restarts, containers are recovered. However, only the memory and vcores in the capability are recovered. All resources need to be recovered.

{code:title=ContainerImpl.java}
// resource capability had been updated before NM was down
this.resource = Resource.newInstance(recoveredCapability.getMemorySize(),
    recoveredCapability.getVirtualCores());
{code}

It should be like this:

{code:title=ContainerImpl.java}
// resource capability had been updated before NM was down
// need to recover all resources, not only
this.resource = Resources.clone(recoveredCapability);
{code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
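The difference between the two snippets in YARN-6589 can be illustrated with a small self-contained sketch. It does not use Hadoop's `Resource` or `Resources` classes: `SketchResource`, its key names (`memory-mb`, `vcores`, `gpu`), and both recovery methods are hypothetical stand-ins, assuming only that a capability may carry dimensions beyond memory and vcores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical multi-dimensional resource; NOT Hadoop's Resource class. */
final class SketchResource {
    private final Map<String, Long> values = new LinkedHashMap<>();

    /** Builds a resource that carries only the two classic dimensions. */
    static SketchResource of(long memoryMb, long vcores) {
        SketchResource r = new SketchResource();
        r.values.put("memory-mb", memoryMb);
        r.values.put("vcores", vcores);
        return r;
    }

    /** Adds or overwrites one named dimension (names are illustrative). */
    SketchResource with(String name, long value) {
        values.put(name, value);
        return this;
    }

    /** Returns the value of a dimension, or 0 when it is absent. */
    long get(String name) {
        return values.getOrDefault(name, 0L);
    }

    /** Copies every dimension, similar in spirit to cloning the capability. */
    SketchResource copy() {
        SketchResource r = new SketchResource();
        r.values.putAll(values);
        return r;
    }
}

public class RecoverySketch {
    /** Rebuilds from two fields only, so any other dimension is dropped. */
    static SketchResource recoverTwoFields(SketchResource recovered) {
        return SketchResource.of(recovered.get("memory-mb"), recovered.get("vcores"));
    }

    /** Copies the whole recovered capability. */
    static SketchResource recoverAll(SketchResource recovered) {
        return recovered.copy();
    }

    public static void main(String[] args) {
        SketchResource stored = SketchResource.of(4096, 2).with("gpu", 1);
        System.out.println(recoverTwoFields(stored).get("gpu")); // prints 0
        System.out.println(recoverAll(stored).get("gpu"));       // prints 1
    }
}
```

Copying the whole object also means the recovery path does not need to change when a new dimension is introduced, which is presumably why the report proposes cloning instead of listing fields.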
[jira] [Created] (YARN-6588) Add native-service AM log4j properties in classpath
Jian He created YARN-6588:

Summary: Add native-service AM log4j properties in classpath
Key: YARN-6588
URL: https://issues.apache.org/jira/browse/YARN-6588
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Jian He
Assignee: Jian He
[jira] [Created] (YARN-6587) Refactor of ResourceManager#startWebApp in a Util class
Giovanni Matteo Fumarola created YARN-6587:

Summary: Refactor of ResourceManager#startWebApp in a Util class
Key: YARN-6587
URL: https://issues.apache.org/jira/browse/YARN-6587
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola

This jira tracks the refactoring of ResourceManager#startWebApp into a util class, since the Router in YARN-5412 has to implement the same logic for filtering and authentication.
[jira] [Created] (YARN-6586) YARN to facilitate HTTPS in AM web server
Haibo Chen created YARN-6586:

Summary: YARN to facilitate HTTPS in AM web server
Key: YARN-6586
URL: https://issues.apache.org/jira/browse/YARN-6586
Project: Hadoop YARN
Issue Type: Improvement
Components: yarn
Affects Versions: 3.0.0-alpha2
Reporter: Haibo Chen
Assignee: Haibo Chen

MR AM today does not support HTTPS in its web server, so the traffic between RMWebProxy and MR AM is in clear text. MR cannot easily achieve this, mainly because MR AMs are untrusted by YARN.

A potential solution purely within MR, similar to what Spark has implemented, is to allow users, when they enable HTTPS in an MR job, to provide their own keystore file; the file is then uploaded to the distributed cache and localized for the MR AM container. The configuration users need to do is complex. More importantly, in typical deployments web browsers go through RMWebProxy to access the MR AM web server indirectly. To support MR AM HTTPS, RMWebProxy would therefore need to trust the user-provided keystore, which is problematic.

Alternatively, we can add an endpoint in the NM web server that acts as a proxy between the AM web server and RMWebProxy. RMWebProxy, when configured to do so, will send requests in HTTPS to the NM on which the AM is running, and the NM can then communicate with the local AM web server in HTTP. This adds one hop between RMWebProxy and the AM, but both MR and Spark can use such a solution.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/311/

[May 10, 2017 1:47:48 PM] (jlowe) YARN-6552. Increase YARN test timeouts from 1 second to 10 seconds.
[May 10, 2017 2:57:41 PM] (jlowe) MAPREDUCE-6882. Increase MapReduce test timeouts from 1 second to 10
[May 10, 2017 5:46:50 PM] (templedf) YARN-6475. Fix some long function checkstyle issues (Contributed by
[May 10, 2017 6:02:31 PM] (jlowe) HDFS-11745. Increase HDFS test timeouts from 1 second to 10 seconds.
[May 10, 2017 7:15:57 PM] (kihwal) HDFS-11755. Underconstruction blocks can be considered missing.
[May 10, 2017 9:33:33 PM] (liuml07) HDFS-11800. Document output of 'hdfs count -u' should contain PATHNAME.
[May 10, 2017 9:34:13 PM] (templedf) YARN-6571. Fix JavaDoc issues in SchedulingPolicy (Contributed by Weiwei
[May 10, 2017 9:49:25 PM] (Carlo Curino) YARN-6473. Create ReservationInvariantChecker to validate
[May 10, 2017 10:05:11 PM] (liuml07) HADOOP-14361. Azure: NativeAzureFileSystem.getDelegationToken() call
[May 11, 2017 5:25:28 AM] (cdouglas) HDFS-11681. DatanodeStorageInfo#getBlockIterator() should return an

-1 overall

The following subsystems voted -1:
    compile mvninstall unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc javac

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    Failed junit tests :

       hadoop.ha.TestZKFailoverControllerStress
       hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140
       hadoop.hdfs.TestReadStripedFileWithMissingBlocks
       hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
       hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170
       hadoop.hdfs.server.namenode.TestFSImage
       hadoop.hdfs.qjournal.server.TestJournalNode
       hadoop.hdfs.server.namenode.TestDecommissioningStatus
       hadoop.hdfs.TestSafeMode
       hadoop.hdfs.TestDFSUpgrade
       hadoop.hdfs.server.namenode.ha.TestBootstrapStandby
       hadoop.hdfs.TestDistributedFileSystem
       hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
       hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations
       hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives
       hadoop.hdfs.server.datanode.metrics.TestDataNodeOutlierDetectionViaMetrics
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
       hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100
       hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
       hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
       hadoop.hdfs.TestDFSShell
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060
       hadoop.hdfs.TestRollingUpgrade
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000
       hadoop.hdfs.web.TestWebHdfsTimeouts
       hadoop.hdfs.server.datanode.TestDataNodeUUID
       hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210
       hadoop.hdfs.TestFileAppend
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
       hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService
       hadoop.mapred.TestShuffleHandler
       hadoop.yarn.sls.TestSLSRunner
       hadoop.yarn.applications.distributedshell.TestDistributedShell
       hadoop.yarn.client.api.impl.TestAMRMClient
       hadoop.yarn.server.timeline.TestRollingLevelDB
       hadoop.yarn.server.timeline.TestTimelineDataManager
       hadoop.yarn.server.timeline.TestLeveldbTimelineStore
       hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore
       hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore
       hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
       hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector
       hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore
       hadoop.yarn.server.resourcemanager.TestRMRestart
       hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService
       hadoop.yarn.server.TestDiskFailures
[jira] [Created] (YARN-6585) RM fails to start when upgrading from 2.7 to 2.8 for clusters with node labels.
Eric Payne created YARN-6585:

Summary: RM fails to start when upgrading from 2.7 to 2.8 for clusters with node labels.
Key: YARN-6585
URL: https://issues.apache.org/jira/browse/YARN-6585
Project: Hadoop YARN
Issue Type: Bug
Reporter: Eric Payne

{noformat}
Caused by: java.io.IOException: Not all labels being replaced contained by known label collections, please check, new labels=[abc]
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.checkReplaceLabelsOnNode(CommonNodeLabelsManager.java:718)
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.replaceLabelsOnNode(CommonNodeLabelsManager.java:737)
	at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager.replaceLabelsOnNode(RMNodeLabelsManager.java:189)
	at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.loadFromMirror(FileSystemNodeLabelsStore.java:181)
	at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:208)
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:251)
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:265)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	... 13 more
{noformat}
[jira] [Resolved] (YARN-5841) Report only local collectors on node upon resync with RM after RM fails over
[ https://issues.apache.org/jira/browse/YARN-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen resolved YARN-5841.

Resolution: Not A Problem

> Report only local collectors on node upon resync with RM after RM fails over
>
> Key: YARN-5841
> URL: https://issues.apache.org/jira/browse/YARN-5841
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Varun Saxena
> Assignee: Haibo Chen
>
> As per the discussion on YARN-3359, we can potentially optimize the reporting of collectors to the RM after the RM fails over.
> Currently the NM reports all the collectors known to itself in its HB after resync with the RM. This means many NMs may report much the same set of collector infos in the first NM HB on reconnection.
> This JIRA is to explore how to optimize this flow and, if possible, fix it.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/400/

[May 10, 2017 10:29:47 AM] (aajisaka) HADOOP-14373. License error in org.apache.hadoop.metrics2.util.Servers.
[May 10, 2017 10:57:12 AM] (aajisaka) HADOOP-14400. Fix warnings from spotbugs in hadoop-tools. Contributed by
[May 10, 2017 1:47:48 PM] (jlowe) YARN-6552. Increase YARN test timeouts from 1 second to 10 seconds.
[May 10, 2017 2:57:41 PM] (jlowe) MAPREDUCE-6882. Increase MapReduce test timeouts from 1 second to 10
[May 10, 2017 5:46:50 PM] (templedf) YARN-6475. Fix some long function checkstyle issues (Contributed by
[May 10, 2017 6:02:31 PM] (jlowe) HDFS-11745. Increase HDFS test timeouts from 1 second to 10 seconds.
[May 10, 2017 7:15:57 PM] (kihwal) HDFS-11755. Underconstruction blocks can be considered missing.
[May 10, 2017 9:33:33 PM] (liuml07) HDFS-11800. Document output of 'hdfs count -u' should contain PATHNAME.
[May 10, 2017 9:34:13 PM] (templedf) YARN-6571. Fix JavaDoc issues in SchedulingPolicy (Contributed by Weiwei
[May 10, 2017 9:49:25 PM] (Carlo Curino) YARN-6473. Create ReservationInvariantChecker to validate
[May 10, 2017 10:05:11 PM] (liuml07) HADOOP-14361. Azure: NativeAzureFileSystem.getDelegationToken() call
[May 11, 2017 5:25:28 AM] (cdouglas) HDFS-11681. DatanodeStorageInfo#getBlockIterator() should return an

-1 overall

The following subsystems voted -1:
    findbugs unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    FindBugs :

       module:hadoop-common-project/hadoop-minikdc
       Possible null pointer dereference in org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value of called method. Dereferenced at MiniKdc.java:[line 368]

    FindBugs :

       module:hadoop-common-project/hadoop-auth
       org.apache.hadoop.security.authentication.server.MultiSchemeAuthenticationHandler.authenticate(HttpServletRequest, HttpServletResponse) makes inefficient use of keySet iterator instead of entrySet iterator. At MultiSchemeAuthenticationHandler.java:[line 192]

    FindBugs :

       module:hadoop-common-project/hadoop-common
       org.apache.hadoop.crypto.CipherSuite.setUnknownValue(int) unconditionally sets the field unknownValue. At CipherSuite.java:[line 44]
       org.apache.hadoop.crypto.CryptoProtocolVersion.setUnknownValue(int) unconditionally sets the field unknownValue. At CryptoProtocolVersion.java:[line 67]
       Possible null pointer dereference in org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to return value of called method. Dereferenced at FileUtil.java:[line 118]
       Possible null pointer dereference in org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path, File, Path, File) due to return value of called method. Dereferenced at RawLocalFileSystem.java:[line 387]
       Return value of org.apache.hadoop.fs.permission.FsAction.or(FsAction) ignored, but method has no side effect. At FTPFileSystem.java:[line 421]
       Useless condition: lazyPersist == true at this point. At CommandWithDestination.java:[line 502]
       org.apache.hadoop.io.DoubleWritable.compareTo(DoubleWritable) incorrectly handles double value. At DoubleWritable.java:[line 78]
       org.apache.hadoop.io.DoubleWritable$Comparator.compare(byte[], int, int, byte[], int, int) incorrectly handles double value. At DoubleWritable.java:[line 97]
       org.apache.hadoop.io.FloatWritable.compareTo(FloatWritable) incorrectly handles float value. At FloatWritable.java:[line 71]
       org.apache.hadoop.io.FloatWritable$Comparator.compare(byte[], int, int, byte[], int, int) incorrectly handles float value. At FloatWritable.java:[line 89]
       Possible null pointer dereference in org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) due to return value of called method. Dereferenced at IOUtils.java
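For readers unfamiliar with the "incorrectly handles double value" warnings above, the following is a minimal sketch of the general pattern such a warning usually points at. It is not the code in DoubleWritable or FloatWritable: `naiveCompare` is a hypothetical comparison written with `<` and `==`, shown next to the JDK's `Double.compare`, on the assumption that the flagged methods compare primitives in a similar way.

```java
public class FloatingCompareSketch {
    /**
     * Hypothetical comparison built from < and ==.
     * With NaN both operators are false, so the result falls through to 1.
     */
    static int naiveCompare(double a, double b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    /** Delegates to the JDK's total ordering for doubles. */
    static int totalOrderCompare(double a, double b) {
        return Double.compare(a, b);
    }

    public static void main(String[] args) {
        double nan = Double.NaN;

        // The naive form claims "greater" in both directions, which breaks
        // the compareTo contract (sgn(x.compareTo(y)) == -sgn(y.compareTo(x))).
        System.out.println(naiveCompare(nan, 1.0));      // prints 1
        System.out.println(naiveCompare(1.0, nan));      // prints 1

        // Double.compare orders NaN above every other value, consistently.
        System.out.println(totalOrderCompare(nan, 1.0)); // prints 1
        System.out.println(totalOrderCompare(1.0, nan)); // prints -1

        // Signed zeros: == treats them as equal, Double.compare does not.
        System.out.println(naiveCompare(0.0, -0.0));      // prints 0
        System.out.println(totalOrderCompare(0.0, -0.0)); // prints 1
    }
}
```

An inconsistent comparator matters most for sorting and for sorted collections, where a NaN key can produce an ordering that depends on the input order.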
[jira] [Resolved] (YARN-6581) Function ContainersMonitorImpl.MonitoringThread#run() is too long
[ https://issues.apache.org/jira/browse/YARN-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yufei Gu resolved YARN-6581.

Resolution: Duplicate

> Function ContainersMonitorImpl.MonitoringThread#run() is too long
>
> Key: YARN-6581
> URL: https://issues.apache.org/jira/browse/YARN-6581
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Labels: newbie
>
> It is almost 200 lines long, which makes it hard to read and maintain.