[jira] [Resolved] (MAPREDUCE-7478) [Decommission]Show Info Log for Repeated Useless refreshNode Operation
[ https://issues.apache.org/jira/browse/MAPREDUCE-7478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang resolved MAPREDUCE-7478. Resolution: Abandoned > [Decommission]Show Info Log for Repeated Useless refreshNode Operation > -- > > Key: MAPREDUCE-7478 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7478 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: wuchang >Priority: Major > > https://github.com/apache/hadoop/pull/6921 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7478) [Decommission]Show Info Log for Repeated Useless refreshNode Operation
wuchang created MAPREDUCE-7478: -- Summary: [Decommission]Show Info Log for Repeated Useless refreshNode Operation Key: MAPREDUCE-7478 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7478 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: wuchang https://github.com/apache/hadoop/pull/6921 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7475) Fix non-idempotent unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-7475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved MAPREDUCE-7475. - Hadoop Flags: Reviewed Resolution: Fixed > Fix non-idempotent unit tests > - > > Key: MAPREDUCE-7475 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7475 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 > Environment: Ubuntu 22.04, Java 17 >Reporter: Kaiyao Ke >Assignee: Kaiyao Ke >Priority: Minor > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > Original Estimate: 10m > Remaining Estimate: 10m > > 2 tests are not idempotent and fails upon repeated execution within the same > JVM instance due to self-induced state pollution. Specifically, these tests > try to make the directory TEST_ROOT_DIR and write to it. The tests do not > clean up (remove) the directory after execution. Therefore, in the second > execution, TEST_ROOT_DIR would already exist and the exception `Could not > create test dir` would be thrown. Below are the 2 non-idempotent tests: > * org.apache.hadoop.mapred.TestOldCombinerGrouping.testCombiner > * org.apache.hadoop.mapreduce.TestNewCombinerGrouping.testCombiner -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
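A minimal sketch of the kind of cleanup that makes such tests idempotent, assuming the test keeps its TEST_ROOT_DIR in a field; the class and directory names below are illustrative, not the actual patch:
{code:java}
import java.io.File;
import org.apache.hadoop.fs.FileUtil;
import org.junit.After;

public class CombinerGroupingCleanupExample {
  // Hypothetical stand-in for the TEST_ROOT_DIR used by the real tests.
  private static final File TEST_ROOT_DIR =
      new File(System.getProperty("test.build.data", "/tmp"), "TestCombinerGrouping");

  @After
  public void cleanup() {
    // Remove the directory so a repeated run in the same JVM can recreate it.
    FileUtil.fullyDelete(TEST_ROOT_DIR);
  }
}
{code}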
[jira] [Resolved] (MAPREDUCE-7474) [ABFS] Improve commit resilience and performance in Manifest Committer
[ https://issues.apache.org/jira/browse/MAPREDUCE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7474. --- Fix Version/s: 3.3.9 3.5.0 3.4.1 Resolution: Fixed > [ABFS] Improve commit resilience and performance in Manifest Committer > -- > > Key: MAPREDUCE-7474 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7474 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.4.0, 3.3.6 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > * Manifest committer is not resilient to rename failures on task commit > without HADOOP-18012 rename recovery enabled. > * large burst of delete calls noted: are they needed? > relates to HADOOP-19093 but takes a more minimal approach with goal of > changes in manifest committer only. > Initial proposed changes > * retry recovery on task commit rename, always (repeat save, delete, rename) > * audit delete use and see if it can be pruned -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7476) Follow up of https://issues.apache.org/jira/browse/MAPREDUCE-7475 - detected 5 more non-idempotent tests(pass in the first run but fails in repeated runs in the same
Kaiyao Ke created MAPREDUCE-7476: Summary: Follow up of https://issues.apache.org/jira/browse/MAPREDUCE-7475 - detected 5 more non-idempotent tests(pass in the first run but fails in repeated runs in the same JVM) Key: MAPREDUCE-7476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7476 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Kaiyao Ke Similar to https://issues.apache.org/jira/browse/MAPREDUCE-7475, 5 more non-idempotent unit tests were detected. The following two tests do not reset `NotificationServlet.counter`, so repeated runs throw assertion failures due to accumulation: * org.apache.hadoop.mapred.TestClusterMRNotification#testMR * org.apache.hadoop.mapred.TestLocalMRNotification#testMR The following test does not remove the key `AMParams.ATTEMPT_STATE`, so repeated runs of the test never see the attempt-state as missing: * org.apache.hadoop.mapreduce.v2.app.webapp.TestAppController.testAttempts The following test fully deletes `TEST_ROOT_DIR` after execution, so repeated runs will throw a `DiskErrorException`: * org.apache.hadoop.mapred.TestMapTask#testShufflePermissions The following test does not restore the static variable `statusUpdateTimes` after execution, so consecutive runs throw an `AssertionError`: * org.apache.hadoop.mapred.TestTaskProgressReporter#testTaskProgress -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
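For the counter-accumulation cases, the usual remedy is to reset the shared static state before each run. A hedged sketch of the idea, using a stand-in class because the real NotificationServlet counters may not be directly accessible:
{code:java}
import org.junit.Before;

public class NotificationResetExample {
  // Hypothetical stand-in for NotificationServlet's static counters.
  static class FakeNotificationServlet {
    static int counter;
    static int failureCounter;
  }

  @Before
  public void resetSharedState() {
    // Reset static state so a repeated run in the same JVM starts from zero
    // instead of accumulating notifications from the previous run.
    FakeNotificationServlet.counter = 0;
    FakeNotificationServlet.failureCounter = 0;
  }
}
{code}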
[jira] [Created] (MAPREDUCE-7475) 2 tests are non-idempotent (passes in the first run but fails in repeated runs in the same JVM):
Kaiyao Ke created MAPREDUCE-7475: Summary: 2 tests are non-idempotent (passes in the first run but fails in repeated runs in the same JVM): Key: MAPREDUCE-7475 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7475 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Kaiyao Ke -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7474) [ABFS] Improve commit resilience and performance in Manifest Committer
Steve Loughran created MAPREDUCE-7474: - Summary: [ABFS] Improve commit resilience and performance in Manifest Committer Key: MAPREDUCE-7474 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7474 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 3.3.6, 3.4.0 Reporter: Steve Loughran * Manifest committer is not resilient to rename failures on task commit without HADOOP-18012 rename recovery enabled. * large burst of delete calls noted: are they needed relates to HADOOP-19093 but takes a more minimal approach with goal of changes in manifest committer only. Initial proposed changes * retry recovery on task commit rename, always (repeat save, delete, rename) * audit delete use and see if it can be pruned * maybe: rate limit some IO internally, but not delegate to abfs -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
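A rough sketch of the "retry recovery on task commit rename" idea, assuming the manifest has already been saved and that a retry deletes any half-written destination before renaming again; the method name and retry policy are illustrative only:
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RenameWithRetryExample {
  /**
   * Rename src to dest, retrying a few times; on each attempt delete any
   * partially-created destination first. Illustrative only.
   */
  public static void renameWithRetry(FileSystem fs, Path src, Path dest, int attempts)
      throws IOException {
    IOException last = new IOException("rename not attempted: " + src + " -> " + dest);
    for (int i = 0; i < attempts; i++) {
      try {
        if (fs.exists(dest)) {
          fs.delete(dest, true);         // clear a half-written destination
        }
        if (fs.rename(src, dest)) {
          return;                        // success
        }
        last = new IOException("rename returned false: " + src + " -> " + dest);
      } catch (IOException e) {
        last = e;                        // transient store failure; retry
      }
    }
    throw last;
  }
}
{code}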
[jira] [Resolved] (MAPREDUCE-7469) NNBench createControlFiles should use thread pool to improve performance.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved MAPREDUCE-7469. --- Fix Version/s: 3.4.1 Resolution: Fixed > NNBench createControlFiles should use thread pool to improve performance. > - > > Key: MAPREDUCE-7469 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7469 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: liuguanghua >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.1 > > > NNBench is a good tool for NN performance testing. With multiple maps it > will wait a long time in createControlFiles. A thread pool can be used to > increase concurrency. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
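A minimal sketch of the thread-pool approach, assuming each control file can be written independently; the pool size and file naming are placeholders, not the committed implementation:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class ParallelControlFilesExample {
  public static void createControlFiles(Configuration conf, Path controlDir, int numMaps)
      throws Exception {
    FileSystem fs = FileSystem.get(conf);
    ExecutorService pool = Executors.newFixedThreadPool(Math.min(numMaps, 16));
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (int i = 0; i < numMaps; i++) {
        final Path controlFile = new Path(controlDir, "part-" + i);   // placeholder name
        tasks.add(() -> {
          // Each control file is independent, so the writes can run concurrently.
          fs.create(controlFile, true).close();
          return null;
        });
      }
      for (Future<Void> f : pool.invokeAll(tasks)) {
        f.get();   // surface any failure from the workers
      }
    } finally {
      pool.shutdown();
    }
  }
}
{code}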
[jira] [Resolved] (MAPREDUCE-7470) multi-thread mapreduce committer
[ https://issues.apache.org/jira/browse/MAPREDUCE-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7470. --- Resolution: Duplicate > multi-thread mapreduce committer > > > Key: MAPREDUCE-7470 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7470 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2 >Reporter: TianyiMa >Priority: Major > Labels: mapreduce, pull-request-available > Attachments: MAPREDUCE-7470.0.patch > > > In a cloud environment, such as aws, aliyun etc., the internet delay is > non-trivial when we commit thousands of files. > In our situation, the ping delay is about 0.03ms in IDC, but when moving to > Cloud, the ping delay is about 3ms, which is roughly 100x slower. We found > that committing tens of thousands of files can cost a few tens of minutes. The > more files there are, the longer it takes. > So we propose a new committer algorithm, which is a variant of committer > algorithm version 1, called 3. In this new algorithm 3, in order to decrease > the commit time, we use a thread pool to commit the job's final output. > Our test result in Cloud production shows that the new algorithm 3 has > decreased the commit time by several tens of times. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7473) Entity id/type not updated for HistoryEvent NORMALIZED_RESOURCE
Bilwa S T created MAPREDUCE-7473: Summary: Entity id/type not updated for HistoryEvent NORMALIZED_RESOURCE Key: MAPREDUCE-7473 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7473 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T Getting below exception in MR AM logs: 2024-03-09 16:23:30,329 ERROR [Job ATS Event Dispatcher] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error putting entity null to TimelineServer org.apache.hadoop.yarn.exceptions.YarnException: Incomplete entity without entity id/type at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:88) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:187) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForTimelineServer(JobHistoryEventHandler.java:1129) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:745) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:241) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:156) at java.base/java.lang.Thread.run(Thread.java:840) 2024-03-09 16:23:30,332 ERROR [Job ATS Event Dispatcher] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error putting entity null to TimelineServer org.apache.hadoop.yarn.exceptions.YarnException: Incomplete entity without entity id/type at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:88) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:187) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForTimelineServer(JobHistoryEventHandler.java:1129) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:745) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:241) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:156) at java.base/java.lang.Thread.run(Thread.java:840) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Reopened] (MAPREDUCE-7402) In the merge phase, if configuration.set("mapreduce.task.io.sort.factor", "1"), it may lead to an infinite loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Zhang reopened MAPREDUCE-7402: --- > In the merge phase, if configuration.set("mapreduce.task.io.sort.factor", > "1"), it may lead to an infinite loop > --- > > Key: MAPREDUCE-7402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7402 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Jian Zhang >Priority: Minor > > function with infinite loop:Merger$MergeQueue#computeBytesInMerges(int > factor, int inMem), -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7472) decode value of hive.query.string for the job Configuration which was encoded by hive
wangzhongwei created MAPREDUCE-7472: --- Summary: decode value of hive.query.string for the job Configuration which was encoded by hive Key: MAPREDUCE-7472 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7472 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.3.3 Reporter: wangzhongwei Assignee: wangzhongwei Attachments: image-2024-02-02-09-44-57-503.png the value of hive.query.string in the job Configuration is URL-encoded by hive and written to hdfs, which should be decoded before being rendered !image-2024-02-02-09-44-57-503.png! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
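A small sketch of the decoding step described above, assuming the value is URL-encoded with UTF-8 (as Hive does when it writes the query string into the job configuration); the helper is illustrative, not the actual patch:
{code:java}
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public final class HiveQueryStringDecodeExample {
  public static String decodeQueryString(String rawValue) throws UnsupportedEncodingException {
    if (rawValue == null) {
      return null;
    }
    // Hive URL-encodes hive.query.string before writing it; decode it before rendering.
    return URLDecoder.decode(rawValue, "UTF-8");
  }
}
{code}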
[jira] [Created] (MAPREDUCE-7471) Hadoop mapred minicluster command line fail with class not found
Duo Zhang created MAPREDUCE-7471: Summary: Hadoop mapred minicluster command line fail with class not found Key: MAPREDUCE-7471 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7471 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.3.5 Reporter: Duo Zhang If you run ./bin/mapred minicluster It will fail with {noformat} Exception in thread "Listener at localhost/35325" java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2648) at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2662) at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989) at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:588) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320) Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer at java.net.URLClassLoader.findClass(URLClassLoader.java:387) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 9 more {noformat} This line https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java#L2648 This is because we rely on mockito in NameNodeAdapter but we do not have mockito on our classpath, at least in our published hadoop-3.3.5 binary. And there is another problem that, if we do not run the above command in the HADOOP_HOME directory, i.e, in another directory by typing the absolute path of the mapred command, it will fail with {noformat} Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert at org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:336) at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:280) at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:289) at org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:3069) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.(MiniDFSCluster.java:239) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:157) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320) Caused by: java.lang.ClassNotFoundException: org.junit.Assert at java.net.URLClassLoader.findClass(URLClassLoader.java:387) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 8 mor {noformat} This simply because this line https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L601 We should add the $HADOOP_TOOLS_HOME prefix for the default value of HADOOP_TOOLS_DIR. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7470) hadoop MR multi-thread committer
TianyiMa created MAPREDUCE-7470: --- Summary: hadoop MR multi-thread committer Key: MAPREDUCE-7470 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7470 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Reporter: TianyiMa In a cloud environment, such as aws, aliyun etc., the internet delay is non-trivial when we commit thousands of files. In our situation, the ping delay is about 0.03ms in IDC, but when moving to Cloud, the ping delay is about 3ms, which is roughly 100x slower. We found that committing tens of thousands of files can cost a few tens of minutes. The more files there are, the longer it takes. So we propose a new committer algorithm, which is a variant of committer algorithm version 1, called 3. In this new algorithm 3, in order to decrease the commit time, we use a thread pool to commit the job's final output. Our test result in Cloud production shows that the new algorithm 3 has decreased the commit time by several tens of times. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
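A rough sketch of the thread-pool variant of the v1 commit, assuming each task's committed output can be merged into the job output independently; the merge here is reduced to a single rename and all names are illustrative:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class ParallelJobCommitExample {
  public static void commitTaskOutputs(FileSystem fs, Path committedTaskRoot, Path jobOutput,
      int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Void>> merges = new ArrayList<>();
      for (FileStatus task : fs.listStatus(committedTaskRoot)) {
        merges.add(() -> {
          // Move each task's committed output into the final job output concurrently;
          // in the real committer this is a recursive merge, not a single rename.
          fs.rename(task.getPath(), new Path(jobOutput, task.getPath().getName()));
          return null;
        });
      }
      for (Future<Void> f : pool.invokeAll(merges)) {
        f.get();   // propagate the first failure, if any
      }
    } finally {
      pool.shutdown();
    }
  }
}
{code}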
[jira] [Created] (MAPREDUCE-7469) NNBench's createControlFiles should use thread pool to improve performance.
liuguanghua created MAPREDUCE-7469: -- Summary: NNBench's createControlFiles should use thread pool to improve performance. Key: MAPREDUCE-7469 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7469 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: liuguanghua NNBench is a good tool for NN performance testing. With multiple maps it will wait a long time in createControlFiles. A thread pool can be used to increase concurrency. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7468) Change add-opens flag's default value from true to false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke resolved MAPREDUCE-7468. -- Fix Version/s: 3.4.0 3.3.7 Hadoop Flags: Reviewed Resolution: Fixed > Change add-opens flag's default value from true to false > > > Key: MAPREDUCE-7468 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7468 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 3.4.0, 3.3.7 >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.7 > > > To avoid issues when a newer JobClient is used with Hadoop versions > without MAPREDUCE-7449, the default value of > mapreduce.jvm.add-opens-as-default should be false. Currently it's true, which > can cause failures if a newer JobClient is used to submit apps, as the placeholder > replacement won't happen during container launch, resulting in a failed > submission. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7468) Change add-opens flag's default value from true to false
Benjamin Teke created MAPREDUCE-7468: Summary: Change add-opens flag's default value from true to false Key: MAPREDUCE-7468 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7468 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Benjamin Teke Assignee: Benjamin Teke To avoid issues when a newer JobClient is used with Hadoop versions without MAPREDUCE-7449, the default value of mapreduce.jvm.add-opens-as-default should be false. Currently it's true, which can cause failures if a newer JobClient is used to submit apps, as the placeholder replacement won't happen during container launch, resulting in a failed submission. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7466) WordMedian example fails to compute the right median
Matthew Rossi created MAPREDUCE-7466: Summary: WordMedian example fails to compute the right median Key: MAPREDUCE-7466 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7466 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, test Reporter: Matthew Rossi The WordMedian example does not correctly handle the case where the median falls exactly between two different values (e.g., the median word length of "Hello Hadoop" should be 5.5). This affects both the example and its test (i.e., TestWordStats). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
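The failing case is simple to state: for an even number of words, the median length is the average of the two middle values. A minimal sketch of that computation (not the example's actual map/reduce code):
{code:java}
import java.util.Arrays;

public final class MedianExample {
  /** Median of word lengths; averages the two middle values for even counts. */
  public static double medianLength(String... words) {
    int[] lengths = Arrays.stream(words).mapToInt(String::length).sorted().toArray();
    int n = lengths.length;
    if (n % 2 == 1) {
      return lengths[n / 2];
    }
    return (lengths[n / 2 - 1] + lengths[n / 2]) / 2.0;
  }

  public static void main(String[] args) {
    // "Hello" has length 5 and "Hadoop" has length 6, so the median is 5.5.
    System.out.println(medianLength("Hello", "Hadoop"));   // prints 5.5
  }
}
{code}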
[jira] [Created] (MAPREDUCE-7465) performance problem in FileOutputCommitter for big list processed by single thread
Arnaud Nauwynck created MAPREDUCE-7465: -- Summary: performance problem in FileOutputCommitter for big list processed by single thread Key: MAPREDUCE-7465 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7465 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance Affects Versions: 3.3.6, 3.3.4, 3.3.3, 3.3.5, 3.2.4, 3.3.2, 3.2.3 Reporter: Arnaud Nauwynck When committing a big Hadoop job (for example via Spark) having many partitions, the class FileOutputCommitter processes thousands of dirs/files to rename with a single thread. This is a performance issue, caused by a lot of waits on FileSystem storage operations. Notice that sub-class instances of FileOutputCommitter are supposed to be created at runtime depending on a configurable property ([https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitterFactory.java|PathOutputCommitterFactory.java]). But, for example in Parquet + Spark, this is buggy and cannot be changed at runtime. There is an ongoing Jira and PR to fix it in Parquet + Spark: [https://issues.apache.org/jira/browse/PARQUET-2416|https://issues.apache.org/jira/browse/PARQUET-2416] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
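For reference, a hedged sketch of how a different committer factory is requested through configuration, assuming the standard PathOutputCommitterFactory keys ("mapreduce.outputcommitter.factory.class" and the per-scheme "mapreduce.outputcommitter.factory.scheme.<scheme>"); verify the key and class names against your Hadoop version before relying on them:
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class CommitterFactoryConfigExample {
  public static Configuration chooseCommitterFactory(Configuration conf) {
    // Global factory override (key name as documented for PathOutputCommitterFactory;
    // verify against your Hadoop version).
    conf.set("mapreduce.outputcommitter.factory.class",
        "org.apache.hadoop.mapreduce.lib.output.committer.manifest.ManifestCommitterFactory");
    // Per-filesystem-scheme override, e.g. for abfs:// destinations.
    conf.set("mapreduce.outputcommitter.factory.scheme.abfs",
        "org.apache.hadoop.mapreduce.lib.output.committer.manifest.ManifestCommitterFactory");
    return conf;
  }
}
{code}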
[jira] [Created] (MAPREDUCE-7464) Make priority of mapreduce containers configurable
liu bin created MAPREDUCE-7464: -- Summary: Make priority of mapreduce containers configurable Key: MAPREDUCE-7464 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7464 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: liu bin When maps and reduces run simultaneously, if resources are insufficient, reduces will be preempted. Because the priority of reduce is higher than map, the preempted reduces will first obtain resources when rerun, and then be preempted again, falling into a loop. We need to configure the priority of map higher than reduce to avoid this situation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7463) Modify HistoryServerRest.html content, add comma for the /ws/v1/history/mapreduce/jobs Response Body
wangzhongwei created MAPREDUCE-7463: --- Summary: Modify HistoryServerRest.html content, add comma for the /ws/v1/history/mapreduce/jobs Response Body Key: MAPREDUCE-7463 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7463 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 3.3.6, 3.3.3 Reporter: wangzhongwei Attachments: image-2023-12-11-15-55-09-196.png The /ws/v1/history/mapreduce/jobs Response Body is missing a comma !image-2023-12-11-15-55-09-196.png|width=448,height=336! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7462) Use thread pool to improve the speed of creating control files in TestDFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] farmmamba resolved MAPREDUCE-7462. -- Resolution: Invalid > Use thread pool to improve the speed of creating control files in TestDFSIO > --- > > Key: MAPREDUCE-7462 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7462 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: examples, test >Affects Versions: 3.3.6 >Reporter: farmmamba >Priority: Major > > When we use the TestDFSIO tool to test the throughput of HDFS clusters, we found > it is very slow in the control file creation stage. > After referring to the source code, we found that the method createControlFile tries > to create control files serially. It can be improved by using a thread pool. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7462) Use thread pool to improve the speed of creating control files in TestDFSIO
farmmamba created MAPREDUCE-7462: Summary: Use thread pool to improve the speed of creating control files in TestDFSIO Key: MAPREDUCE-7462 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7462 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples, test Affects Versions: 3.3.6 Reporter: farmmamba When we use the TestDFSIO tool to test the throughput of HDFS clusters, we found it is very slow in the control file creation stage. After referring to the source code, we found that the method createControlFile tries to create control files serially. It can be improved by using a thread pool. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7459) Fixed TestHistoryViewerPrinter flakiness during string comparison
[ https://issues.apache.org/jira/browse/MAPREDUCE-7459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved MAPREDUCE-7459. - Fix Version/s: 3.4.0 (was: 3.3.6) Hadoop Flags: Reviewed Resolution: Fixed > Fixed TestHistoryViewerPrinter flakiness during string comparison > -- > > Key: MAPREDUCE-7459 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7459 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 3.3.6 > Environment: Java version: openjdk 11.0.20.1 > Maven version: Apache Maven 3.6.3 >Reporter: Rajiv Ramachandran >Assignee: Rajiv Ramachandran >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > The test > {{_org.apache.hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter#testHumanPrinterAll_}} > can fail due to flakiness. These flakiness occurs because the test utilizes > Hashmaps values and converts the values to string to perform the comparision > and the order of the objects returned may not be necessarily maintained. > The stack trace is as follows: > testHumanPrinterAll(org.apache.hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter) > Time elapsed: 0.297 s <<< FAILURE! > org.junit.ComparisonFailure: > expected:<...8501754_0001_m_0[7 6-Oct-2011 19:15:09 6-Oct-2011 > 19:15:16 (7sec) > SUCCEEDED MAP task list for job_1317928501754_0001 > TaskId StartTime FinishTime Error InputSplits > > task_1317928501754_0001_m_06 6-Oct-2011 19:15:08 6-Oct-2011 > 19:15:14 (6sec) > ... > /tasklog?attemptid=attempt_1317928501754_0001_m_03]_1 > REDUCE task list...> but was:<...8501754_0001_m_0[5 6-Oct-2011 > 19:15:07 6-Oct-2011 19:15:12 (5sec) > SUCCEEDED MAP task list for job_1317928501754_0001 > TaskId StartTime FinishTime Error InputSplits > > task_1317928501754_0001_m_06 6-Oct-2011 19:15:08 6-Oct-2011 > 19:15:14 (6sec) > SUCCEEDED MAP task list for job_1317928501754_0001 > TaskId StartTime FinishTime Error InputSplits > > task_1317928501754_0001_m_04 6-Oct-2011 19:15:06 6-Oct-2011 > 19:15:10 (4sec) > SUCCEEDED MAP task list for job_1317928501754_0001 > TaskId StartTime FinishTime Error InputSplits > > task_1317928501754_0001_m_07 6-Oct-2011 19:15:09 6-Oct-2011 > 19:15:16 (7sec) > ... > /tasklog?attemptid=attempt_1317928501754_0001_m_06]_1 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7461) Fixed assertion comparison failure by resolving XML paths for child elements correctly
Rajiv Ramachandran created MAPREDUCE-7461: - Summary: Fixed assertion comparison failure by resolving XML paths for child elements correctly Key: MAPREDUCE-7461 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7461 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 3.3.6 Reporter: Rajiv Ramachandran Fix For: 3.3.6 The following tests depend on underlying implementation orders which are not guaranteed, while comparing the contents of the generated XML response. _org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs#testJobIdXML_ _org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs#testJobsXML_ The test attempts to send an HTTP GET request to a specific URL and expects a response in XML format. However, XML response order is not necessarily guaranteed. When comparing the XML contents, the `<name>` tag occurs in multiple places inside the response. However, the root element's tag is not always the one compared, due to the non-deterministic order. The [getXmlString utility|https://github.com/kavvya97/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/WebServicesTestUtils.java#L78] always takes the first tag irrespective of whether it is nested or at the root, which causes the test to become flaky. [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.377 s <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs [ERROR] testJobIdXML(org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs) Time elapsed: 8.361 s <<< FAILURE! java.lang.AssertionError: [name] Expecting: "mapreduce.job.acl-view-job" to match pattern: "RandomWriter" -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7457) Limit number of spill files getting created
[ https://issues.apache.org/jira/browse/MAPREDUCE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved MAPREDUCE-7457. --- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Target Version/s: 3.4.0 Resolution: Fixed > Limit number of spill files getting created > --- > > Key: MAPREDUCE-7457 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7457 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mudit Sharma >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > > Hi, > > We have been facing some issues where many of our cluster node disks go full > because of some rogue applications creating a lot of spill data. > We wanted to fail the app if more than a threshold number of spill files is > written. > Please let us know if any such capability is supported. > > If the capability is not there, we are proposing to support it via a > config; we have added a PR for the same: > [https://github.com/apache/hadoop/pull/6155] Please let us know your > thoughts on it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7460) When "yarn.nodemanager.resource.memory-mb" and "mapreduce.map.memory.mb" work together, the mapreduce sample program blocks
ECFuzz created MAPREDUCE-7460: - Summary: When "yarn.nodemanager.resource.memory-mb" and "mapreduce.map.memory.mb" work together, the mapreduce sample program blocks Key: MAPREDUCE-7460 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7460 Project: Hadoop Map/Reduce Issue Type: Bug Components: yarn Affects Versions: 3.3.6 Reporter: ECFuzz My Hadoop version is 3.3.6. The core-site.xml and hdfs-site.xml are set as default. yarn-site.xml like below. {code:java} {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Reopened] (MAPREDUCE-7449) Add add-opens flag to container launch commands on JDK17 nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-7449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke reopened MAPREDUCE-7449: -- > Add add-opens flag to container launch commands on JDK17 nodes > -- > > Key: MAPREDUCE-7449 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7449 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mr-am, mrv2 >Affects Versions: 3.4.0 >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > To allow containers to launch on JDK17 nodes the add-opens flag should be > added to the container launch commands if the node has JDK17+, and it > shouldn't on previous JDKs. This behaviour should be configurable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7458) Race condition in TaskReportPBImpl#getProto when generating task reports process in concurrency scenarios
Tao Yang created MAPREDUCE-7458: --- Summary: Race condition in TaskReportPBImpl#getProto when generating task reports process in concurrency scenarios Key: MAPREDUCE-7458 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7458 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Reporter: Tao Yang There is a rare race condition in *TaskReportPBImpl.getProto* when JobHistoryServer getting concurrent getTaskReports requests for the same job at the same time. Exception scenario: # client calls JobClient#getTaskReports in parallel for the same job at the same time. # JobHistoryServer gets these requests and then generating response based on *cached* task reports according to HistoryClientService$HSClientProtocolHandler#getTaskReports. # When the same task report is processed concurrently, we may see UnsupportedOperationException exceptions with different stacks as following. ExceptionStack-1: TaskReportPBImpl#convertToProtoFormat {noformat} java.lang.UnsupportedOperationException at java.util.AbstractList.add(AbstractList.java:148) at java.util.AbstractList.add(AbstractList.java:108) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:330) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$CounterGroupProto$Builder.addAllCounters(MRProtos.java:4393) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CounterGroupPBImpl.addContersToProto(CounterGroupPBImpl.java:182) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CounterGroupPBImpl.mergeLocalToBuilder(CounterGroupPBImpl.java:63) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CounterGroupPBImpl.mergeLocalToProto(CounterGroupPBImpl.java:70) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CounterGroupPBImpl.getProto(CounterGroupPBImpl.java:55) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.convertToProtoFormat(CountersPBImpl.java:195) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.access$100(CountersPBImpl.java:38) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl$1$1.next(CountersPBImpl.java:162) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl$1$1.next(CountersPBImpl.java:150) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:329) at org.apache.hadoop.mapreduce.v2.proto.MRProtos$CountersProto$Builder.addAllCounterGroups(MRProtos.java:5102) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.addCounterGroupsToProto(CountersPBImpl.java:172) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.mergeLocalToBuilder(CountersPBImpl.java:64) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.mergeLocalToProto(CountersPBImpl.java:71) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.CountersPBImpl.getProto(CountersPBImpl.java:56) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.TaskReportPBImpl.convertToProtoFormat(TaskReportPBImpl.java:401) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.TaskReportPBImpl.mergeLocalToBuilder(TaskReportPBImpl.java:76) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.TaskReportPBImpl.mergeLocalToProto(TaskReportPBImpl.java:92) at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.TaskReportPBImpl.getProto(TaskReportPBImpl.java:64) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.convertToProtoFormat(GetTaskReportsResponsePBImpl.java:173) at 
org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.access$100(GetTaskReportsResponsePBImpl.java:36) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl$1$1.next(GetTaskReportsResponsePBImpl.java:138) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl$1$1.next(GetTaskReportsResponsePBImpl.java:127) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:329) at org.apache.hadoop.mapreduce.v2.proto.MRServiceProtos$GetTaskReportsResponseProto$Builder.addAllTaskReports(MRServiceProtos.java:7049) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.addTaskReportsToProto(GetTaskReportsResponsePBImpl.java:150) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.mergeLocalToBuilder(GetTaskReportsResponsePBImpl.java:62) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.mergeLocalToProto(GetTaskReportsResponsePBImpl.java:69) at org.apache.hadoop.mapreduce.v2.api.protocolrecords.impl.pb.GetTaskReportsResponsePBImpl.getProto(GetTaskReportsResponsePBImpl.java:54
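One common mitigation for this class of PBImpl race is to serialize the lazy proto-building step so that concurrent callers cannot merge the same builder at once. A sketch of the idea only, using stand-in types rather than the actual generated protobuf classes:
{code:java}
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a lazily-built, cached result shared across request threads. */
public class CachedReportExample {
  private final List<String> pending = new ArrayList<>();   // stand-in for un-merged local state
  private List<String> proto;                               // stand-in for the built proto

  /**
   * Without synchronization, two concurrent callers can both see proto == null and
   * merge {@code pending} into the builder at the same time, which is the race
   * described above. Serializing the build step is one way to make a cached
   * report safe to share across requests.
   */
  public synchronized List<String> getProto() {
    if (proto == null) {
      proto = new ArrayList<>(pending);   // "mergeLocalToProto" equivalent
      pending.clear();
    }
    return proto;
  }
}
{code}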
[jira] [Created] (MAPREDUCE-7457) Limit number of spill files getting created
Mudit Sharma created MAPREDUCE-7457: --- Summary: Limit number of spill files getting created Key: MAPREDUCE-7457 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7457 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Mudit Sharma Hi, We have been facing some issues where many of our cluster node disks go full because of some rogue applications creating a lot of spill data. We wanted to fail the app if more than a threshold number of spill files is written. Please let us know if any such capability is supported. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7456) Extend add-opens flag to container launch commands on JDK17 nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-7456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth resolved MAPREDUCE-7456. --- Fix Version/s: 3.4.0 Resolution: Fixed > Extend add-opens flag to container launch commands on JDK17 nodes > - > > Key: MAPREDUCE-7456 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7456 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: Peter Szucs >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There was a previous ticket for adding the add-opens flag to container launch to > be able to run them on JDK17 nodes: > https://issues.apache.org/jira/browse/MAPREDUCE-7449 > As testing discovered, this should be extended with the > "{_}--add-exports=java.base/sun.net.dns=ALL-UNNAMED{_}" and > "{_}--add-exports=java.base/sun.net.util=ALL-UNNAMED{_}" options to be able to > run containers on Isilon. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7455) org.apache.hadoop.mapred.SpillRecord crashes due to overflow in buffer size computation
ConfX created MAPREDUCE-7455: Summary: org.apache.hadoop.mapred.SpillRecord crashes due to overflow in buffer size computation Key: MAPREDUCE-7455 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7455 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.3.6 Reporter: ConfX A large `mapreduce.job.reduces` can cause an overflow while computing the byte buffer size in `org.apache.hadoop.mapred.SpillRecord#SpillRecord(int)`, since the byte buffer size equals `mapreduce.job.reduces` * MapTask.MAP_OUTPUT_INDEX_RECORD_LENGTH. To reproduce: 1. set `mapreduce.job.reduces` to 509103844 2. run `mvn surefire:test -Dtest=org.apache.hadoop.mapred.TestMapTask#testShufflePermissions` We created a PR that provides a fix by checking that the computed buffer size is positive. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
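A minimal sketch of the overflow guard the report proposes, computing the buffer size in 64-bit arithmetic before allocating; the record length of 24 bytes mirrors MapTask.MAP_OUTPUT_INDEX_RECORD_LENGTH, but treat the constant and method as illustrative:
{code:java}
import java.nio.ByteBuffer;

public final class SpillRecordSizeExample {
  private static final int MAP_OUTPUT_INDEX_RECORD_LENGTH = 24;   // bytes per index record

  public static ByteBuffer allocateIndexBuffer(int numPartitions) {
    // Multiply in long arithmetic so a huge mapreduce.job.reduces value cannot
    // silently wrap around to a negative int.
    long size = (long) numPartitions * MAP_OUTPUT_INDEX_RECORD_LENGTH;
    if (numPartitions < 0 || size > Integer.MAX_VALUE) {
      throw new IllegalArgumentException(
          "Invalid partition count " + numPartitions + ": index buffer would need "
              + size + " bytes");
    }
    return ByteBuffer.allocate((int) size);
  }
}
{code}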
[jira] [Resolved] (MAPREDUCE-7453) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved MAPREDUCE-7453. --- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Container logs are missing when yarn.app.container.log.filesize is set to > default value 0. > -- > > Key: MAPREDUCE-7453 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7453 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 3.3.6 >Reporter: zhengchenyu >Assignee: zhengchenyu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Since HADOOP-18649, in container-log4j.properties, > log4j.appender.\{APPENDER}.MaxFileSize is set to > ${yarn.app.container.log.filesize}, but yarn.app.container.log.filesize is 0 > by default, so the log is missing. The log is always rolling and only shows the > latest log. > The running log looks like below: > {code:java} > Log Type: syslog > Log Upload Time: Fri Sep 08 11:36:09 +0800 2023 > Log Length: 0 > Log Type: syslog.1 > Log Upload Time: Fri Sep 08 11:36:09 +0800 2023 > Log Length: 179 > 2023-09-08 11:31:34,494 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Got an error when resolve > hostNames. Falling back to /default-rack for all. {code} > Note: log4j.appender.\{APPENDER}.MaxFileSize was not set before, so the > default value of 10M was used and there was no problem before HADOOP-18649 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7454) missing null check when acquiring appId for a null jobId
ConfX created MAPREDUCE-7454: Summary: missing null check when acquiring appId for a null jobId Key: MAPREDUCE-7454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7454 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened? A NullPointerException is triggered when trying to acquire the appId for a null jobId. h2. Where's the bug? In line 90 of JobResourceUploader.java: {code:java} private ApplicationId jobIDToAppId(JobID jobId) { return ApplicationId.newInstance(Long.parseLong(jobId.getJtIdentifier()), jobId.getId()); } {code} Here the jobId is not checked for null before the `ApplicationId` is generated from it. h2. How to reproduce? 1. set {{mapreduce.job.sharedcache.mode=archives, mapreduce.framework.name=yarn, yarn.sharedcache.enabled=true}} 2. run {{org.apache.hadoop.mapreduce.TestJobResourceUploader#testErasureCodingDisabled}} and observe this exception: {code:java} java.lang.NullPointerException at org.apache.hadoop.mapreduce.JobResourceUploader.jobIDToAppId(JobResourceUploader.java:91) at org.apache.hadoop.mapreduce.JobResourceUploader.initSharedCache(JobResourceUploader.java:79) at org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:134) at org.apache.hadoop.mapreduce.TestJobResourceUploader.testErasureCodingSetting(TestJobResourceUploader.java:442) at org.apache.hadoop.mapreduce.TestJobResourceUploader.testErasureCodingDisabled(TestJobResourceUploader.java:380) {code} For an easy reproduction, run the {{reproduce.sh}} in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
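A minimal sketch of the guard the report suggests, placed before the conversion; illustrative only, not the actual fix:
{code:java}
import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public final class JobIdToAppIdExample {
  public static ApplicationId jobIdToAppId(JobID jobId) {
    // Guard against a null job id before dereferencing it, instead of letting
    // ApplicationId.newInstance trigger a NullPointerException.
    if (jobId == null) {
      throw new IllegalArgumentException("jobId must not be null");
    }
    return ApplicationId.newInstance(Long.parseLong(jobId.getJtIdentifier()), jobId.getId());
  }
}
{code}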
[jira] [Created] (MAPREDUCE-7453) Container logs are missing when yarn.app.container.log.filesize is set to default value 0.
zhengchenyu created MAPREDUCE-7453: -- Summary: Container logs are missing when yarn.app.container.log.filesize is set to default value 0. Key: MAPREDUCE-7453 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7453 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 3.3.6 Reporter: zhengchenyu Assignee: zhengchenyu Since HADOOP-18649, in container-log4j.properties, log4j.appender.\{APPENDER}.MaxFileSize is set to ${yarn.app.container.log.filesize}, but yarn.app.container.log.filesize is 0 by default, so the log is missing. The log is always rolling and only shows the latest log. The running log looks like below: {code:java} Log Type: syslog Log Upload Time: Fri Sep 08 11:36:09 +0800 2023 Log Length: 0 Log Type: syslog.1 Log Upload Time: Fri Sep 08 11:36:09 +0800 2023 Log Length: 179 2023-09-08 11:31:34,494 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Got an error when resolve hostNames. Falling back to /default-rack for all. {code} Note: log4j.appender.\{APPENDER}.MaxFileSize was not set before, so the default value of 10M was used and there was no problem before HADOOP-18649 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7451) review TrackerDistributedCacheManager.checkPermissionOfOther
[ https://issues.apache.org/jira/browse/MAPREDUCE-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7451. --- Resolution: Won't Fix The class TrackerDistributedCacheManager only exists in hadoop releases <= 1.2. no need to look at it > review TrackerDistributedCacheManager.checkPermissionOfOther > > > Key: MAPREDUCE-7451 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7451 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.22.0, 1.2.1 >Reporter: Yiheng Cao >Priority: Major > > TrackerDistributedCacheManager.checkPermissionOfOther() doesn't seem to work > reliably -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7452) ManifestCommitter to support / as a destination
Steve Loughran created MAPREDUCE-7452: - Summary: ManifestCommitter to support / as a destination Key: MAPREDUCE-7452 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7452 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 3.3.6 Reporter: Steve Loughran You can't commit work to the root of an object store through the manifest committer, as it will fail if the destination path exists, which always holds for root. Proposed: * check for a dest of / in job setup; if the path is not root, use createNewDirectory() as today * if the path is root, delete all children but not the dir itself. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
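A sketch of the "delete all children but not the dir" step for a root destination, using plain FileSystem calls as a stand-in for the committer's own helpers; illustrative only:
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RootDestinationSetupExample {
  public static void prepareDestination(FileSystem fs, Path dest) throws IOException {
    if (dest.isRoot()) {
      // Root always "exists", so instead of failing, clear its children
      // and leave the root directory itself in place.
      for (FileStatus child : fs.listStatus(dest)) {
        fs.delete(child.getPath(), true);
      }
    } else {
      // The real committer requires that a non-root destination does not already
      // exist; this simplified sketch just ensures the directory is present.
      fs.mkdirs(dest);
    }
  }
}
{code}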
[jira] [Created] (MAPREDUCE-7451) Security Vulnerability - Action Required: “Incorrect Permission Assignment for Critical Resource” vulnerability in the newest version of hadoop
Yiheng Cao created MAPREDUCE-7451: - Summary: Security Vulnerability - Action Required: “Incorrect Permission Assignment for Critical Resource” vulnerability in the newest version of hadoop Key: MAPREDUCE-7451 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7451 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yiheng Cao I think the method {{org.apache.hadoop.filecache.TrackerDistributedCacheManager.checkPermissionOfOther(FileSystem fs, Path path, FsAction action)}} may have an “Incorrect Permission Assignment for Critical Resource” vulnerability in the newest version of hadoop. It shares similarities with a recent CVE disclosure, _CVE-2017-3166_, in the same _"apache/hadoop"_ project. The vulnerability is present in the class org.apache.hadoop.filecache.TrackerDistributedCacheManager in the method checkPermissionOfOther(FileSystem fs, Path path, FsAction action), which is responsible for checking whether the file system object (FileSystem) at the specified path has additional user permissions for the specified operation (action). {*}But the check snippet is similar to the vulnerable snippet for CVE-2017-3166{*} and may have the same consequence as CVE-2017-3166: {*}a file in an encryption zone with access permissions will be stored in a world-readable location and can be freely shared with any application that requests the file to be localized{*}. Therefore, maybe you need to fix the vulnerability with much the same fix code as the CVE-2017-3166 patch. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
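For context, the kind of check the report refers to can be expressed with the public permission API; a hedged sketch (not the project's actual code):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public final class OtherPermissionCheckExample {
  /** Returns true if the "other" bits on {@code path} imply {@code action}. */
  public static boolean otherPermissionAllows(FileSystem fs, Path path, FsAction action)
      throws IOException {
    FileStatus status = fs.getFileStatus(path);
    FsAction other = status.getPermission().getOtherAction();
    // A world-readable file in a sensitive location is exactly the risk the report
    // describes, so callers must decide carefully what to do when this returns true.
    return other.implies(action);
  }
}
{code}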
[jira] [Resolved] (MAPREDUCE-7449) Add add-opens flag to container launch commands on JDK17 nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-7449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke resolved MAPREDUCE-7449. -- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add add-opens flag to container launch commands on JDK17 nodes > -- > > Key: MAPREDUCE-7449 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7449 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mr-am, mrv2 >Affects Versions: 3.4.0 >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > To allow containers to launch on JDK17 nodes the add-opens flag should be > added to the container launch commands if the node has JDK17+, and it > shouldn't on previous JDKs. This behaviour should be configurable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7450) Set the record delimiter for the input file based on its path
lvhu created MAPREDUCE-7450: --- Summary: Set the record delimiter for the input file based on its path Key: MAPREDUCE-7450 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7450 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 3.3.6 Environment: Any Reporter: lvhu Fix For: MR-3902
In a MapReduce program, when reading files, we can easily set the record delimiter via the parameter textinputformat.record.delimiter. This parameter can also be easily set from other engines, including Spark, for example:
{code:java}
spark.sparkContext.hadoopConfiguration.set("textinputformat.record.delimiter", "|@|")
val rdd = spark.sparkContext.newAPIHadoopFile(...)
{code}
But once the textinputformat.record.delimiter parameter is modified, it takes effect for all files. In actual scenarios, different files often have different delimiters. In Hive, as Hive does not support programming, we cannot modify the record delimiter through the above methods. If modified through a configuration file, it would take effect for all Hive tables. The only way to modify the record delimiter in Hive is to rewrite a TextInputFormat class. The current approach in Hive is as follows:
{code:java}
package abc.hive;

public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> implements JobConfigurable { ... }
{code}
{code}
create table test ( id string, name string ) stored as INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
{code}
If there are multiple different record delimiters, multiple TextInputFormats need to be rewritten. My idea is to modify the TextInputFormat class to support setting the record delimiter for input files based on the prefix of the file path. The specific idea is to make the following modifications to TextInputFormat:
{code:java}
public class TextInputFormat extends FileInputFormat<LongWritable, Text> implements JobConfigurable {
  public RecordReader<LongWritable, Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException {
    reporter.setStatus(genericSplit.toString());
    // default delimiter
    String delimiter = job.get("textinputformat.record.delimiter");
    // Obtain the path of the file
    String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
    // Obtain a map of file path prefixes to delimiters based on configuration file parameters
    Map<String, String> pathToDelimiterMap = ... // obtained by parsing the configuration file
    for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
      // config path
      String configPath = entry.getKey();
      // if configPath is a prefix of filePath
      if (filePath.startsWith(configPath)) {
        // set the delimiter corresponding to the file path
        delimiter = entry.getValue();
      }
    }
    byte[] recordDelimiterBytes = null;
    if (null != delimiter) {
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    }
    return new LineRecordReader(job, (FileSplit) genericSplit, recordDelimiterBytes);
  }
}
{code}
After implementing the ability to set the record delimiter for input files according to their path, not only does it save the code needed to modify the delimiter, it is also very convenient for Hadoop and Spark, without frequent configuration changes. If you accept my idea, I hope you can assign the task to me. My GitHub account is: lvhu-goodluck I really hope to contribute code to the community. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7449) Add add-opens flag to container launch commands on JDK17
Benjamin Teke created MAPREDUCE-7449: Summary: Add add-opens flag to container launch commands on JDK17 Key: MAPREDUCE-7449 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7449 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mr-am, mrv2 Affects Versions: 3.4.0 Reporter: Benjamin Teke Assignee: Benjamin Teke To allow containers to launch on JDK17 nodes the add-opens flag should be added to the container launch commands if the node has JDK17+, and it shouldn't on previous JDKs. This behaviour should be configurable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7446) NegativeArraySizeException when running MR jobs with large data size
[ https://issues.apache.org/jira/browse/MAPREDUCE-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Szucs resolved MAPREDUCE-7446. Resolution: Fixed > NegativeArraySizeException when running MR jobs with large data size > > > Key: MAPREDUCE-7446 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7446 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Peter Szucs >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > We are using bit shifting to double the byte array in IFile's > [nextRawValue|https://github.infra.cloudera.com/CDH/hadoop/blob/bef14a39c7616e3b9f437a6fb24fc7a55a676b57/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFile.java#L437] > method to store the byte values in it. With large dataset it can easily > happen that we shift the leftmost bit when we are calculating the size of the > array, which can lead to a negative number as the array size, causing the > NegativeArraySizeException. > It would be safer to expand the backing array with a 1.5x factor, and have a > check not to extend Integer's max value during that. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
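A minimal sketch of the growth policy suggested in the description (roughly 1.5x growth capped at Integer.MAX_VALUE) next to the overflow-prone doubling; this is an illustration of the idea, not the committed IFile patch.
{code:java}
// Sketch of the growth policy described above; not the actual IFile change.
public class ArrayGrowthSketch {
  // Doubling via a left shift overflows once the size passes 2^30 entries.
  static int unsafeDouble(int current) {
    return current << 1; // 1_500_000_000 << 1 wraps around to a negative value
  }

  // Grow by ~1.5x, never below the required size, and clamp at Integer.MAX_VALUE.
  static int safeGrow(int current, int required) {
    long candidate = (long) current + (current >> 1); // 1.5x computed without int overflow
    long bounded = Math.min(candidate, Integer.MAX_VALUE);
    return (int) Math.max(bounded, required);
  }

  public static void main(String[] args) {
    System.out.println(unsafeDouble(1_500_000_000));               // -1294967296
    System.out.println(safeGrow(1_500_000_000, 1_600_000_000));    // 2147483647 (clamped)
  }
}
{code}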
[jira] [Reopened] (MAPREDUCE-7446) NegativeArraySizeException when running MR jobs with large data size
[ https://issues.apache.org/jira/browse/MAPREDUCE-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke reopened MAPREDUCE-7446: -- Reopening for 3.2/3.3 backport. > NegativeArraySizeException when running MR jobs with large data size > > > Key: MAPREDUCE-7446 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7446 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Peter Szucs >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > We are using bit shifting to double the byte array in IFile's > [nextRawValue|https://github.infra.cloudera.com/CDH/hadoop/blob/bef14a39c7616e3b9f437a6fb24fc7a55a676b57/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFile.java#L437] > method to store the byte values in it. With large dataset it can easily > happen that we shift the leftmost bit when we are calculating the size of the > array, which can lead to a negative number as the array size, causing the > NegativeArraySizeException. > It would be safer to expand the backing array with a 1.5x factor, and have a > check not to extend Integer's max value during that. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7448) Inconsistent Behavior for FileOutputCommitter V1 to commit successfully many times
ConfX created MAPREDUCE-7448: Summary: Inconsistent Behavior for FileOutputCommitter V1 to commit successfully many times Key: MAPREDUCE-7448 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7448 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened I turned on {{mapreduce.fileoutputcommitter.cleanup.skipped=true}} and then the version 1 of {{FileOutputCommitter}} can commit several times, which is unexpected. h2. Where's the problem In {{{}FileOutputCommitter.commitJobInternal{}}}, {noformat} if (algorithmVersion == 1) { for (FileStatus stat: getAllCommittedTaskPaths(context)) { mergePaths(fs, stat, finalOutput, context); } } if (skipCleanup) { LOG.info("Skip cleanup the _temporary folders under job's output " + "directory in commitJob."); ...{noformat} Here if we skip cleanup, the _temporary folder would not be deleted and the _SUCCESS file would also not be created, which cause the {{mergePaths}} next time to not fail. h2. How to reproduce # set {{{}mapreduce.fileoutputcommitter.cleanup.skipped{}}}={{{}true{}}} # run {{org.apache.hadoop.mapred.TestFileOutputCommitter#testCommitterWithDuplicatedCommitV1}} you should observe {noformat} java.lang.AssertionError: Duplicate commit successful: wrong behavior for version 1. at org.junit.Assert.fail(Assert.java:89) at org.apache.hadoop.mapred.TestFileOutputCommitter.testCommitterWithDuplicatedCommitInternal(TestFileOutputCommitter.java:295) at org.apache.hadoop.mapred.TestFileOutputCommitter.testCommitterWithDuplicatedCommitV1(TestFileOutputCommitter.java:269){noformat} For an easy reproduction, run the reproduce.sh in the attachment. We are happy to provide a patch if this issue is confirmed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
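One way to keep a repeated commit failing even when cleanup of _temporary is skipped is to check for the success marker before merging. The sketch below illustrates that guard under those assumptions; it is only a possible approach, not necessarily the fix the project adopted.
{code:java}
// Illustrative guard (not the committer's actual fix): even with cleanup skipped,
// a second commit can be rejected by checking whether the job was already committed.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DuplicateCommitGuardSketch {
  static final String SUCCEEDED_FILE_NAME = "_SUCCESS";

  static void commitOnce(FileSystem fs, Path outputPath) throws IOException {
    Path marker = new Path(outputPath, SUCCEEDED_FILE_NAME);
    if (fs.exists(marker)) {
      // Duplicate commit attempt: fail instead of silently merging again.
      throw new IOException("Output " + outputPath + " is already committed");
    }
    // ... mergePaths() of the committed task output would happen here ...
    fs.create(marker, false).close(); // write the marker even when cleanup is skipped
  }
}
{code}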
[jira] [Created] (MAPREDUCE-7447) Unnecessary NPE encountered when starting CryptoOutputStream with encrypted-intermediate-data
ConfX created MAPREDUCE-7447: Summary: Unnecessary NPE encountered when starting CryptoOutputStream with encrypted-intermediate-data Key: MAPREDUCE-7447 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7447 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened? Got NullPointerException when initializing a {{{}CryptoOutputStream{}}}. h2. Where's the bug? In line 106 of {{{}CryptoOutputStream{}}}, the code lacks a check to verify whether the key parameter is null or not. {noformat} public CryptoOutputStream(OutputStream out, CryptoCodec codec, int bufferSize, byte[] key, byte[] iv, long streamOffset, boolean closeOutputStream) throws IOException { ... this.key = key.clone();{noformat} As a result, when the configuration provides a null key, the key.clone() operation will throw a NullPointerException. It is essential to add a null check for the key parameter before using it. h2. How to reproduce? (1) set {{mapreduce.job.encrypted-intermediate-data}} to {{true}} (2) run {{org.apache.hadoop.mapreduce.task.reduce.TestMergeManager#testLargeMemoryLimits}} h2. Stacktrace {noformat} java.lang.NullPointerException at org.apache.hadoop.crypto.CryptoOutputStream.<init>(CryptoOutputStream.java:106) at org.apache.hadoop.fs.crypto.CryptoFSDataOutputStream.<init>(CryptoFSDataOutputStream.java:38) at org.apache.hadoop.mapreduce.CryptoUtils.wrapIfNecessary(CryptoUtils.java:141) at org.apache.hadoop.mapreduce.security.IntermediateEncryptedStream.wrapIfNecessary(IntermediateEncryptedStream.java:46) at org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput.<init>(OnDiskMapOutput.java:87) at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:274) at org.apache.hadoop.mapreduce.task.reduce.TestMergeManager.verifyReservedMapOutputType(TestMergeManager.java:309) at org.apache.hadoop.mapreduce.task.reduce.TestMergeManager.testLargeMemoryLimits(TestMergeManager.java:303){noformat} For an easy reproduction, run the reproduce.sh in the attachment. We are happy to provide a patch if this issue is confirmed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
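A minimal sketch of the null check the report asks for; the exception message and the trimmed constructor shape are illustrative, not the committed patch.
{code:java}
// Sketch of the missing argument check (illustrative only).
import java.io.IOException;

public class KeyCheckSketch {
  private final byte[] key;

  public KeyCheckSketch(byte[] key) throws IOException {
    if (key == null) {
      // Fail fast with a descriptive message instead of an NPE from key.clone().
      throw new IOException("Crypto key must not be null; check that intermediate "
          + "data encryption is fully configured");
    }
    this.key = key.clone();
  }
}
{code}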
[jira] [Created] (MAPREDUCE-7446) NegativeArraySizeException when running MR jobs with large data size
Peter Szucs created MAPREDUCE-7446: -- Summary: NegativeArraySizeException when running MR jobs with large data size Key: MAPREDUCE-7446 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7446 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Peter Szucs Assignee: Peter Szucs We are using bit shifting to double the byte array in IFile's [nextRawValue|https://github.infra.cloudera.com/CDH/hadoop/blob/bef14a39c7616e3b9f437a6fb24fc7a55a676b57/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFile.java#L437] method to store the byte values in it. With large dataset it can easily happen that we shift the leftmost bit when we are calculating the size of the array, which can lead to a negative number as the array size, causing the NegativeArraySizeException. It would be safer to expand the backing array with a 1.5x factor, and have a check not to extend Integer's max value during that. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7445) ShuffleSchedulerImpl causes ArithmeticException due to improper detailsInterval value checking
ConfX created MAPREDUCE-7445: Summary: ShuffleSchedulerImpl causes ArithmeticException due to improper detailsInterval value checking Key: MAPREDUCE-7445 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7445 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened There is no value checking for parameter {{{}mapreduce.reduce.shuffle.maxfetchfailures{}}}. This may cause improper calculations and crashes the system like division by 0. h2. Buggy code In {{{}ShuffleSchedulerImpl.java{}}}, there is no value checking for {{maxFetchFailuresBeforeReporting}} and this variable is directly passed to method {{{}checkAndInformMRAppMaster{}}}. When {{maxFetchFailuresBeforeReporting }} is mistakenly set to 0, the code would cause division by 0 and throw ArithmeticException to crash the system. {noformat} private void checkAndInformMRAppMaster( ... if (connectExcpt || (reportReadErrorImmediately && readError) || ((failures % maxFetchFailuresBeforeReporting) == 0) || hostFailed) { ... }{noformat} h2. How to reproduce (1) set {{{}mapreduce.reduce.shuffle.maxfetchfailures{}}}={{{}0{}}}, {{{}mapreduce.reduce.shuffle.notify.readerror{}}}={{{}false{}}} (2) run {{mvn surefire:test -Dtest=org.apache.hadoop.mapreduce.task.reduce.TestShuffleScheduler#TestSucceedAndFailedCopyMap}} h2. Stacktrace {noformat} java.lang.ArithmeticException: / by zero at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkAndInformMRAppMaster(ShuffleSchedulerImpl.java:347) at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:308) at org.apache.hadoop.mapreduce.task.reduce.TestShuffleScheduler.TestSucceedAndFailedCopyMap(TestShuffleScheduler.java:285){noformat} For an easy reproduction, run the reproduce.sh in the attachment. We are happy to provide a patch if this issue is confirmed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
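A small sketch of the kind of guard that keeps the reporting modulo safe when the threshold is misconfigured. Whether the real fix clamps, warns, or rejects the value is a project decision, so treat this only as an illustration.
{code:java}
// Sketch only: clamp the configured threshold so "failures % threshold" never divides by zero.
public class FetchFailureThresholdSketch {
  static int sanitize(int configuredMaxFetchFailures) {
    if (configuredMaxFetchFailures <= 0) {
      // A zero (or negative) value would make the reporting modulo divide by zero.
      return 1; // report on every failure instead of crashing
    }
    return configuredMaxFetchFailures;
  }

  static boolean shouldReport(int failures, int maxFetchFailuresBeforeReporting) {
    int threshold = sanitize(maxFetchFailuresBeforeReporting);
    return failures % threshold == 0;
  }

  public static void main(String[] args) {
    System.out.println(shouldReport(3, 0)); // true, and no ArithmeticException
  }
}
{code}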
[jira] [Resolved] (MAPREDUCE-7442) exception message is not intuitive when accessing the job configuration web UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Fei resolved MAPREDUCE-7442. Resolution: Fixed > exception message is not intusive when accessing the job configuration web UI > - > > Key: MAPREDUCE-7442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster > Environment: >Reporter: Jiandan Yang >Priority: Major > Labels: pull-request-available > Attachments: image-2023-07-14-11-23-10-762.png > > > I launched a Teragen job on hadoop-3.3.4 cluster. > The web occured an error when I clicked the link of Configuration of Job. The > error page said "HTTP ERROR 500 java.lang.IllegalArgumentException: RFC6265 > Cookie values may not contain character: [ ]", and I can't find any solution > by this error message. > I found some additional stacks in the log of AM, and those stacks reflect > yarn did not have the permission of stagging directory. When I give > permission to yarn I can access configuration page. > I think the problem is that the error page does not provide useful or > meaningful prompts. > It's better if there are message about "yarn does not have hdfs permission" > in the error page. > The snapshot of error page is as follows: > !image-2023-07-14-11-23-10-762.png! > The error logs of am are as folllows: > {code:java} > 2023-07-14 11:20:08,218 ERROR [qtp1379757019-43] > org.apache.hadoop.yarn.webapp.View: Error while reading > hdfs://dmp/user/ubd_dmp_test/.staging/job_1689296289020_0006/job.xml > org.apache.hadoop.security.AccessControlException: Permission denied: > user=yarn, access=EXECUTE, > inode="/user/ubd_dmp_test/.staging":ubd_dmp_test:ubd_dmp_test:drwx-- > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:506) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:422) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:333) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:713) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1892) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1910) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:727) > at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2089) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:762) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:458) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604) > at > 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.r
[jira] [Created] (MAPREDUCE-7444) LineReader improperly sets AdditionalRecord, causing extra line read
ConfX created MAPREDUCE-7444: Summary: LineReader improperly sets AdditionalRecord, causing extra line read Key: MAPREDUCE-7444 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7444 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened: After setting {{io.file.buffer.size}} precisely to some values, the LineReader sometimes mistakenly reads an extra line after the split. Note that this bug is highly critical. A malicious party can create a file that makes the while loop run forever given knowledge of the buffer size setting and control over the file being read. h2. Buggy code: Consider reading a record file with multiple lines. In MapReduce, this is done using [org.apache.hadoop.mapred.LineRecordReader|https://github.com/confuzz-org/fuzz-hadoop/blob/confuzz/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java#L249], which then calls [org.apache.hadoop.util.LineReader.readCustomLine|https://github.com/confuzz-org/fuzz-hadoop/blob/confuzz/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java#L268] if you specify your own delimiter ('\n' or "\r\n"). Intrinsically, if the file is compressed, the reading of the file is implemented using [org.apache.hadoop.mapreduce.lib.input. CompressedSplitLineReader|https://github.com/confuzz-org/fuzz-hadoop/blob/confuzz/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CompressedSplitLineReader.java]. Notice that in its [fillBuffer|https://github.com/confuzz-org/fuzz-hadoop/blob/confuzz/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CompressedSplitLineReader.java#L128] function to refill the buffer, when {{inDelimiter}} is set to {{{}true{}}}, if haven't reached EOF the switch for additional record is set to true (suppose not using CRLF). Now, we sometimes want to read the file in different splits (or up until a specific position). Consider the case where the end of the split is a delimiter. This is where interesting thing happen. We use an easy example to illustrate this. Suppose that the delimiter is 10 bytes long, and the buffer size is set precise enough such that the end of the second to last buffer ends right in the exact middle of the delimiter. In the final round of loop in {{{}readCustomLine{}}}, the buffer would be refilled first: {noformat} ... if (bufferPosn >= bufferLength) { startPosn = bufferPosn = 0; bufferLength = fillBuffer(in, buffer, ambiguousByteCount > 0); ...{noformat} note that {{needAdditionalRecord}} would be set to {{true}} in {{{}CompressedSplitLineReader{}}}. Next, after some reading and string comparing the second half of the final delimiter, the function would try to calculate the bytes read: {noformat} int readLength = bufferPosn - startPosn; bytesConsumed += readLength; int appendLength = readLength - delPosn; ... if (appendLength >= 0 && ambiguousByteCount > 0) { ... unsetNeedAdditionalRecordAfterSplit(); }{noformat} note that here the {{readLength}} would be 5 (half of the delimiter length) and {{appendLength}} would be {{{}5-10=-5{}}}. Thus, the condition for the {{if}} clause would be false and the {{needAdditionalRecord}} would not be unset. In fact, the {{needAdditionalRecord}} switch would never be unset all the way until {{readCustomLine}} ends. 
However, it should be set to {{{}false{}}}, or the next time the user calls {{next}} the reader would read at least another line since the {{while}} condition in {{{}next{}}}: {noformat} while (getFilePosition() <= end || in.needAdditionalRecordAfterSplit()) {{noformat} is true due to not reaching EOF and {{needAdditionalRecord}} set to {{{}true{}}}. This can lead to severe problems if every time after the split the start of the final buffer falls in the delimiter. h2. How to reproduce: (1) Set {{io.file.buffer.size }} to {{188}} (2) Run test: {{org.apache.hadoop.mapred.TestLineRecordReader#testBzipWithMultibyteDelimiter}} For an easy reproduction please run the {{reproduce.sh}} included in the attachment. h2. StackTrace: {noformat} java.lang.AssertionError: Unexpected number of records in split expected:<60> but was:<61> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.apache.hadoop.mapred.TestLineRecordReader.testSplitRecordsForFile(TestLineRecordReader.java:110) at org.apache.hadoop.mapred.TestLineRecordReader.testBzipWithMultibyteDelimiter(TestLineRecordReader.java:684){noformat} For an easy reproduction, run the
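To make the arithmetic above concrete, the following self-contained snippet reproduces the numbers from the description (a 10-byte delimiter with 5 bytes left after the refill) and shows why the guard that should unset needAdditionalRecordAfterSplit is never taken. It is an illustration of the reported behaviour, not LineReader code.
{code:java}
// Worked example of the failure condition described above (illustration only).
public class SplitDelimiterArithmeticSketch {
  public static void main(String[] args) {
    int delimiterLength = 10;
    int ambiguousByteCount = 5;    // first half of the delimiter seen before the refill
    int startPosn = 0;
    int bufferPosn = 5;            // only the second half of the delimiter was read
    int delPosn = delimiterLength; // the full delimiter has now been matched

    int readLength = bufferPosn - startPosn;  // 5
    int appendLength = readLength - delPosn;  // 5 - 10 = -5

    // Guard from readCustomLine: false here, so needAdditionalRecordAfterSplit stays
    // true and the reader emits one extra record after the split.
    boolean unsetsAdditionalRecord = appendLength >= 0 && ambiguousByteCount > 0;
    System.out.println("appendLength = " + appendLength
        + ", guard taken = " + unsetsAdditionalRecord); // appendLength = -5, guard taken = false
  }
}
{code}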
[jira] [Created] (MAPREDUCE-7443) state polluter for system file permissions
ConfX created MAPREDUCE-7443: Summary: state polluter for system file permissions Key: MAPREDUCE-7443 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7443 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: ConfX Attachments: reproduce.sh h2. What happened: After setting {{fs.permissions.umask-mode}} to a value that removes write permission from a file, the write permission is also removed at the host (OS) level. h2. Buggy code: When creating {{target/test-dir/output}} the RawLocalFileSystem directly manipulates the system permission (line 978 of {{{}RawLocalFileSystem.java{}}}): {noformat} String perm = String.format("%04o", permission.toShort()); Shell.execCommand(Shell.getSetPermissionCommand(perm, false, FileUtil.makeShellPath(pathToFile(p), true)));{noformat} If the permission turns off write access to the folder, the test fails with permission denied. However, the test does not clean the folder up properly (by chmod-ing and removing it in an @After method), so all subsequent runs are polluted. h2. StackTrace: {noformat} java.io.IOException: Mkdirs failed to create file:/home/ctestfuzz/fuzz-hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/output/_temporary/1/_temporary/attempt_200707121733_0001_m_00_0 (exists=false, cwd=file:/home/ctestfuzz/fuzz-hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core), at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:515), at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500), at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195), at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1081), at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:125), at org.apache.hadoop.mapred.TestFileOutputCommitter.testRecoveryInternal(TestFileOutputCommitter.java:109), at org.apache.hadoop.mapred.TestFileOutputCommitter.testRecoveryUpgradeV1V2(TestFileOutputCommitter.java:171){noformat} h2. How to reproduce: There are two ways to reproduce: # (1) Set {{fs.permissions.umask-mode}} to {{243}} (2) Run test: {{org.apache.hadoop.mapred.TestFileOutputCommitter#testRecoveryUpgradeV1V2}} and observe an IOException (3) Check to see that the current user has lost write access to {{target/test-dir/output}} # (1) Add an {{assertTrue(false);}} to line 112 of {{TestFileOutputCommitter.java}} to simulate the test failing in the middle (2) Run test: {{org.apache.hadoop.mapred.TestFileOutputCommitter#testRecoveryUpgradeV1V2}} and observe an AssertionError (3) Set {{fs.permissions.umask-mode}} to {{243}} (4) Run test: {{org.apache.hadoop.mapred.TestFileOutputCommitter#testRecoveryUpgradeV1V2}} and observe an IOException. (5) Check to see that the current user has lost write access to {{test-dir/output/_temporary/1/_temporary/attempt_200707121733_0001_m_00_0}} For an easy reproduction, run the reproduce.sh in the attachment. We are happy to provide a patch if this issue is confirmed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
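A sketch of the cleanup the report suggests, assuming a JUnit 4 @After hook and the Hadoop FileUtil helpers; the directory path is the one named in the report, and the exact cleanup the project adopts may differ.
{code:java}
// Sketch of the suggested cleanup (not the project's actual patch): restore write
// permission on the test output directory and remove it after every run, so a
// restrictive fs.permissions.umask-mode cannot pollute later executions.
import java.io.File;
import org.apache.hadoop.fs.FileUtil;
import org.junit.After;

public class OutputDirCleanupSketch {
  // Assumed location; TestFileOutputCommitter builds its own path under test-dir.
  private final File outputDir = new File("target/test-dir/output");

  @After
  public void cleanupOutputDir() throws Exception {
    if (outputDir.exists()) {
      // chmod back to writable first, otherwise the delete itself can fail.
      FileUtil.chmod(outputDir.getAbsolutePath(), "u+rwx", true);
      FileUtil.fullyDelete(outputDir);
    }
  }
}
{code}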
[jira] [Created] (MAPREDUCE-7442) exception message is not intuitive when accessing the job configuration web UI
Jiandan Yang created MAPREDUCE-7442: Summary: exception message is not intusive when accessing the job configuration web UI Key: MAPREDUCE-7442 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7442 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Environment: Reporter: Jiandan Yang Attachments: image-2023-07-14-11-23-10-762.png I launched a Teragen job on hadoop-3.3.4 cluster. The web occured an error when I clicked the link of Configuration of Job. The error page said "HTTP ERROR 500 java.lang.IllegalArgumentException: RFC6265 Cookie values may not contain character: [ ]", and I can't find a solution by this error message. I found some additional stacks in the log of AM, and those stacks reflect yarn did not have the permission of stagging directory. When I give permission to yarn I can access configuration page. I think the problem is that the error page does not provide useful or meaningful prompts. It's better if there are message about "yarn does not have hdfs permission" in the error page. The snapshot of error page is as follows: !image-2023-07-14-11-23-10-762.png! The error logs of am are as folllows: {code:java} 2023-07-14 11:20:08,218 ERROR [qtp1379757019-43] org.apache.hadoop.yarn.webapp.View: Error while reading hdfs://dmp/user/ubd_dmp_test/.staging/job_1689296289020_0006/job.xml org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=EXECUTE, inode="/user/ubd_dmp_test/.staging":ubd_dmp_test:ubd_dmp_test:drwx-- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:506) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:422) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:333) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:713) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1892) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1910) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:727) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2089) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:762) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:458) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572) at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:902) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:889) at org.apache.hadoop.hdfs.DFSClient.g
[jira] [Resolved] (MAPREDUCE-7432) Make Manifest Committer the default for abfs and gcs
[ https://issues.apache.org/jira/browse/MAPREDUCE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7432. --- Fix Version/s: 3.4.0 3.3.9 Release Note: By default, the mapreduce manifest committer is used for jobs working with abfs and gcs.. Hadoop mapreduce jobs will pick this up automatically; for Spark it is a bit complicated: read the docs to see the steps required. Resolution: Fixed > Make Manifest Committer the default for abfs and gcs > > > Key: MAPREDUCE-7432 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7432 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: client >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > Switch to the manifest committer as default for abfs and gcs > * abfs: needed for performance, scale and resilience under some failure modes > * gcs: provides correctness through atomic task commit and better job commit > performance -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7441) Race condition in closing FadvisedFileRegion
[ https://issues.apache.org/jira/browse/MAPREDUCE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth resolved MAPREDUCE-7441. --- Fix Version/s: 3.4.0 Resolution: Fixed > Race condition in closing FadvisedFileRegion > > > Key: MAPREDUCE-7441 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7441 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: Benjamin Teke >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > This issue is similar to the one described in MAPREDUCE-7095, just for > FadvisedFileRegion.transferSuccessful. There are warning messages when > multiple threads are calling the transferSuccessful method: > {code:java} > 2023-05-25 08:41:57,288 WARN org.apache.hadoop.mapred.FadvisedFileRegion: > Failed to manage OS cache for > /hadoop/data04/yarn/nm/usercache/hive/appcache/application_1684916804740_8245/output/attempt_1684916804740_8245_1_00_001154_0_10003/file.out > EBADF: Bad file descriptor > at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method) > at > org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:271) > at > org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:148) > at > org.apache.hadoop.mapred.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:163) > at > org.apache.hadoop.mapred.ShuffleChannelHandler.lambda$sendMapOutput$0(ShuffleChannelHandler.java:516) > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) > at > io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7441) Race condition in closing FadvisedFileRegion
Benjamin Teke created MAPREDUCE-7441: Summary: Race condition in closing FadvisedFileRegion Key: MAPREDUCE-7441 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7441 Project: Hadoop Map/Reduce Issue Type: Bug Components: yarn Affects Versions: 3.4.0 Reporter: Benjamin Teke This issue is similar to the one described in MAPREDUCE-7095, just for FadvisedFileRegion.transferSuccessful. There are warning messages when multiple threads are calling the transferSuccessful method: {code:java} 2023-05-25 08:41:57,288 WARN org.apache.hadoop.mapred.FadvisedFileRegion: Failed to manage OS cache for /hadoop/data04/yarn/nm/usercache/hive/appcache/application_1684916804740_8245/output/attempt_1684916804740_8245_1_00_001154_0_10003/file.out EBADF: Bad file descriptor at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method) at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:271) at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:148) at org.apache.hadoop.mapred.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:163) at org.apache.hadoop.mapred.ShuffleChannelHandler.lambda$sendMapOutput$0(ShuffleChannelHandler.java:516) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
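A sketch of one way to avoid racing calls into posix_fadvise on a closed descriptor: let only the first transferSuccessful() proceed, and only while the channel is still open. This is an illustration; the committed fix may guard differently.
{code:java}
// Illustrative guard, not the actual FadvisedFileRegion change.
import java.nio.channels.FileChannel;
import java.util.concurrent.atomic.AtomicBoolean;

public class TransferSuccessfulGuardSketch {
  private final AtomicBoolean cacheManaged = new AtomicBoolean(false);
  private final FileChannel channel;

  public TransferSuccessfulGuardSketch(FileChannel channel) {
    this.channel = channel;
  }

  public void transferSuccessful() {
    // Only the first caller proceeds; late callers on a closed descriptor are no-ops.
    if (cacheManaged.compareAndSet(false, true) && channel.isOpen()) {
      manageOsCache();
    }
  }

  private void manageOsCache() {
    // posix_fadvise(DONTNEED) would be issued here in the real FadvisedFileRegion.
  }
}
{code}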
[jira] [Resolved] (MAPREDUCE-7435) ManifestCommitter OOM on azure job
[ https://issues.apache.org/jira/browse/MAPREDUCE-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7435. --- Fix Version/s: 3.4.0 3.3.9 Resolution: Fixed > ManifestCommitter OOM on azure job > -- > > Key: MAPREDUCE-7435 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7435 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > I've got some reports of spark jobs OOM if the manifest committer is used > through abfs. > either the manifests are using too much memory, or something is not working > with azure stream memory use (or both). > before proposing a solution, first step should be to write a test to load > many, many manifests, each with lots of dirs and files to see what breaks. > note: we did have OOM issues with the s3a committer, on teragen but those > structures have to include every etag of every block, so the manifest size is > O(blocks); the new committer is O(files + dirs). > {code} > java.lang.OutOfMemoryError: Java heap space > at > org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.readOneBlock(AbfsInputStream.java:314) > at > org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.read(AbfsInputStream.java:267) > at java.io.DataInputStream.read(DataInputStream.java:149) > at > com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.ensureLoaded(ByteSourceJsonBootstrapper.java:539) > at > com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.detectEncoding(ByteSourceJsonBootstrapper.java:133) > at > com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructParser(ByteSourceJsonBootstrapper.java:256) > at com.fasterxml.jackson.core.JsonFactory._createParser(JsonFactory.java:1656) > at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1085) > at > com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3585) > at > org.apache.hadoop.util.JsonSerialization.fromJsonStream(JsonSerialization.java:164) > at org.apache.hadoop.util.JsonSerialization.load(JsonSerialization.java:279) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.TaskManifest.load(TaskManifest.java:361) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.impl.ManifestStoreOperationsThroughFileSystem.loadTaskManifest(ManifestStoreOperationsThroughFileSystem.java:133) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage.lambda$loadManifest$6(AbstractJobOrTaskStage.java:493) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage$$Lambda$231/1813048085.apply(Unknown > Source) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$217/489150849.apply(Unknown > Source) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage.loadManifest(AbstractJobOrTaskStage.java:492) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage.fetchTaskManifest(LoadManifestsStage.java:170) > at > 
org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage.processOneManifest(LoadManifestsStage.java:138) > at > org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage$$Lambda$229/137752948.run(Unknown > Source) > at > org.apache.hadoop.util.functional.TaskPool$Builder.lambda$runParallel$0(TaskPool.java:410) > at > org.apache.hadoop.util.functional.TaskPool$Builder$$Lambda$230/467893357.run(Unknown > Source) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7440) Enhancing Security in Hadoop Delegation Tokens: Phasing out DIGEST-MD5 Auth mechanism
Saurabh Rai created MAPREDUCE-7440: -- Summary: Enhancing Security in Hadoop Delegation Tokens: Phasing out DIGEST-MD5 Auth mechanism Key: MAPREDUCE-7440 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7440 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Reporter: Saurabh Rai SASL secured connections are commonly configured to negotiate confidential (encrypted) connections, known as the "auth-conf" quality of protection. This ensures both authentication and data encryption, enhancing the security of wire communication. The use of AES encryption, negotiated on "auth-conf" connections with Kerberos/GSSAPI, meets the requirements of modern commercial and governmental cryptographic regulations and policies. However, when deploying a YARN job that incorporates a network client expecting to negotiate the same level of security ({color:#1d1c1d}for example an HBase client, but any code that integrates Hadoop's UGI and related and the JRE's SASLClient will be affected{color}). The problem arises from the fact that delegation tokens, the only hard-coded option available for tasks, rely on the Digest-MD5 SASL mechanism. Unfortunately, the Digest-MD5 negotiation standard supports only five outdated and slow ciphers for SASL confidentiality: RC4 (40 bits key length), RC4 (56 bits key length), RC4 (128 bits key length), DES, and Triple DES. Notably, the use of RC4 has been prohibited by the IETF since 2015, and DES was compromised in 1999 and subsequently withdrawn as a standard by NIST. The limitations of the Digest-MD5 mechanism have significant implications for compliance with modern cryptographic regulations and policies that mandate wire encryption. As a result, YARN applications utilizing Digest-MD5 for confidentiality negotiation cannot adhere to these requirements. It is worth noting that this issue is not documented in the Hadoop documentation or logs, potentially leading developers and operators to remain unaware of the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7439) JHS MoveIntermediateToDone thread pool should override CallerContext
dzcxzl created MAPREDUCE-7439: - Summary: JHS MoveIntermediateToDone thread pool should override CallerContext Key: MAPREDUCE-7439 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7439 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Reporter: dzcxzl The job history server provides RPC services. If a client passes a CallerContext and the RPC handler thread creates the MoveIntermediateToDone thread, the CallerContext is inherited by that thread (CallerContext supports parent-child thread variable inheritance), so the MoveIntermediateToDone thread may keep carrying that client's CallerContext for all later work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
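A sketch of the override the summary asks for, assuming the pooled task sets its own CallerContext before doing any HDFS work; the context string is made up for illustration and the actual change may set it elsewhere.
{code:java}
// Illustrative only: replace the CallerContext inherited from the RPC handler
// thread so downstream HDFS audit logs attribute the move to the history server.
import org.apache.hadoop.ipc.CallerContext;

public class MoveIntermediateToDoneSketch implements Runnable {
  @Override
  public void run() {
    // Hypothetical context string; overrides whatever the pool thread inherited.
    CallerContext.setCurrent(
        new CallerContext.Builder("mr_jhs_moveIntermediateToDone").build());
    // ... move the intermediate history files to the done directory ...
  }
}
{code}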
[jira] [Resolved] (MAPREDUCE-7438) Support removal of only selective node states in untracked removal flow
[ https://issues.apache.org/jira/browse/MAPREDUCE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mudit Sharma resolved MAPREDUCE-7438. - Resolution: Invalid > Support removal of only selective node states in untracked removal flow > --- > > Key: MAPREDUCE-7438 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7438 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mudit Sharma >Priority: Major > Labels: pull-request-available > > Currently inactive nodes are removed from the Yarn local memory irrespective > of which state they are in. This makes the node removal process not too much > configurable > After this patch: https://issues.apache.org/jira/browse/YARN-10854 > If autoscaling is enabled, lot many nodes go into DECOMMISSIONED state but > still other states like LOST, SHUTDOWN are very less and systems might want > them to be still visible on UI for better tracking > The proposal is to introduce a new config, which when set, will allow only > selective node states to be removed after going into untracked state. > > Attaching PR for reference: https://github.com/apache/hadoop/pull/5680 > Any thoughts/suggestions/feedbacks are welcome! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7438) Support removal of only selective node states in untracked removal flow
Mudit Sharma created MAPREDUCE-7438: --- Summary: Support removal of only selective node states in untracked removal flow Key: MAPREDUCE-7438 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7438 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Mudit Sharma Currently, inactive nodes are removed from YARN's local memory irrespective of which state they are in, which leaves the node removal process with little configurability. After the patch https://issues.apache.org/jira/browse/YARN-10854, when autoscaling is enabled a large number of nodes go into the DECOMMISSIONED state, while nodes in other states such as LOST and SHUTDOWN are far fewer, and operators might want those to stay visible on the UI for better tracking. The proposal is to introduce a new config which, when set, allows only selected node states to be removed after they become untracked. Any thoughts/suggestions/feedback are welcome! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7437) spotbugs complaining about .Fetcher's update of a nonatomic static counter
[ https://issues.apache.org/jira/browse/MAPREDUCE-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7437. --- Fix Version/s: 3.4.0 3.3.9 Resolution: Fixed > spotbugs complaining about .Fetcher's update of a nonatomic static counter > -- > > Key: MAPREDUCE-7437 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7437 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build, client >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > I'm having to do this to get MAPREDUCE-7435 through the build; spotbugs is > complaining about the Fetcher constructor incrementing a non-static shared > counter. Which is true, just odd it has only just surfaced. > going to fix as a standalone patch but include that in the commit chain of > that PR too -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
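The usual spotbugs-friendly pattern for this class of warning is to hand out instance ids from an AtomicInteger instead of incrementing a plain shared field in the constructor; the sketch below shows that pattern and is not necessarily the exact change made to Fetcher.
{code:java}
// Illustrative pattern only (not necessarily the committed Fetcher change).
import java.util.concurrent.atomic.AtomicInteger;

public class FetcherIdSketch {
  private static final AtomicInteger NEXT_ID = new AtomicInteger(0);

  private final int id;

  public FetcherIdSketch() {
    // Atomic increment: safe even if instances are constructed from several threads,
    // and it no longer trips the non-atomic shared-counter spotbugs check.
    this.id = NEXT_ID.incrementAndGet();
  }

  public int getId() {
    return id;
  }
}
{code}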
[jira] [Created] (MAPREDUCE-7437) spotbugs complaining about .Fetcher's update of a nonatomic static counter
Steve Loughran created MAPREDUCE-7437: - Summary: spotbugs complaining about .Fetcher's update of a nonatomic static counter Key: MAPREDUCE-7437 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7437 Project: Hadoop Map/Reduce Issue Type: Bug Components: build, client Affects Versions: 3.4.0 Reporter: Steve Loughran Assignee: Steve Loughran I'm having to do this to get MAPREDUCE-7435 through the build; spotbugs is complaining about the Fetcher constructor incrementing a non-static shared counter. Which is true, just odd it has only just surfaced. going to fix as a standalone patch but include that in the commit chain of that PR too -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7436) Fix few testcase failure for org.apache.hadoop.mapreduce.v2
Susheel Gupta created MAPREDUCE-7436: Summary: Fix few testcase failure for org.apache.hadoop.mapreduce.v2 Key: MAPREDUCE-7436 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7436 Project: Hadoop Map/Reduce Issue Type: Test Components: mr-am Reporter: Susheel Gupta 1) TestUberAM#testThreadDumpOnTaskTimeout {noformat} java.lang.AssertionError: No AppMaster log found! Expected :1 Actual :0 at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.apache.hadoop.mapreduce.v2.TestMRJobs.testThreadDumpOnTaskTimeout(TestMRJobs.java:1281) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750){noformat} 2) TestMRJobs#testThreadDumpOnTaskTimeout {noformat} java.lang.AssertionError: No thread dump at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.assertTrue(Assert.java:42) at org.apache.hadoop.mapreduce.v2.TestMRJobs.testThreadDumpOnTaskTimeout(TestMRJobs.java:1273) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750) {noformat} 3) TestMRJobsWithProfiler#testDefaultProfiler {noformat} java.lang.AssertionError: Expected :4 Actual :0 at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:633) at org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:225) at org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDefaultProfiler(TestMRJobsWithProfiler.java:116) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750){noformat} 4) TestMRJobsWithProfiler#testDifferentProfilers {noformat} java.lang.AssertionError: Expected :4 Actual :0 at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:633
[jira] [Created] (MAPREDUCE-7435) ManifestCommitter OOM on azure job
Steve Loughran created MAPREDUCE-7435: - Summary: ManifestCommitter OOM on azure job Key: MAPREDUCE-7435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7435 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 3.3.5 Reporter: Steve Loughran Assignee: Steve Loughran I've got some reports of spark jobs OOM if the manifest committer is used through abfs. either the manifests are using too much memory, or something is not working with azure stream memory use (or both). before proposing a solution, first step should be to write a test to load many, many manifests, each with lots of dirs and files to see what breaks. note: we did have OOM issues with the s3a committer, on teragen but those structures have to include every etag of every block, so the manifest size is O(blocks); the new committer is O(files + dirs). {code} java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.readOneBlock(AbfsInputStream.java:314) at org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.read(AbfsInputStream.java:267) at java.io.DataInputStream.read(DataInputStream.java:149) at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.ensureLoaded(ByteSourceJsonBootstrapper.java:539) at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.detectEncoding(ByteSourceJsonBootstrapper.java:133) at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructParser(ByteSourceJsonBootstrapper.java:256) at com.fasterxml.jackson.core.JsonFactory._createParser(JsonFactory.java:1656) at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1085) at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3585) at org.apache.hadoop.util.JsonSerialization.fromJsonStream(JsonSerialization.java:164) at org.apache.hadoop.util.JsonSerialization.load(JsonSerialization.java:279) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.TaskManifest.load(TaskManifest.java:361) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.impl.ManifestStoreOperationsThroughFileSystem.loadTaskManifest(ManifestStoreOperationsThroughFileSystem.java:133) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage.lambda$loadManifest$6(AbstractJobOrTaskStage.java:493) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage$$Lambda$231/1813048085.apply(Unknown Source) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$217/489150849.apply(Unknown Source) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.AbstractJobOrTaskStage.loadManifest(AbstractJobOrTaskStage.java:492) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage.fetchTaskManifest(LoadManifestsStage.java:170) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage.processOneManifest(LoadManifestsStage.java:138) at org.apache.hadoop.mapreduce.lib.output.committer.manifest.stages.LoadManifestsStage$$Lambda$229/137752948.run(Unknown Source) at org.apache.hadoop.util.functional.TaskPool$Builder.lambda$runParallel$0(TaskPool.java:410) at 
org.apache.hadoop.util.functional.TaskPool$Builder$$Lambda$230/467893357.run(Unknown Source) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved MAPREDUCE-7428. - Resolution: Fixed > Fix failures related to Junit 4 to Junit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > --- > > Key: MAPREDUCE-7428 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Akira Ajisaka >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > > Few test are getting failed due to Junit 4 to Junit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7434) Fix testFailure TestShuffleHandler.testMapFileAccess
Tamas Domok created MAPREDUCE-7434: -- Summary: Fix testFailure TestShuffleHandler.testMapFileAccess Key: MAPREDUCE-7434 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7434 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.4.0 Reporter: Tamas Domok Assignee: Tamas Domok https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1143/testReport/junit/org.apache.hadoop.mapred/TestShuffleHandler/testMapFileAccess/ {code} Error Message Server returned HTTP response code: 500 for URL: http://127.0.0.1:13562/mapOutput?job=job_1_0001=0=attempt_1_0001_m_01_0 Stacktrace java.io.IOException: Server returned HTTP response code: 500 for URL: http://127.0.0.1:13562/mapOutput?job=job_1_0001=0=attempt_1_0001_m_01_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1902) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1500) at org.apache.hadoop.mapred.TestShuffleHandler.testMapFileAccess(TestShuffleHandler.java:292) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750) Standard Output 12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleConnections with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of current shuffle connections]) 12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableCounterLong org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputBytes with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[Shuffle output in bytes]) 12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableCounterInt org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputsFailed with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of failed shuffle outputs]) 12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableCounterInt org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputsOK with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of succeeeded shuffle outputs]) 12:04:17.466 [Time-limited test] 
DEBUG o.a.h.m.impl.MetricsSystemImpl - ShuffleMetrics, Shuffle output metrics 12:04:17.467 [Time-limited test] DEBUG o.a.hadoop.service.AbstractService - Service: mapreduce_shuffle entered state INITED 12:04:17.477 [Time-limited test] DEBUG o.a.hadoop.service.AbstractService - Config has been overridden during init 12:04:17.478 [Time-limited test] INFO org.apache.hadoop.mapred.IndexCache - IndexCache created with max memory = 10485760 12:04:17.479 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleConnections with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of current shuffle connections]) 12:04:17.479 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableCounterLong org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputBytes with annotation @org.apache.hadoop.metrics2
[jira] [Reopened] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reopened MAPREDUCE-7428: - > Fix failures related to Junit 4 to Junit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > --- > > Key: MAPREDUCE-7428 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Akira Ajisaka >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > > A few tests are failing due to the JUnit 4 to JUnit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7433) Remove unused mapred/LoggingHttpResponseEncoder.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke resolved MAPREDUCE-7433. -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 Resolution: Fixed > Remove unused mapred/LoggingHttpResponseEncoder.java > > > Key: MAPREDUCE-7433 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7433 > Project: Hadoop Map/Reduce > Issue Type: Task >Reporter: Tamas Domok >Assignee: Tamas Domok >Priority: Major > Labels: pull-request-available > > It's no longer needed after MAPREDUCE-7431 (I forgot to include the removal > in the previous PR). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7433) Remove unused mapred/LoggingHttpResponseEncoder.java
Tamas Domok created MAPREDUCE-7433: -- Summary: Remove unused mapred/LoggingHttpResponseEncoder.java Key: MAPREDUCE-7433 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7433 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Tamas Domok Assignee: Tamas Domok It's no longer needed after MAPREDUCE-7431 (I forgot to include the removal in the previous PR). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7432) Make Manifest Committer the default for abfs and gcs
Steve Loughran created MAPREDUCE-7432: - Summary: Make Manifest Committer the default for abfs and gcs Key: MAPREDUCE-7432 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7432 Project: Hadoop Map/Reduce Issue Type: New Feature Components: client Affects Versions: 3.3.5 Reporter: Steve Loughran Switch to the manifest committer as default for abfs and gcs * abfs: needed for performance, scale and resilience under some failure modes * gcs: provides correctness through atomic task commit and better job commit performance -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7375) JobSubmissionFiles don't set right permission after mkdirs
[ https://issues.apache.org/jira/browse/MAPREDUCE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved MAPREDUCE-7375. -- Fix Version/s: 3.4.0 3.2.5 3.3.9 Resolution: Fixed > JobSubmissionFiles don't set right permission after mkdirs > -- > > Key: MAPREDUCE-7375 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7375 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.3.2 >Reporter: Zhang Dongsheng >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.2.5, 3.3.9 > > Attachments: MAPREDUCE-7375.patch > > Time Spent: 2h 40m > Remaining Estimate: 0h > > JobSubmissionFiles provides getStagingDir to get the staging directory. If > stagingArea is missing, the method creates a new directory with: > {quote}fs.mkdirs(stagingArea, new FsPermission(JOB_DIR_PERMISSION));{quote} > This creates the new directory with JOB_DIR_PERMISSION, but the permission > is filtered by the umask. If the umask is too strict, the effective permission may be 000 (if > the umask is 700). So we should set the permission explicitly after creating the directory. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
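A minimal sketch of the fix described above, using only the public FileSystem API (exists, mkdirs, setPermission); the class and method names are illustrative and this is not the committed patch.
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public final class StagingDirSketch {
  // JobSubmissionFiles uses rwx------ for the staging dir; 0700 is assumed here.
  private static final FsPermission JOB_DIR_PERMISSION =
      FsPermission.createImmutable((short) 0700);

  static Path ensureStagingDir(FileSystem fs, Path stagingArea) throws IOException {
    if (!fs.exists(stagingArea)) {
      // mkdirs applies the process umask to the requested permission...
      fs.mkdirs(stagingArea, new FsPermission(JOB_DIR_PERMISSION));
      // ...so set the permission again to end up with exactly JOB_DIR_PERMISSION.
      fs.setPermission(stagingArea, new FsPermission(JOB_DIR_PERMISSION));
    }
    return stagingArea;
  }
}
{code}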
[jira] [Created] (MAPREDUCE-7431) ShuffleHandler is not working correctly in SSL mode after the Netty 4 upgrade
Tamas Domok created MAPREDUCE-7431: -- Summary: ShuffleHandler is not working correctly in SSL mode after the Netty 4 upgrade Key: MAPREDUCE-7431 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7431 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 3.4.0 Reporter: Tamas Domok Attachments: sendMapPipeline.png HADOOP-15327 introduced some regressions in the ShuffleHandler. h3. 1. a memory leak {code:java} ERROR io.netty.util.ResourceLeakDetector: LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information. {code} The Shuffle's channelRead didn't release the message properly; the fix would be this: {code:java} try { // } finally { ReferenceCountUtil.release(msg); } {code} Or even simpler: {code:java} extends SimpleChannelInboundHandler {code} h3. 2. a bug in SSL mode with more than one reducer It manifested in multiple errors: {code:java} ERROR org.apache.hadoop.mapred.ShuffleHandler: Future is unsuccessful. Cause: java.io.IOException: Broken pipe ERROR org.apache.hadoop.mapred.ShuffleHandler: Future is unsuccessful. Cause: java.nio.channels.ClosedChannelException // if the reducer memory was not enough, then even this: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:136) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168) Caused by: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:123) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:91) {code} *Configuration* - mapred-site.xml {code:java} mapreduce.shuffle.ssl.enabled=true {code} An alternative is to build a custom jar where *FadvisedFileRegion* is replaced with *FadvisedChunkedFile* in {*}sendMapOutput{*}. *Reproduction* {code:java} hdfs dfs -rm -r -skipTrash /tmp/sort_input hdfs dfs -rm -r -skipTrash /tmp/sort_output yarn jar hadoop-3.4.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.0-SNAPSHOT.jar randomwriter "-Dmapreduce.randomwriter.totalbytes=100" /tmp/sort_input yarn jar hadoop-3.4.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.0-SNAPSHOT.jar sort -Dmapreduce.job.reduce.slowstart.completedmaps=1 -r 40 /tmp/sort_input /tmp/sort_output | tee sort_app_output.txt {code} h3.
ShuffleHandler's protocol {code:java} // HTTP Request GET /mapOutput?job=job_1672901779104_0001=0=attempt_1672901779104_0001_m_03_0,attempt_1672901779104_0001_m_02_0,attempt_1672901779104_0001_m_01_0,attempt_1672901779104_0001_m_00_0,attempt_1672901779104_0001_m_05_0,attempt_1672901779104_0001_m_12_0,attempt_1672901779104_0001_m_09_0,attempt_1672901779104_0001_m_10_0,attempt_1672901779104_0001_m_07_0,attempt_1672901779104_0001_m_11_0,attempt_1672901779104_0001_m_08_0,attempt_1672901779104_0001_m_13_0,attempt_1672901779104_0001_m_14_0,attempt_1672901779104_0001_m_15_0,attempt_1672901779104_0001_m_19_0,attempt_1672901779104_0001_m_18_0,attempt_1672901779104_0001_m_16_0,attempt_1672901779104_0001_m_17_0,attempt_1672901779104_0001_m_20_0,attempt_1672901779104_0001_m_23_0 HTTP/1.1 + keep alive headers // HTTP Response Headers content-length=sum(serialised ShuffleHeader in bytes + MapOutput size) + keep alive headers // Response Data (transfer-encoding=chunked) serialised ShuffleHeader content of the MapOutput file (start offset - length) serialised ShuffleHeader content of the MapOutput file (start offset - length) serialised ShuffleHeader content of the MapOutput file (start offset - length) serialised ShuffleHeader content of the MapOutput file (start offset - length) ... LastHttpContent // close socket if no keep-alive {code} h3. Issues - {*}setResponseHeaders{*}: did not always set the content-length, and the transfer-encoding=chunked header was missing. - {*}ReduceMapFileCount.operationComplete{*}: messed up t
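For the memory-leak item above, a minimal sketch of the release-in-finally pattern, assuming a plain Netty ChannelInboundHandlerAdapter; the handler and handleRequest names are placeholders, not the actual ShuffleHandler code. Extending SimpleChannelInboundHandler instead would release the message automatically after channelRead0.
{code:java}
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.util.ReferenceCountUtil;

// Placeholder handler illustrating the manual-release variant of the fix.
public class LeakFreeShuffleHandlerSketch extends ChannelInboundHandlerAdapter {
  @Override
  public void channelRead(ChannelHandlerContext ctx, Object msg) {
    try {
      handleRequest(ctx, msg); // placeholder for the real request handling
    } finally {
      // Release the reference-counted message so the ByteBuf is not leaked.
      ReferenceCountUtil.release(msg);
    }
  }

  private void handleRequest(ChannelHandlerContext ctx, Object msg) {
    // request handling elided
  }
}
{code}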
[jira] [Created] (MAPREDUCE-7430) FileSystemCount enumeration changes will cause mapreduce application failure during upgrade
Daniel Ma created MAPREDUCE-7430: Summary: FileSystemCount enumeration changes will cause mapreduce application failure during upgrade Key: MAPREDUCE-7430 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7430 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Daniel Ma -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7429) In IPV
Daniel Ma created MAPREDUCE-7429: Summary: In IPV Key: MAPREDUCE-7429 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7429 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Daniel Ma -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7428. --- Fix Version/s: 3.4.0 Resolution: Fixed > Fix failures related to Junit 4 to Junit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > --- > > Key: MAPREDUCE-7428 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > > A few tests are failing due to the JUnit 4 to JUnit 5 upgrade in > org.apache.hadoop.mapreduce.v2.app.webapp > [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7428) Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp
Ashutosh Gupta created MAPREDUCE-7428: - Summary: Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp Key: MAPREDUCE-7428 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7428 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 3.4.0 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta A few tests are failing due to the JUnit 4 to JUnit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1071/testReport/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7401) Optimize liststatus for better performance by using recursive listing
[ https://issues.apache.org/jira/browse/MAPREDUCE-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7401. --- Resolution: Won't Fix > Optimize liststatus for better performance by using recursive listing > - > > Key: MAPREDUCE-7401 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7401 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 3.3.3 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This change adds recursive listing APIs to FileSystem. The purpose is to > enable different FileSystem implementations to optimize the listStatus calls > if they can. A default implementation is provided for the base FileSystem, > which does level-by-level listing for each directory. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
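For reference, a rough sketch of what a level-by-level default could look like, using only the existing public FileSystem API; the class and method names are illustrative and this is not the code from the proposed patch.
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RecursiveListingSketch {
  // Collects all file statuses under 'root', descending one directory level at a time.
  public static List<FileStatus> listFilesRecursively(FileSystem fs, Path root) throws IOException {
    List<FileStatus> results = new ArrayList<>();
    for (FileStatus status : fs.listStatus(root)) {
      if (status.isDirectory()) {
        results.addAll(listFilesRecursively(fs, status.getPath()));
      } else {
        results.add(status);
      }
    }
    return results;
  }
}
{code}
An object-store FileSystem could override such a method with a single flat listing call, which is the optimization the issue was after.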
[jira] [Created] (MAPREDUCE-7427) Parent directory could be wrong while create done_intermediate directory
Zhang Dongsheng created MAPREDUCE-7427: -- Summary: Parent directory could be wrong while create done_intermediate directory Key: MAPREDUCE-7427 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7427 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhang Dongsheng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7390) Remove WhiteBox in mapreduce module.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved MAPREDUCE-7390. -- Fix Version/s: 3.4.0 Resolution: Fixed Committed to trunk. > Remove WhiteBox in mapreduce module. > > > Key: MAPREDUCE-7390 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7390 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: fanshilun >Assignee: fanshilun >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > WhiteBox is deprecated; try to remove its usage in hadoop-mapreduce. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7386) Maven parallel builds (skipping tests) fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7386. --- Fix Version/s: 3.4.0 Resolution: Fixed in trunk, backport once we are happy that it is stable > Maven parallel builds (skipping tests) fail > --- > > Key: MAPREDUCE-7386 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7386 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Affects Versions: 3.4.0, 3.3.5 > Environment: The problem occurred while using the Hadoop development > environment (Ubuntu) >Reporter: Steve Vaughan >Assignee: Steve Vaughan >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Running a parallel build fails during assembly with the following error when > running either package or install: > {code:java} > org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute > goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single > (package-mapreduce) on project hadoop-mapreduce: Failed to create assembly: > Artifact: org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.4.0-SNAPSHOT > (included by module) does not have an artifact with a file. Please ensure the > package phase is run before the assembly is generated. {code} > {code:java} > Caused by: org.apache.maven.plugin.MojoExecutionException: Failed to create > assembly: Artifact: > org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.4.0-SNAPSHOT (included > by module) does not have an artifact with a file. Please ensure the package > phase is run before the assembly is generated. {code} > The command executed was: > {code:java} > $ mvn -nsu clean install -Pdist,native -DskipTests -Dtar > -Dmaven.javadoc.skip=true -T 2C {code} > Adding dependencies to the assembly plugin configuration addresses the issue -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7425) Document Fix for yarn.app.mapreduce.client-am.ipc.max-retries
[ https://issues.apache.org/jira/browse/MAPREDUCE-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved MAPREDUCE-7425. -- Fix Version/s: 3.4.0 3.3.5 3.2.5 Assignee: teng wang Resolution: Fixed > Document Fix for yarn.app.mapreduce.client-am.ipc.max-retries > - > > Key: MAPREDUCE-7425 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7425 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.4 >Reporter: teng wang >Assignee: teng wang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5, 3.2.5 > > > The documentation of *yarn.app.mapreduce.client-am.ipc.max-retries* and > *yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts* is neither detailed > nor complete. *yarn.app.mapreduce.client-am.ipc.max-retries* is used to > *overwrite ipc.client.connect.max.retries* in ClientServiceDelegate.java. So > the suggested documentation fix is: (refer to yarn.client.failover-retries) > > {code:java} > // mapred-default.xml > > yarn.app.mapreduce.client-am.ipc.max-retries > 3 > The number of client retries to the AM - before reconnecting > - to the RM to fetch Application Status. > + to the RM to fetch Application Status. > + In other words, it is the ipc.client.connect.max.retries to be used > during > + reconnecting to the RM and fetching Application Status. > {code} > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7426) Fix typo in class StartEndTImesBase
[ https://issues.apache.org/jira/browse/MAPREDUCE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved MAPREDUCE-7426. -- Fix Version/s: 3.4.0 Assignee: Samrat Deb Resolution: Fixed Merged the PR into trunk. > Fix typo in class StartEndTImesBase > --- > > Key: MAPREDUCE-7426 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7426 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 3.3.4 >Reporter: Samrat Deb >Assignee: Samrat Deb >Priority: Trivial > Labels: newbie, pull-request-available > Fix For: 3.4.0 > > > While going through the code, found a typo in a variable name: > - +slowTaskRelativeTresholds+ is misspelled and can be fixed to > +slowTaskRelativeThresholds+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7411) Use secure XML parser utils in MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7411. --- Fix Version/s: 3.4.0 3.3.5 Resolution: Fixed merged back to branch-3.3.5 > Use secure XML parser utils in MapReduce > > > Key: MAPREDUCE-7411 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7411 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > Uptake of HADOOP-18469 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7425) Document Fix for yarn.app.mapreduce.client-am.ipc.max-retries
teng wang created MAPREDUCE-7425: Summary: Document Fix for yarn.app.mapreduce.client-am.ipc.max-retries Key: MAPREDUCE-7425 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7425 Project: Hadoop Map/Reduce Issue Type: Bug Components: yarn Affects Versions: 3.3.4 Reporter: teng wang The documentation of *yarn.app.mapreduce.client-am.ipc.max-retries* and *yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts* is neither detailed nor complete. *yarn.app.mapreduce.client-am.ipc.max-retries* is used to *overwrite ipc.client.connect.max.retries* in ClientServiceDelegate.java. So the suggested documentation fix is: (refer to yarn.client.failover-retries) {code:java} // mapred-default.xml yarn.app.mapreduce.client-am.ipc.max-retries 3 The number of client retries to the AM - before reconnecting - to the RM to fetch Application Status. + to the RM to fetch Application Status. + In other words, it is the ipc.client.connect.max.retries to be used during + reconnecting to the RM and fetching Application Status. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
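To illustrate the relationship the documentation change describes, a hedged sketch of the override effect, using only the two configuration keys named above; the helper class is hypothetical and is not the actual ClientServiceDelegate code.
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class AmIpcRetriesSketch {
  static Configuration withAmIpcRetries(Configuration base) {
    Configuration conf = new Configuration(base);
    // The MR-level setting effectively replaces the generic IPC retry count
    // for the client connections used to fetch application status
    // (the default of 3 is taken from mapred-default.xml).
    int amRetries = conf.getInt("yarn.app.mapreduce.client-am.ipc.max-retries", 3);
    conf.setInt("ipc.client.connect.max.retries", amRetries);
    return conf;
  }
}
{code}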
[jira] [Created] (MAPREDUCE-7424) Document Fix: the dependency between mapreduce.job.sharedcache.mode and yarn.sharedcache.enabled
teng wang created MAPREDUCE-7424: Summary: Document Fix: the dependency between mapreduce.job.sharedcache.mode and yarn.sharedcache.enabled Key: MAPREDUCE-7424 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7424 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 3.3.4 Reporter: teng wang Suggestions to fix the document (description of mapreduce.job.sharedcache.mode in mapred-default.xml): There is a dependency between mapreduce.job.sharedcache.mode and yarn.sharedcache.enabled in the source code. That is, mapreduce.job.sharedcache.mode only takes effect if the shared cache (yarn.sharedcache.enabled) is enabled. However, the document (mapred-default.xml) does not mention it, which could affect the use of this configuration. The dependency code: ``` /* /apache/hadoop/mapreduce/SharedCacheConfig.java */ public void init(Configuration conf) { if(!conf.getBoolean(YarnConfiguration.SHARED_CACHE_ENABLED, YarnConfiguration.DEFAULT_SHARED_CACHE_ENABLED)) { return; } Collection<String> configs = StringUtils.getTrimmedStringCollection( conf.get(MRJobConfig.SHARED_CACHE_MODE, MRJobConfig.SHARED_CACHE_MODE_DEFAULT)); if (configs.contains("files")) { this.sharedCacheFilesEnabled = true; } ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
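A hedged sketch of how a submission-side check for this dependency could look, built from the configuration keys quoted above; the helper class and the "disabled" default value are assumptions, not existing Hadoop code.
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class SharedCacheGuardSketch {
  // Returns true only when the shared cache is enabled AND the job opted in for files.
  static boolean sharedCacheFilesUsable(Configuration conf) {
    boolean cacheEnabled = conf.getBoolean("yarn.sharedcache.enabled", false);
    if (!cacheEnabled) {
      // Without this, mapreduce.job.sharedcache.mode is silently ignored.
      return false;
    }
    String mode = conf.get("mapreduce.job.sharedcache.mode", "disabled");
    return mode.contains("files");
  }
}
{code}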
[jira] [Created] (MAPREDUCE-7423) The check for maxtaskfailures.per.tracker is missing
teng wang created MAPREDUCE-7423: Summary: The check for maxtaskfailures.per.tracker is missing Key: MAPREDUCE-7423 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7423 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 3.3.4 Reporter: teng wang In the conf file mapred-default.xml, the description of *mapreduce.job.maxtaskfailures.per.tracker* is: The number of task-failures on a node manager of a given job after which new tasks of that job aren't assigned to it. It *MUST be less* than mapreduce.map.maxattempts and mapreduce.reduce.maxattempts otherwise the failed task will never be tried on a different node. However, there is no such check implemented in the source code. Violating the dependency will prevent failed tasks from being retried on different nodes. So, it is suggested to check the dependency between the two configuration parameters before using maxtaskfailures.per.tracker. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
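A minimal sketch of the validation the report asks for, assuming the three configuration keys named above and their usual defaults; the helper class is hypothetical, not existing Hadoop code.
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class MaxTaskFailuresCheckSketch {
  // Fail fast when maxtaskfailures.per.tracker is not below both maxattempts values.
  static void validate(Configuration conf) {
    int perTracker = conf.getInt("mapreduce.job.maxtaskfailures.per.tracker", 3);
    int mapAttempts = conf.getInt("mapreduce.map.maxattempts", 4);
    int reduceAttempts = conf.getInt("mapreduce.reduce.maxattempts", 4);
    if (perTracker >= mapAttempts || perTracker >= reduceAttempts) {
      throw new IllegalArgumentException(
          "mapreduce.job.maxtaskfailures.per.tracker (" + perTracker
              + ") must be less than mapreduce.map.maxattempts (" + mapAttempts
              + ") and mapreduce.reduce.maxattempts (" + reduceAttempts + ")");
    }
  }
}
{code}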
[jira] [Created] (MAPREDUCE-7422) Upgrade Junit 4 to 5 in hadoop-mapreduce-examples
Ashutosh Gupta created MAPREDUCE-7422: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-examples Key: MAPREDUCE-7422 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7422 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-examples -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7421) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-jobclient
Ashutosh Gupta created MAPREDUCE-7421: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-jobclient Key: MAPREDUCE-7421 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7421 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-jobclient -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7420) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-core
Ashutosh Gupta created MAPREDUCE-7420: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-core Key: MAPREDUCE-7420 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7420 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-core -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7418) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-app
Ashutosh Gupta created MAPREDUCE-7418: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-app Key: MAPREDUCE-7418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7418 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-app -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7419) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-common
Ashutosh Gupta created MAPREDUCE-7419: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-common Key: MAPREDUCE-7419 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7419 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-common -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7416) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-shuffle
Ashutosh Gupta created MAPREDUCE-7416: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-shuffle Key: MAPREDUCE-7416 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7416 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-shuffle -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7417) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-uploader
Ashutosh Gupta created MAPREDUCE-7417: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-uploader Key: MAPREDUCE-7417 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7417 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4, 3.3.3 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-uploader -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7415) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-nativetask
Ashutosh Gupta created MAPREDUCE-7415: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-nativetask Key: MAPREDUCE-7415 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7415 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-nativetask -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7414) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs
Ashutosh Gupta created MAPREDUCE-7414: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs Key: MAPREDUCE-7414 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7414 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4, 3.3.3 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7413) Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs-plugins
Ashutosh Gupta created MAPREDUCE-7413: - Summary: Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs-plugins Key: MAPREDUCE-7413 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7413 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: test Affects Versions: 3.3.4, 3.3.3 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs-plugins -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-7412) I oder headset but I didn't get.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Gupta resolved MAPREDUCE-7412. --- Resolution: Invalid > I oder headset but I didn't get. > - > > Key: MAPREDUCE-7412 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7412 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: performance >Affects Versions: MR-2454 >Reporter: Jovithavl v >Priority: Major > Labels: documentation > Fix For: MR-3902 > > Attachments: Screenshot_2022-10-07-03-50-10-098_com.phonepe.app.jpg > > Original Estimate: 2,147,483,647h 21,810,231,541,955m > Remaining Estimate: 2,147,483,647h 21,810,231,541,955m > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7412) I oder headset but I didn't get.
Jovithavl v created MAPREDUCE-7412: -- Summary: I oder headset but I didn't get. Key: MAPREDUCE-7412 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7412 Project: Hadoop Map/Reduce Issue Type: Bug Components: performance Affects Versions: MR-2454 Reporter: Jovithavl v Fix For: MR-3902 Attachments: Screenshot_2022-10-07-03-50-10-098_com.phonepe.app.jpg -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7411) Use secure XML parser utils
PJ Fanning created MAPREDUCE-7411: - Summary: Use secure XML parser utils Key: MAPREDUCE-7411 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7411 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: PJ Fanning Uptake of HADOOP-18469 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7410) Expose API to get task ids and individual task report given task Id from org.apache.hadoop.mapreduce.Job
Ujjawal Kumar created MAPREDUCE-7410: Summary: Expose API to get task ids and individual task report given task Id from org.apache.hadoop.mapreduce.Job Key: MAPREDUCE-7410 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7410 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver, yarn Reporter: Ujjawal Kumar Attachments: Screenshot 2022-10-06 at 4.46.48 PM.png Currently org.apache.hadoop.mapreduce.Job exposes the getTaskReports(TaskType) API to fetch task reports of either mappers or reducers. However, for MR jobs with a large number of tasks this causes OOM issues while fetching all task reports, as seen with the JHS (HistoryClientService.getTaskReports). HistoryClientService also exposes an API getTaskReport() where a TaskId can be provided within the GetTaskReportRequest. org.apache.hadoop.mapreduce.Job can expose 2 APIs so that individual task reports can be fetched after listing them from the client side: # Job.getTasks(TaskType) -> List<TaskId> - This would return the TaskId of all tasks with the given type to the client # Job.getTaskReport(TaskId) -> TaskReport - This would return the task report for a single task to the client. For the JHS, since JobHistoryParser.parse already parses the full history file by default and maintains the list of tasks within JobHistoryParser.JobInfo's tasksMap, this info should be easy to get. One additional thing that needs to be seen is whether this can be supported for requests which are redirected to MRClientService (within MRAppMaster) for running jobs. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
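A sketch of how a client might use the two proposed methods; Job.getTasks(TaskType) and Job.getTaskReport(TaskID) do not exist yet, so both calls below are hypothetical and shown only to make the proposal concrete.
{code:java}
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskID;
import org.apache.hadoop.mapreduce.TaskReport;
import org.apache.hadoop.mapreduce.TaskType;

public final class PerTaskReportSketch {
  // Fetch reports one task at a time instead of materialising all of them at once.
  static void printMapProgress(Job job) throws Exception {
    // Hypothetical API: returns only the task ids, which stays cheap even for huge jobs.
    for (TaskID taskId : job.getTasks(TaskType.MAP)) {
      // Hypothetical API: one report per call instead of one call for all reports.
      TaskReport report = job.getTaskReport(taskId);
      System.out.println(taskId + " progress=" + report.getProgress());
    }
  }
}
{code}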
[jira] [Resolved] (MAPREDUCE-7407) Avoid stopContainer() on dead node
[ https://issues.apache.org/jira/browse/MAPREDUCE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri resolved MAPREDUCE-7407. Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Avoid stopContainer() on dead node > -- > > Key: MAPREDUCE-7407 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7407 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 3.3.4 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > If a container failed to launch earlier due to terminated instances, it has > already been removed from the container hash map. Skipping the kill() for > CONTAINER_REMOTE_CLEANUP avoids wasting 15 min per container on > retries/timeouts. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org