[jira] [Created] (MAPREDUCE-5932) Provide an option to use a dedicated reduce-side shuffle log
Gera Shegalov created MAPREDUCE-5932:

Summary: Provide an option to use a dedicated reduce-side shuffle log
Key: MAPREDUCE-5932
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5932
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov

For reducers in large jobs our users cannot easily spot portions of the log associated with problems with their code. An example reducer with INFO-level logging generates ~3500 lines / ~700 KiB per second. 95% of the log is the client side of the shuffle, {{org.apache.hadoop.mapreduce.task.reduce.*}}:

{code}
$ wc syslog
3642 48192 691013 syslog
$ grep task.reduce syslog | wc
3424 46534 659038
$ grep task.reduce.ShuffleScheduler syslog | wc
1521 17745 251458
$ grep task.reduce.Fetcher syslog | wc
1045 15340 223683
$ grep task.reduce.InMemoryMapOutput syslog | wc
400 4800 72060
$ grep task.reduce.MergeManagerImpl syslog | wc
432 8200 106555
{code}

Byte percentage breakdown:

{code}
Shuffle total: 95%
ShuffleScheduler: 36%
Fetcher: 32%
InMemoryMapOutput: 10%
MergeManagerImpl: 15%
{code}

While this information is actually often useful for devops debugging shuffle performance issues, the job users are often lost. We propose to have a dedicated syslog.shuffle file.

--
This message was sent by Atlassian JIRA (v6.2#6252)
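The byte percentage breakdown in MAPREDUCE-5932 can be reproduced from the last column (bytes) of the wc output quoted in the issue. The following is only an illustrative sketch in Python; the dictionary and function names are made up for this example and are not part of Hadoop or of the issue.

```python
# Byte counts copied from the last column of the `wc` output in the issue.
TOTAL_SYSLOG_BYTES = 691013

# Hypothetical names; the keys mirror the grep patterns used in the issue.
SHUFFLE_BYTES = {
    "task.reduce (shuffle total)": 659038,
    "ShuffleScheduler": 251458,
    "Fetcher": 223683,
    "InMemoryMapOutput": 72060,
    "MergeManagerImpl": 106555,
}


def percent_of_syslog(byte_count: int, total: int = TOTAL_SYSLOG_BYTES) -> float:
    """Return byte_count as a percentage of the whole syslog."""
    return 100.0 * byte_count / total


if __name__ == "__main__":
    for name, count in SHUFFLE_BYTES.items():
        print(f"{name}: {percent_of_syslog(count):.1f}%")
```

Rounded to whole percentages, the values agree with the breakdown given in the issue.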
Re: [VOTE] Release Apache Hadoop 2.4.1
+1

verified checksum and signature on SRC TARBALL
verified CHANGES.txt files
run apache-rat:check on SRC
build SRC
installed pseudo cluster
run successfully a few MR sample jobs
verified HttpFS

Thanks Arun

On Mon, Jun 16, 2014 at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote:

Folks,

I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0
The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0

The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days.

thanks,
Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

--
Alejandro
[jira] [Created] (MAPREDUCE-5933) Enable MR AM to post history events to the timeline server
Zhijie Shen created MAPREDUCE-5933:

Summary: Enable MR AM to post history events to the timeline server
Key: MAPREDUCE-5933
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5933
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Reporter: Zhijie Shen
Assignee: Zhijie Shen

Currently, the MR AM collects the history events and writes them to HDFS for the JHS to source. With the timeline server, the MR AM can put these events there.
[jira] [Created] (MAPREDUCE-5934) Make JHS source the timeline server for job history information
Zhijie Shen created MAPREDUCE-5934:

Summary: Make JHS source the timeline server for job history information
Key: MAPREDUCE-5934
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5934
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: jobhistoryserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen

After MAPREDUCE-5933, JHS can source the timeline server to get the job history information.
Hadoop-Mapreduce-trunk - Build # 1805 - Still Failing
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1805/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 36181 lines...]
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.841 sec - in org.apache.hadoop.mapreduce.TestChild
Running org.apache.hadoop.mapreduce.filecache.TestURIFragments
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.065 sec - in org.apache.hadoop.mapreduce.filecache.TestURIFragments
Running org.apache.hadoop.mapreduce.TestMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.021 sec - in org.apache.hadoop.mapreduce.TestMapReduce
Running org.apache.hadoop.mapreduce.TestNewCombinerGrouping
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.617 sec - in org.apache.hadoop.mapreduce.TestNewCombinerGrouping

Results :

Tests run: 494, Failures: 0, Errors: 0, Skipped: 11

[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] hadoop-mapreduce-client ............ SUCCESS [2.605s]
[INFO] hadoop-mapreduce-client-core ....... SUCCESS [53.915s]
[INFO] hadoop-mapreduce-client-common ..... SUCCESS [26.598s]
[INFO] hadoop-mapreduce-client-shuffle .... SUCCESS [4.286s]
[INFO] hadoop-mapreduce-client-app ........ SUCCESS [7:04.854s]
[INFO] hadoop-mapreduce-client-hs ......... SUCCESS [4:42.751s]
[INFO] hadoop-mapreduce-client-jobclient .. FAILURE [1:38:48.125s]
[INFO] hadoop-mapreduce-client-hs-plugins . SKIPPED
[INFO] Apache Hadoop MapReduce Examples ... SKIPPED
[INFO] hadoop-mapreduce ................... SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 1:52:04.159s
[INFO] Finished at: Wed Jun 18 15:18:26 UTC 2014
[INFO] Final Memory: 29M/112M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on project hadoop-mapreduce-client-jobclient: There was a timeout or other error in the fork - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Sending artifact delta relative to Hadoop-Mapreduce-trunk #1765
Archived 2 artifacts
Archive block size is 32768
Received 0 blocks and 16948842 bytes
Compression is 0.0%
Took 10 sec
Updating MAPREDUCE-5924
Updating HADOOP-10590
Updating HADOOP-10660
Updating HADOOP-10557
Updating HDFS-6527
Updating HDFS-6545
Email was triggered for: Failure
Sending email for trigger: Failure

### FAILED TESTS (if any) ###
No tests ran.
[jira] [Resolved] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe resolved MAPREDUCE-5928.

Resolution: Duplicate

Deadlock allocating containers for mappers and reducers

Key: MAPREDUCE-5928
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
Project: Hadoop Map/Reduce
Issue Type: Bug
Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg

I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows:

{quote}
yarn.nodemanager.resource.memory-mb = 2200
yarn.scheduler.minimum-allocation-mb = 250
{quote}

On my client I did

{quote}
mapreduce.map.memory.mb = 512
mapreduce.reduce.memory.mb = 512
{quote}

Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur:
- All nodes had been filled to their maximum capacity with reducers.
- 1 Mapper was waiting for a container slot to start in.

I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container).

*Workaround*: I set this value from my job. The default value is 0.05 (= 5%)

{quote}
mapreduce.job.reduce.slowstart.completedmaps = 0.99f
{quote}
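The effect of the workaround in MAPREDUCE-5928 can be illustrated with a little arithmetic. The sketch below assumes that reducers become eligible for scheduling once ceil(slowstart * total maps) map tasks have completed; that formula is an assumption made for this illustration and is not stated in the message, and the function name is hypothetical.

```python
import math


def maps_needed_before_reducers(slowstart: float, total_maps: int) -> int:
    """Completed maps required before reducers may be scheduled.

    Assumption (not from the report): the threshold is
    ceil(slowstart * total_maps).
    """
    return math.ceil(slowstart * total_maps)


if __name__ == "__main__":
    total_maps = 27  # number of mappers in the reported job
    for slowstart in (0.05, 0.99):  # default value and the workaround value
        needed = maps_needed_before_reducers(slowstart, total_maps)
        print(f"slowstart={slowstart}: reducers may start after {needed} of {total_maps} maps")
```

Under that assumption, the default of 0.05 lets reducers start claiming containers after only 2 of the 27 maps, while 0.99 holds them back until all 27 maps are done, which is consistent with the reported deadlock disappearing.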
Re: [VOTE] Release Apache Hadoop 2.4.1
There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it? There is no patch there yet, it doesn't really help much other than letting older clients compile - even if we put the API back in, the URL returned is invalid.

+Vinod
Re: [VOTE] Release Apache Hadoop 2.4.1
+1. Built from source code, installed a single-node cluster, and ran a few sample jobs successfully.

Jian
Re: [VOTE] Release Apache Hadoop 2.4.1
I think we should fix this one; that would let older 2.2/2.3 clients avoid being updated unless absolutely required.

Thanks,
Mayank

On Wed, Jun 18, 2014 at 12:13 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote:

There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it? There is no patch there yet, it doesn't really help much other than letting older clients compile - even if we put the API back in, the URL returned is invalid.

--
Thanks and Regards,
Mayank
Cell: 408-718-9370
[jira] [Resolved] (MAPREDUCE-5777) Support utf-8 text with BOM (byte order marker)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla resolved MAPREDUCE-5777.

Resolution: Fixed
Fix Version/s: 1.3.0

Support utf-8 text with BOM (byte order marker)

Key: MAPREDUCE-5777
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5777
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 0.22.0, 1.2.1, 2.2.0
Reporter: bc Wong
Assignee: zhihai xu
Fix For: 1.3.0, 2.5.0
Attachments: MAPREDUCE-5777.000.patch, MAPREDUCE-5777.001.patch, MAPREDUCE-5777.002.patch, MAPREDUCE-5777.003.patch, MAPREDUCE-5777.004.patch, MAPREDUCE-5777.branch1.patch

UTF-8 text may have a BOM. TextInputFormat, KeyValueTextInputFormat and friends should recognize the BOM and not treat it as actual data.
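To make the behaviour requested in MAPREDUCE-5777 concrete: a UTF-8 BOM is the three bytes EF BB BF at the very start of a file, and a reader that ignores it would hand those bytes to the job as part of the first record. The following is a minimal Python sketch of the idea only; it is not the code from the attached patches, and the function name is hypothetical.

```python
import codecs


def strip_utf8_bom(first_record: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) from the first record, if present.

    Only the first record of a file can carry a BOM, so a reader would
    apply this once per file, not once per line.
    """
    if first_record.startswith(codecs.BOM_UTF8):
        return first_record[len(codecs.BOM_UTF8):]
    return first_record


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + b"key\tvalue"
    print(strip_utf8_bom(with_bom))      # BOM removed
    print(strip_utf8_bom(b"key\tvalue"))  # unchanged
```

Checking only the start of the first record keeps the cost negligible and leaves any EF BB BF sequence that appears later in the data untouched.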
Re: [VOTE] Release Apache Hadoop 2.4.1
I'd like to point to another MR compatibility issue marked for 2.4.1: MAPREDUCE-5831 (Old MR client is not compatible with new MR application), though it has been happening since 2.3. It would be good to figure out whether we include it now or later. It seems that we're going to be in a better position once we have versioning for MR components.

Other than that, +1 (non-binding) for rc0. I've downloaded the source code, built the executable from it, run through MR examples and DS jobs, checked the metrics in the timeline server, and passed the test cases mentioned in the change log.

- Zhijie

--
Zhijie Shen
Hortonworks Inc.
http://hortonworks.com/