[jira] Updated: (MAPREDUCE-1865) [Rumen] Rumen should also support jobhistory files generated using trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1865: --- Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed I just committed this! Thanks Amar! > [Rumen] Rumen should also support jobhistory files generated using trunk > > > Key: MAPREDUCE-1865 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1865 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1865-v1.2.patch, mapreduce-1865-v1.6.2.patch, > mapreduce-1865-v1.7.1.patch, mapreduce-1865-v1.7.patch > > > Rumen code in trunk parses and process only jobhistory files from pre-21 > hadoop mapreduce clusters. It should also support jobhistory files generated > using trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1865) [Rumen] Rumen should also support jobhistory files generated using trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1865: Status: Patch Available (was: Open) > [Rumen] Rumen should also support jobhistory files generated using trunk > > > Key: MAPREDUCE-1865 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1865 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1865-v1.2.patch, mapreduce-1865-v1.6.2.patch, > mapreduce-1865-v1.7.1.patch, mapreduce-1865-v1.7.patch > > > Rumen code in trunk parses and process only jobhistory files from pre-21 > hadoop mapreduce clusters. It should also support jobhistory files generated > using trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1865) [Rumen] Rumen should also support jobhistory files generated using trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1865: Status: Open (was: Patch Available) > [Rumen] Rumen should also support jobhistory files generated using trunk > > > Key: MAPREDUCE-1865 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1865 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1865-v1.2.patch, mapreduce-1865-v1.6.2.patch, > mapreduce-1865-v1.7.1.patch, mapreduce-1865-v1.7.patch > > > Rumen code in trunk parses and process only jobhistory files from pre-21 > hadoop mapreduce clusters. It should also support jobhistory files generated > using trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1865) [Rumen] Rumen should also support jobhistory files generated using trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889016#action_12889016 ] Ravi Gummadi commented on MAPREDUCE-1865: - Patch looks good. +1 > [Rumen] Rumen should also support jobhistory files generated using trunk > > > Key: MAPREDUCE-1865 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1865 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1865-v1.2.patch, mapreduce-1865-v1.6.2.patch, > mapreduce-1865-v1.7.1.patch, mapreduce-1865-v1.7.patch > > > Rumen code in trunk parses and process only jobhistory files from pre-21 > hadoop mapreduce clusters. It should also support jobhistory files generated > using trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1621) Streaming's TextOutputReader.getLastOutput throws NPE if it has never read any output
[ https://issues.apache.org/jira/browse/MAPREDUCE-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1621: Status: Patch Available (was: Open) > Streaming's TextOutputReader.getLastOutput throws NPE if it has never read > any output > - > > Key: MAPREDUCE-1621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.21.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Fix For: 0.22.0 > > Attachments: patch-1621.txt > > > If TextOutputReader.readKeyValue() has never successfully read a line, then > its bytes member will be left null. Thus when logging a task failure, > PipeMapRed.getContext() can trigger an NPE when it calls > outReader_.getLastOutput(). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1621) Streaming's TextOutputReader.getLastOutput throws NPE if it has never read any output
[ https://issues.apache.org/jira/browse/MAPREDUCE-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1621: Status: Open (was: Patch Available) > Streaming's TextOutputReader.getLastOutput throws NPE if it has never read > any output > - > > Key: MAPREDUCE-1621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.21.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Fix For: 0.22.0 > > Attachments: patch-1621.txt > > > If TextOutputReader.readKeyValue() has never successfully read a line, then > its bytes member will be left null. Thus when logging a task failure, > PipeMapRed.getContext() can trigger an NPE when it calls > outReader_.getLastOutput(). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
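For illustration, the failure mode described in MAPREDUCE-1621 can be avoided with a null guard along the following lines. This is a minimal sketch and not the attached patch: the class LastOutputHolder and the method recordLine() are hypothetical stand-ins, and only the idea of a bytes member that stays null until a line has been read comes from the issue description.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a reader that remembers the last line it read.
final class LastOutputHolder {
    // Stays null until a line has been read successfully.
    private byte[] bytes;

    // Hypothetical helper: remember the most recent line.
    void recordLine(String line) {
        bytes = line.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the last line read, or null if nothing has been read yet,
    // instead of dereferencing a null array.
    String getLastOutput() {
        if (bytes == null) {
            return null;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A caller that logs a task failure would then see null (or could print a placeholder) rather than triggering an NPE.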
[jira] Updated: (MAPREDUCE-1911) Fix errors in -info option in streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1911: --- Attachment: patch-1911-2.txt Patch incorporates review comments. > Fix errors in -info option in streaming > --- > > Key: MAPREDUCE-1911 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1911 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.22.0 > > Attachments: patch-1911-1.txt, patch-1911-2.txt, patch-1911.txt > > > Here are some of the findings by Karam while verifying -info option in > streaming: > # We need to add "Optional" for -mapper, -reducer,-combiner and -file options. > # For -inputformat and -outputformat options, we should put "Optional" in the > prefix for the sake on uniformity. > # We need to remove -cluster decription. > # -help option is not displayed in usage message. > # when displaying message for -info or -help options, we should not display > "Streaming Job Failed!"; also exit code should be 0 in case of -help/-info > option. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1911) Fix errors in -info option in streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1911: --- Status: Patch Available (was: Open) > Fix errors in -info option in streaming > --- > > Key: MAPREDUCE-1911 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1911 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.22.0 > > Attachments: patch-1911-1.txt, patch-1911-2.txt, patch-1911.txt > > > Here are some of the findings by Karam while verifying -info option in > streaming: > # We need to add "Optional" for -mapper, -reducer,-combiner and -file options. > # For -inputformat and -outputformat options, we should put "Optional" in the > prefix for the sake on uniformity. > # We need to remove -cluster decription. > # -help option is not displayed in usage message. > # when displaying message for -info or -help options, we should not display > "Streaming Job Failed!"; also exit code should be 0 in case of -help/-info > option. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1911) Fix errors in -info option in streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1911: --- Status: Open (was: Patch Available) > Fix errors in -info option in streaming > --- > > Key: MAPREDUCE-1911 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1911 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.22.0 > > Attachments: patch-1911-1.txt, patch-1911.txt > > > Here are some of the findings by Karam while verifying -info option in > streaming: > # We need to add "Optional" for -mapper, -reducer,-combiner and -file options. > # For -inputformat and -outputformat options, we should put "Optional" in the > prefix for the sake on uniformity. > # We need to remove -cluster decription. > # -help option is not displayed in usage message. > # when displaying message for -info or -help options, we should not display > "Streaming Job Failed!"; also exit code should be 0 in case of -help/-info > option. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
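The last finding in MAPREDUCE-1911 (exit code 0 and no failure message for -help/-info) can be sketched as follows. This assumes a simplified argument scan; the class name UsageExitCodeSketch, the usage text, and the error message are placeholders and are not taken from the streaming code or the attached patches.

```java
// Hypothetical sketch: treat -help and -info as successful invocations.
final class UsageExitCodeSketch {
    private UsageExitCodeSketch() {}

    // Returns the exit code for the given command-line arguments.
    static int run(String[] args) {
        for (String arg : args) {
            if ("-help".equals(arg) || "-info".equals(arg)) {
                // Asking for usage is not a failure: print it and return 0
                // without emitting any "job failed" message.
                printUsage();
                return 0;
            }
        }
        if (args.length == 0) {
            // Placeholder error text; a genuinely bad invocation is non-zero.
            System.err.println("No arguments given.");
            printUsage();
            return 1;
        }
        // Real job submission would happen here; omitted in this sketch.
        return 0;
    }

    private static void printUsage() {
        System.out.println("Usage: <placeholder usage text>");
    }
}
```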
[jira] Commented: (MAPREDUCE-1936) [gridmix3] Make Gridmix3 more customizable.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889005#action_12889005 ] Hong Tang commented on MAPREDUCE-1936: -- test-patch passes on my local machine: {noformat} [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 5 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. {noformat} > [gridmix3] Make Gridmix3 more customizable. > --- > > Key: MAPREDUCE-1936 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1936 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/gridmix >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1936-20100715.patch, mr-1936-yhadoop-20.1xx.patch > > > I'd like to make gridmix3 more customizable. Specifically, the proposed > customizations include: > - add (random) location information for each sleep map task. > - make the parameters used in stress submission load throttling configurable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1946) enhance FileInputFormat.setInputPaths()
enhance FileInputFormat.setInputPaths()
---
Key: MAPREDUCE-1946
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1946
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: job submission
Affects Versions: 0.20.2
Reporter: Ted Yu

FileInputFormat.setInputPaths(Job job, Path... inputPaths) can be enhanced in the following 3 ways:
1) When the input paths are known only at runtime, we need another form which accepts Collection<> as second parameter. E.g. Set inputPaths
2) Use StringBuilder instead of StringBuffer because StringBuilder doesn't incur synchronization cost
3) The biggest performance boost comes from calling the following constructor of StringBuilder:
   public StringBuilder(int capacity)
   capacity can be a 3rd parameter to setInputPaths()
   This would avoid excessive calls to Arrays.copyOf().

The following stack trace was observed when our code used FileInputFormat.addInputPath() many times when a lot of files are eligible for processing:
java.lang.Thread.State: RUNNABLE
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.mapred.FileInputFormat.addInputPath(FileInputFormat.java:330)
    at com.carrieriq.m2m.platform.mmp2.input.PackageInput.configureJobConf(PackageInput.java:336)

After incorporating all three optimizations, total time taken in customized setInputPaths(JobConf conf, Set inputPaths) was 2 seconds. The combined time calling FileInputFormat.addInputPath() was over 80 minutes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
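The three suggestions in MAPREDUCE-1946 can be sketched roughly as below. This is an illustration under stated assumptions, not Hadoop code and not the reporter's implementation: the class InputPathJoiner and its join() method are hypothetical, paths are plain strings rather than Hadoop Path objects, and no escaping of commas inside a path is attempted. The sketch computes the capacity from the collection instead of taking it as a parameter.

```java
import java.util.Collection;

// Hypothetical helper: build the comma-separated input path list in one pass.
final class InputPathJoiner {
    private InputPathJoiner() {}

    // Joins the given path strings with commas using a pre-sized StringBuilder,
    // so the backing array never has to grow while appending.
    static String join(Collection<String> paths) {
        // Exact capacity: all characters plus one comma between entries.
        int capacity = Math.max(0, paths.size() - 1);
        for (String path : paths) {
            capacity += path.length();
        }
        StringBuilder sb = new StringBuilder(capacity);
        boolean first = true;
        for (String path : paths) {
            if (!first) {
                sb.append(',');
            }
            sb.append(path);
            first = false;
        }
        return sb.toString();
    }
}
```

Accepting a Collection covers point 1, StringBuilder covers point 2, and the pre-computed capacity covers point 3. The joined string would be written to the job configuration once, rather than once per added path.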
[jira] Updated: (MAPREDUCE-1945) Support for using different Kerberos keys for Jobtracker and TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kan Zhang updated MAPREDUCE-1945: - Attachment: m6632-03.patch A patch for the mapred part of HADOOP-6632. It incorporates Deveraj's bug fix (see HADOOP-6632). Ran "ant test". TestRumenJobTraces failed, but it's unrelated. Also, manually verified the feature on a single node cluster. > Support for using different Kerberos keys for Jobtracker and TaskTrackers > - > > Key: MAPREDUCE-1945 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1945 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Kan Zhang >Assignee: Kan Zhang > Attachments: m6632-03.patch > > > This is the MapRed part of HADOOP-6632. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1945) Support for using different Kerberos keys for Jobtracker and TaskTrackers
Support for using different Kerberos keys for Jobtracker and TaskTrackers - Key: MAPREDUCE-1945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1945 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Kan Zhang Assignee: Kan Zhang This is the MapRed part of HADOOP-6632. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1833) [gridmix3] limit the maximum task duration in sleep job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang reassigned MAPREDUCE-1833: Assignee: Hong Tang > [gridmix3] limit the maximum task duration in sleep job. > > > Key: MAPREDUCE-1833 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1833 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/gridmix >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1833-yahoo-20.10.patch, mr-1833-yahoo-20.1xx.patch > > > In production job history logs, sometimes a task takes very long time to > finish. Replaying such trace in sleep-job mode in Gridmix3 would > unnecessarily prolong the benchmark execution time. It would be desirable to > allow users to limit the maximum task duration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
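The cap proposed in MAPREDUCE-1833 amounts to clamping each recorded task duration before replay. A minimal sketch, assuming durations in milliseconds; the class SleepDurationLimiter and its parameter names are hypothetical and do not come from the attached patches.

```java
// Hypothetical sketch: cap a task's recorded runtime before replaying it.
final class SleepDurationLimiter {
    private SleepDurationLimiter() {}

    // Returns the duration to sleep for: the recorded duration, but never
    // more than the user-supplied maximum.
    static long limit(long recordedMillis, long maxMillis) {
        if (maxMillis < 0) {
            throw new IllegalArgumentException("maxMillis must be >= 0");
        }
        return Math.min(recordedMillis, maxMillis);
    }
}
```

With a maximum of 10 minutes, a task recorded at 6 hours would be replayed as a 10-minute sleep, while shorter tasks are left unchanged.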
[jira] Updated: (MAPREDUCE-1935) HFTP needs to be updated to use delegation tokens (from HDFS-1007)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1935: -- Attachment: MAPREDUCE-1935-1.patch > HFTP needs to be updated to use delegation tokens (from HDFS-1007) > -- > > Key: MAPREDUCE-1935 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1935 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1935-1.patch, MAPREDUCE-1935.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888954#action_12888954 ] Scott Chen commented on MAPREDUCE-1848: --- Thanks, Dmytro and Dhruba! > Put number of speculative, data local, rack local tasks in JobTracker metrics > - > > Key: MAPREDUCE-1848 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1848 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobtracker >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1848-20100614.txt, > MAPREDUCE-1848-20100617.txt, MAPREDUCE-1848-20100623.txt > > > It will be nice that we can collect these information in JobTracker metrics -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated MAPREDUCE-1848: Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed I just committed this. Thanks Scott! > Put number of speculative, data local, rack local tasks in JobTracker metrics > - > > Key: MAPREDUCE-1848 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1848 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobtracker >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1848-20100614.txt, > MAPREDUCE-1848-20100617.txt, MAPREDUCE-1848-20100623.txt > > > It will be nice that we can collect these information in JobTracker metrics -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1936) [gridmix3] Make Gridmix3 more customizable.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1936: - Status: Patch Available (was: Open) > [gridmix3] Make Gridmix3 more customizable. > --- > > Key: MAPREDUCE-1936 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1936 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/gridmix >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1936-20100715.patch, mr-1936-yhadoop-20.1xx.patch > > > I'd like to make gridmix3 more customizable. Specifically, the proposed > customizations include: > - add (random) location information for each sleep map task. > - make the parameters used in stress submission load throttling configurable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1936) [gridmix3] Make Gridmix3 more customizable.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1936: - Attachment: mr-1936-20100715.patch > [gridmix3] Make Gridmix3 more customizable. > --- > > Key: MAPREDUCE-1936 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1936 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/gridmix >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1936-20100715.patch, mr-1936-yhadoop-20.1xx.patch > > > I'd like to make gridmix3 more customizable. Specifically, the proposed > customizations include: > - add (random) location information for each sleep map task. > - make the parameters used in stress submission load throttling configurable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1936) [gridmix3] Make Gridmix3 more customizable.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang reassigned MAPREDUCE-1936: Assignee: Hong Tang > [gridmix3] Make Gridmix3 more customizable. > --- > > Key: MAPREDUCE-1936 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1936 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/gridmix >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1936-20100715.patch, mr-1936-yhadoop-20.1xx.patch > > > I'd like to make gridmix3 more customizable. Specifically, the proposed > customizations include: > - add (random) location information for each sleep map task. > - make the parameters used in stress submission load throttling configurable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1566) Need to add a mechanism to import tokens and secrets into a submitted job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated MAPREDUCE-1566: Attachment: MR-1566.1.patch Patch for trunk. > Need to add a mechanism to import tokens and secrets into a submitted job. > -- > > Key: MAPREDUCE-1566 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1566 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: security >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 0.22.0 > > Attachments: mr-1566-1.1.patch, mr-1566-1.patch, MR-1566.1.patch > > > We need to include tokens and secrets into a submitted job. I propose adding > a configuration attribute that when pointed at a token storage file will > include the tokens and secrets from that token storage file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1912) [Rumen] Add a driver for Rumen tool
[ https://issues.apache.org/jira/browse/MAPREDUCE-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888770#action_12888770 ] Amar Kamat commented on MAPREDUCE-1912: --- Ravi, That's something I am working on; I haven't quite figured it out yet. The testing of this improvement is twofold: # Test if the jar is correct, i.e. w.r.t. the libraries # Test the switches in Driver and Rumen. #2 is doable. #1 is something I am currently working on. > [Rumen] Add a driver for Rumen tool > > > Key: MAPREDUCE-1912 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1912 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1912-v1.1.patch > > > Rumen, as a tool, has 2 entry points : > - Trace builder > - Folder > It would be nice to have a single driver program and have 'trace-builder' and > 'folder' as its options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1912) [Rumen] Add a driver for Rumen tool
[ https://issues.apache.org/jira/browse/MAPREDUCE-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888767#action_12888767 ] Ravi Gummadi commented on MAPREDUCE-1912: - Also, can we add a unit test ? > [Rumen] Add a driver for Rumen tool > > > Key: MAPREDUCE-1912 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1912 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amar Kamat >Assignee: Amar Kamat > Fix For: 0.22.0 > > Attachments: mapreduce-1912-v1.1.patch > > > Rumen, as a tool, has 2 entry points : > - Trace builder > - Folder > It would be nice to have a single driver program and have 'trace-builder' and > 'folder' as its options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
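The single-driver idea in MAPREDUCE-1912 can be sketched as a small name-to-entry-point dispatch. This is an assumption-laden illustration, not the attached patch: the class RumenDriverSketch and its register()/run() methods are hypothetical, and the entry points are modelled as functions from arguments to an exit code rather than the real Rumen classes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one driver that dispatches to named sub-commands.
final class RumenDriverSketch {
    // Sub-command name -> entry point taking the remaining arguments.
    private final Map<String, Function<String[], Integer>> commands =
            new LinkedHashMap<>();

    void register(String name, Function<String[], Integer> entryPoint) {
        commands.put(name, entryPoint);
    }

    // Runs the sub-command named by args[0]; returns its exit code,
    // or 1 if the command is missing or unknown.
    int run(String[] args) {
        if (args.length == 0 || !commands.containsKey(args[0])) {
            System.err.println("Valid commands: " + commands.keySet());
            return 1;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return commands.get(args[0]).apply(rest);
    }
}
```

Registering 'trace-builder' and 'folder' against such a driver also makes the second kind of test mentioned in the thread (checking the driver switches) straightforward, since run() can be called directly with an argument array.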
[jira] Commented: (MAPREDUCE-1911) Fix errors in -info option in streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888766#action_12888766 ]

Ravi Gummadi commented on MAPREDUCE-1911:
-

Some comments:
(1) In DumpTypedBytes.java, LoadTypedBytes.java and HadoopStreaming.java, printUsage() mentions "$HADOOP_HOME/hadoop-streaming.jar", which is not the correct path of the streaming jar. Maybe it is better to just say hadoop-streaming.jar instead of giving a path.
(2) In DumpTypedBytes.java, in "if (args.length == 0)", printing an error message like "too few arguments to dumptb" before the printUsage() call would be useful.
(3) Would it be better for printUsage() in HadoopStreaming.java to say "[streamjob]", to signify that the streamjob option is optional?
(4) Description of options/arguments is missing in printUsage() in LoadTypedBytes and DumpTypedBytes. You can give the same description provided in printUsage() of HadoopStreaming.java. For dumptb, should we say that the input can be either a text file or a sequence file?

Will let you know offline about some very minor nits.

> Fix errors in -info option in streaming
> ---
>
> Key: MAPREDUCE-1911
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1911
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1911-1.txt, patch-1911.txt
>
>
> Here are some of the findings by Karam while verifying -info option in streaming:
> # We need to add "Optional" for -mapper, -reducer, -combiner and -file options.
> # For -inputformat and -outputformat options, we should put "Optional" in the prefix for the sake of uniformity.
> # We need to remove -cluster description.
> # -help option is not displayed in usage message.
> # when displaying message for -info or -help options, we should not display "Streaming Job Failed!"; also exit code should be 0 in case of -help/-info option.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1944) Not able to rum worcount test example of Map/REduce
Not able to rum worcount test example of Map/REduce
---
Key: MAPREDUCE-1944
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1944
Project: Hadoop Map/Reduce
Issue Type: Test
Affects Versions: 0.20.2
Environment: Centos
Reporter: diptee dalal

Not able to run the WordCount test. I get these logs:

INFO mapred.FileInputFormat: Total input paths to process : 2
10/07/15 14:22:43 INFO mapred.JobClient: Running job: job_201007151211_0007
10/07/15 14:22:44 INFO mapred.JobClient: map 0% reduce 0%
10/07/15 14:22:53 INFO mapred.JobClient: map 66% reduce 0%
10/07/15 14:22:56 INFO mapred.JobClient: map 100% reduce 0%
10/07/15 14:22:58 INFO mapred.JobClient: Task Id : attempt_201007151211_0007_r_00_0, Status : FAILED
Error: java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)
10/07/15 14:23:04 INFO mapred.JobClient: Task Id : attempt_201007151211_0007_r_00_1, Status : FAILED
Error: java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)
10/07/15 14:23:10 INFO mapred.JobClient: Task Id : attempt_201007151211_0007_r_00_2, Status : FAILED
Error: java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)
10/07/15 14:23:19 INFO mapred.JobClient: Job complete: job_201007151211_0007
10/07/15 14:23:19 INFO mapred.JobClient: Counters: 13
10/07/15 14:23:19 INFO mapred.JobClient: Job Counters
10/07/15 14:23:19 INFO mapred.JobClient: Launched reduce tasks=4
10/07/15 14:23:19 INFO mapred.JobClient: Launched map tasks=3
10/07/15 14:23:19 INFO mapred.JobClient: Data-local map tasks=3
10/07/15 14:23:19 INFO mapred.JobClient: Failed reduce tasks=1
10/07/15 14:23:19 INFO mapred.JobClient: FileSystemCounters
10/07/15 14:23:19 INFO mapred.JobClient: HDFS_BYTES_READ=41
10/07/15 14:23:19 INFO mapred.JobClient: FILE_BYTES_WRITTEN=190
10/07/15 14:23:19 INFO mapred.JobClient: Map-Reduce Framework
10/07/15 14:23:19 INFO mapred.JobClient: Combine output records=7
10/07/15 14:23:19 INFO mapred.JobClient: Map input records=4
10/07/15 14:23:19 INFO mapred.JobClient: Spilled Records=7
10/07/15 14:23:19 INFO mapred.JobClient: Map output bytes=62
10/07/15 14:23:19 INFO mapred.JobClient: Map input bytes=38
10/07/15 14:23:19 INFO mapred.JobClient: Combine input records=7
10/07/15 14:23:19 INFO mapred.JobClient: Map output records=7
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.myorg.WordCount.main(WordCount.java:55)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.