[jira] Commented: (MAPREDUCE-1764) FairScheduler locality delay may put heavy pressure on Jobtracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867424#action_12867424 ]

Joydeep Sen Sarma commented on MAPREDUCE-1764:
----------------------------------------------

It seems better to find out why the index is not helping (assuming it's actually being used) rather than adding another cache on top.

> FairScheduler locality delay may put heavy pressure on Jobtracker
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1764
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1764
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Dmytro Molkov
>             Fix For: 0.22.0
>
>
> The FairScheduler locality delay feature holds off the scheduling of jobs until it gets good locality.
> This greatly improves the locality of the tasks and reduces the cost of traffic.
> We have observed the following problem with the FairScheduler locality delay:
> Some of our machines hold only older data, and some newly added machines do not hold any important data.
> When these machines send a heartbeat, the JT scans tasks to find jobs that have the right locality.
> Often, the scan for these machines covers all of the tasks of all the jobs and still yields no task.
> Scanning all the tasks on the JT is very costly. This makes the JT very slow.
> These machines also often do not get scheduled. This hurts cluster utilization.
> Any ideas?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1743) conf.get("map.input.file") returns null when using MultipleInputs in Hadoop 0.20
[ https://issues.apache.org/jira/browse/MAPREDUCE-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867397#action_12867397 ]

luo Yi commented on MAPREDUCE-1743:
-----------------------------------

The following code can recover the real file name from the TaggedInputSplit. Because TaggedInputSplit is a class internal to Hadoop, you should put your class in the org.apache.hadoop.mapred.lib package:

{code:title=TaggedInputSplitGetName.java|borderStyle=solid}
// Fragment from inside map(); word, one, output and reporter are presumably
// supplied by the enclosing mapper.
InputSplit is = reporter.getInputSplit();
String name = is.getClass().getName();
if (name.equals("org.apache.hadoop.mapred.FileSplit")) {
  // Plain input: the split itself carries the path.
  FileSplit fs = (FileSplit) is;
  String path = fs.getPath().toString();
  word.set(path);
  output.collect(word, one);
}
if (name.equals("org.apache.hadoop.mapred.lib.TaggedInputSplit")) {
  // MultipleInputs wraps the real split; unwrap it first.
  TaggedInputSplit tis = (TaggedInputSplit) is;
  InputSplit iis = tis.getInputSplit();
  String iname = iis.getClass().getName();
  word.set(iname);
  output.collect(word, one);
  if (iname.equals("org.apache.hadoop.mapred.FileSplit")) {
    FileSplit fs = (FileSplit) iis;
    // the path from the TaggedInputSplit should be prefixed by "convert: "
    String path = "convert: " + fs.getPath().toString();
    word.set(path);
    output.collect(word, one);
  }
}

The output file gives me:

{noformat}
$ grep 'convert' testout/part-0 |head -n 5
convert: hdfs://myowndir/pt=2010051300/attempt_201003291206_327196_r_00_01
convert: hdfs://myowndir/pt=2010051300/attempt_201003291206_327196_r_01_01
convert: hdfs://myowndir/pt=2010051300/attempt_201003291206_327196_r_02_01
convert: hdfs://myowndir/pt=2010051300/attempt_201003291206_327196_r_03_01
convert: hdfs://myowndir/pt=2010051300/attempt_201003291206_327196_r_04_01
{noformat}

You may give it a try.
{code}

> conf.get("map.input.file") returns null when using MultipleInputs in Hadoop 0.20
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1743
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1743
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: Yuanyuan Tian
>
> There is a problem in getting the input file name in the mapper when using MultipleInputs in Hadoop 0.20.
> I need to use MultipleInputs to support different formats for the inputs to my MapReduce job.
> Inside each mapper, I also need to know the exact input file that the mapper is processing.
> However, conf.get("map.input.file") returns null. Can anybody help me solve this problem? Thanks in advance.
>
> public class Test extends Configured implements Tool {
>   static class InnerMapper extends MapReduceBase implements Mapper
>   {
>     public void configure(JobConf conf)
>     {
>       String inputName = conf.get("map.input.file");
>       ...
>     }
>   }
>
>   public int run(String[] arg0) throws Exception {
>     JobConf job;
>     job = new JobConf(Test.class);
>     ...
>     MultipleInputs.addInputPath(job, new Path("A"), TextInputFormat.class);
>     MultipleInputs.addInputPath(job, new Path("B"), SequenceFileInputFormat.class);
>     ...
>   }
> }

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1354) Incremental enhancements to the JobTracker for better scalability
[ https://issues.apache.org/jira/browse/MAPREDUCE-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1354: --- Status: Patch Available (was: Open) > Incremental enhancements to the JobTracker for better scalability > - > > Key: MAPREDUCE-1354 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1354 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Reporter: Devaraj Das >Assignee: Dick King >Priority: Critical > Attachments: mapreduce-1354--2010-03-10.patch, > mapreduce-1354--2010-05-13.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > mr-1354-y20.patch > > > It'd be nice to have the JobTracker object not be locked while accessing the > HDFS for reading the jobconf file and while writing the jobinfo file in the > submitJob method. We should see if we can avoid taking the lock altogether. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1354) Incremental enhancements to the JobTracker for better scalability
[ https://issues.apache.org/jira/browse/MAPREDUCE-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1354: --- Status: Open (was: Patch Available) Patch looks fine. Canceling patch to submit for hudson, as trunk compiles now. > Incremental enhancements to the JobTracker for better scalability > - > > Key: MAPREDUCE-1354 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1354 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Reporter: Devaraj Das >Assignee: Dick King >Priority: Critical > Attachments: mapreduce-1354--2010-03-10.patch, > mapreduce-1354--2010-05-13.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, > mr-1354-y20.patch > > > It'd be nice to have the JobTracker object not be locked while accessing the > HDFS for reading the jobconf file and while writing the jobinfo file in the > submitJob method. We should see if we can avoid taking the lock altogether. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867390#action_12867390 ] Amareshwari Sriramadasu commented on MAPREDUCE-913: --- Test failure for TestNodeRefresh is not related to the patch. The test failed because JVM exited abnormally. The same test passed on my machine. > TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks > and hung TaskTracker > > > Key: MAPREDUCE-913 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-913 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.20.1 >Reporter: Vinod K V >Assignee: Amareshwari Sriramadasu >Priority: Blocker > Fix For: 0.21.0 > > Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, > MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913-1.txt, > patch-913-2.txt, patch-913.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1018) Document changes to the memory management and scheduling model
[ https://issues.apache.org/jira/browse/MAPREDUCE-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Yamijala reassigned MAPREDUCE-1018: --- Assignee: Hemanth Yamijala (was: rahul k singh) Assigning to myself to take forward. I've started work on it, but please bear with slow progress. > Document changes to the memory management and scheduling model > -- > > Key: MAPREDUCE-1018 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1018 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: documentation >Affects Versions: 0.21.0 >Reporter: Hemanth Yamijala >Assignee: Hemanth Yamijala >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPRED-1018-1.patch, MAPRED-1018-2.patch, > MAPRED-1018-3.patch, MAPRED-1018-4.patch.txt, MAPRED-1018-5.patch.txt, > MAPRED-1018-6.patch.txt, MAPRED-1018-commons.patch > > > There were changes done for the configuration, monitoring and scheduling of > high ram jobs. This must be documented in the mapred-defaults.xml and also on > forrest documentation -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1789) MapReduce trunk fails to compile following HADOOP-6600
[ https://issues.apache.org/jira/browse/MAPREDUCE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1789: - Status: Resolved (was: Patch Available) Resolution: Invalid This was fixed by MAPREDUCE-1539 > MapReduce trunk fails to compile following HADOOP-6600 > -- > > Key: MAPREDUCE-1789 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1789 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Reporter: Tom White >Assignee: Tom White >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1789.patch > > > A few classes need updating following the change to KerberosInfo introduced > in HADOOP-6600 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-1539) authorization checks for inter-server protocol (based on HADOOP-6600)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko resolved MAPREDUCE-1539. Hadoop Flags: [Reviewed] Resolution: Fixed I just committed this. Thank you Boris. > authorization checks for inter-server protocol (based on HADOOP-6600) > - > > Key: MAPREDUCE-1539 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1539 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1539-1.patch, MAPREDUCE-1539-2.patch, > MAPREDUCE-1539-3.patch, MAPREDUCE-1539-5.patch > > > authorization checks for inter-server protocol (based on HADOOP-6600) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1539) authorization checks for inter-server protocol (based on HADOOP-6600)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1539: -- Attachment: MAPREDUCE-1539-5.patch > authorization checks for inter-server protocol (based on HADOOP-6600) > - > > Key: MAPREDUCE-1539 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1539 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1539-1.patch, MAPREDUCE-1539-2.patch, > MAPREDUCE-1539-3.patch, MAPREDUCE-1539-5.patch > > > authorization checks for inter-server protocol (based on HADOOP-6600) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1442) StackOverflowError when JobHistory parses a really long line
[ https://issues.apache.org/jira/browse/MAPREDUCE-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867324#action_12867324 ] Dick King commented on MAPREDUCE-1442: -- I reviewed Luke's change, and it looks correct to me. I agree with Luke that {{trunk}} does not have this problem and does not need this patch or any revision of this patch. -dk > StackOverflowError when JobHistory parses a really long line > > > Key: MAPREDUCE-1442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 0.20.1 >Reporter: bc Wong >Assignee: Luke Lu > Attachments: mr-1442-y20s-v1.patch, overflow.history > > > JobHistory.parseLine() fails with StackOverflowError on a really big COUNTER > value, triggered via the web interface. See attached file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1785) Add streaming config option for not emitting the key
[ https://issues.apache.org/jira/browse/MAPREDUCE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated MAPREDUCE-1785: --- Attachment: mapreduce-1785-1.patch Patch attached. * Adds stream.map.input.ignoreKey for toggling key emission. The default behavior is unchanged. * Updated streaming.xml docs and added test coverage in TestStreamingKeyValue > Add streaming config option for not emitting the key > > > Key: MAPREDUCE-1785 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1785 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/streaming >Affects Versions: 0.22.0 >Reporter: Eli Collins >Assignee: Eli Collins >Priority: Minor > Fix For: 0.22.0 > > Attachments: mapreduce-1785-1.patch > > > PipeMapper currently does not emit the key when using TextInputFormat. If you > switch to input formats (eg LzoTextInputFormat) the key will be emitted. We > should add an option so users can explicitly make streaming not emit the key > so they can change input formats without breaking or having to modify their > existing programs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
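The effect of the option described in MAPREDUCE-1785 can be pictured with a small sketch. This is only a model of the choice between emitting and dropping the key, not the PipeMapper code from the patch; the class name StreamingRecordFormatter, the method format, and the tab separator are assumptions made for illustration.

```java
// Hypothetical model of the key-emission choice; it does not reproduce PipeMapper.
class StreamingRecordFormatter {
    // Builds the line handed to the streaming process: "key<TAB>value", or just
    // "value" when the key is to be ignored. The tab separator is an assumption.
    static String format(String key, String value, boolean ignoreKey) {
        return ignoreKey ? value : key + "\t" + value;
    }

    public static void main(String[] args) {
        // With the key ignored, an existing program sees the same input
        // regardless of which input format produced the record.
        System.out.println(format("42", "hello world", true));
        System.out.println(format("42", "hello world", false));
    }
}
```

With ignoreKey set, a program written against TextInputFormat would keep receiving value-only lines after a switch to another input format, which is the compatibility goal the issue describes.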
[jira] Updated: (MAPREDUCE-1789) MapReduce trunk fails to compile following HADOOP-6600
[ https://issues.apache.org/jira/browse/MAPREDUCE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-1789: - Status: Patch Available (was: Open) > MapReduce trunk fails to compile following HADOOP-6600 > -- > > Key: MAPREDUCE-1789 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1789 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Reporter: Tom White >Assignee: Tom White >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1789.patch > > > A few classes need updating following the change to KerberosInfo introduced > in HADOOP-6600 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1789) MapReduce trunk fails to compile following HADOOP-6600
[ https://issues.apache.org/jira/browse/MAPREDUCE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-1789: - Attachment: MAPREDUCE-1789.patch Patch fixing compilation errors. > MapReduce trunk fails to compile following HADOOP-6600 > -- > > Key: MAPREDUCE-1789 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1789 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Reporter: Tom White >Assignee: Tom White >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1789.patch > > > A few classes need updating following the change to KerberosInfo introduced > in HADOOP-6600 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1789) MapReduce trunk fails to compile following HADOOP-6600
MapReduce trunk fails to compile following HADOOP-6600 -- Key: MAPREDUCE-1789 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1789 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Tom White Assignee: Tom White Priority: Blocker Fix For: 0.21.0 A few classes need updating following the change to KerberosInfo introduced in HADOOP-6600 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1764) FairScheduler locality delay may put heavy pressure on Jobtracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867300#action_12867300 ]

Scott Chen commented on MAPREDUCE-1764:
---------------------------------------

Joydeep: Matei and I had some discussion and we have also looked at the code. In JobInProgress, a HashMap of node->[tasks] and rack->[tasks] already exists. It is not clear to me why this is so slow.

I agree with your point that we should not leave the slots idle, especially when the cluster is full of jobs.

> FairScheduler locality delay may put heavy pressure on Jobtracker
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1764
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1764
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Dmytro Molkov
>             Fix For: 0.22.0
>
>
> The FairScheduler locality delay feature holds off the scheduling of jobs until it gets good locality.
> This greatly improves the locality of the tasks and reduces the cost of traffic.
> We have observed the following problem with the FairScheduler locality delay:
> Some of our machines hold only older data, and some newly added machines do not hold any important data.
> When these machines send a heartbeat, the JT scans tasks to find jobs that have the right locality.
> Often, the scan for these machines covers all of the tasks of all the jobs and still yields no task.
> Scanning all the tasks on the JT is very costly. This makes the JT very slow.
> These machines also often do not get scheduled. This hurts cluster utilization.
> Any ideas?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
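For readers following the MAPREDUCE-1764 thread, here is a minimal sketch of what a node->[tasks] index buys over a full scan. It is a simplified model, not the JobInProgress code; the class LocalityIndex and its methods are hypothetical names, and the real structures may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a per-job locality index; not the real JobInProgress structures.
class LocalityIndex {
    private final Map<String, List<Integer>> tasksByNode = new HashMap<>();

    // Registers a task id under every node that holds a replica of its input.
    void addTask(int taskId, List<String> replicaNodes) {
        for (String node : replicaNodes) {
            tasksByNode.computeIfAbsent(node, k -> new ArrayList<>()).add(taskId);
        }
    }

    // One hash lookup; a node with no local data gets an empty list without any scan.
    List<Integer> localTasks(String node) {
        return tasksByNode.getOrDefault(node, Collections.emptyList());
    }

    public static void main(String[] args) {
        LocalityIndex index = new LocalityIndex();
        index.addTask(0, Arrays.asList("node-a", "node-b"));
        index.addTask(1, Arrays.asList("node-b"));
        System.out.println(index.localTasks("node-b"));   // tasks with input on node-b
        System.out.println(index.localTasks("new-node")); // empty: nothing local here
    }
}
```

If lookups behaved like this in practice, a heartbeat from a machine with no relevant data would return immediately, which is why the commenters find the observed slowness surprising.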
[jira] Commented: (MAPREDUCE-1761) FairScheduler should allow separate configuration of node and rack locality wait time
[ https://issues.apache.org/jira/browse/MAPREDUCE-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867299#action_12867299 ] Matei Zaharia commented on MAPREDUCE-1761: -- Looks good, thanks! > FairScheduler should allow separate configuration of node and rack locality > wait time > - > > Key: MAPREDUCE-1761 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1761 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1761-v1.1.txt, MAPREDUCE-1761-v1.2.txt, > MAPREDUCE-1761.txt > > > It would be nice that we can separately assign rack locality wait time. > In our use case, we would set node locality wait to zero and wait only rack > locality. > I propose that we add two parameters > mapred.fairscheduler.locality.delay.nodetorack > mapred.fairscheduler.locality.delay.racktoany > This allows specifying the wait time on each stage. > And we can use > mapred.fairscheduler.locality.delay > as the default value of the above fields so that this is backward compatible. > Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
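The fallback scheme proposed in MAPREDUCE-1761 can be sketched as follows. The three property names come from the issue text; everything else (the LocalityDelays class, the use of java.util.Properties instead of a Hadoop Configuration, and a final default of 0 when nothing is set) is an assumption for illustration and not the attached patch.

```java
import java.util.Properties;

// Hypothetical helper, not FairScheduler code: resolves the two proposed per-stage
// delays, falling back to the existing single setting when a stage is not configured.
class LocalityDelays {
    static final String DEFAULT_KEY = "mapred.fairscheduler.locality.delay";
    static final String NODE_TO_RACK_KEY = "mapred.fairscheduler.locality.delay.nodetorack";
    static final String RACK_TO_ANY_KEY = "mapred.fairscheduler.locality.delay.racktoany";

    // Reads a numeric value, or returns the fallback when the key is absent.
    static long getLong(Properties conf, String key, long fallback) {
        String raw = conf.getProperty(key);
        return raw == null ? fallback : Long.parseLong(raw.trim());
    }

    // Stage-specific value if present, else the shared default, else 0 (assumed).
    static long nodeToRack(Properties conf) {
        return getLong(conf, NODE_TO_RACK_KEY, getLong(conf, DEFAULT_KEY, 0L));
    }

    static long rackToAny(Properties conf) {
        return getLong(conf, RACK_TO_ANY_KEY, getLong(conf, DEFAULT_KEY, 0L));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEFAULT_KEY, "5000");
        conf.setProperty(NODE_TO_RACK_KEY, "0"); // the use case in the issue: no node-level wait
        System.out.println(nodeToRack(conf));
        System.out.println(rackToAny(conf)); // inherited from the shared default key
    }
}
```

Because an unset stage key falls through to mapred.fairscheduler.locality.delay, existing configurations that only set the old key keep their behavior, which is the backward compatibility the proposal asks for.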
[jira] Updated: (MAPREDUCE-1354) Incremental enhancements to the JobTracker for better scalability
[ https://issues.apache.org/jira/browse/MAPREDUCE-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dick King updated MAPREDUCE-1354:
---------------------------------

    Attachment: mapreduce-1354--2010-05-13.patch

I honored the last two comments by Amareshwari [and ignored the one he invited me to ignore] and this is the patch, but as I write this, {{trunk}} does not compile, so I'm not resubmitting this patch just yet.

Rather than taking the Big Lock, I chose to turn {{nextJobId}} into an {{AtomicInteger}}.

I agree that the {{ugi == null}} test is dead.

When {{trunk}} builds again I'll test this patch and submit it.

-dk

> Incremental enhancements to the JobTracker for better scalability
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1354
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1354
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Devaraj Das
>            Assignee: Dick King
>            Priority: Critical
>         Attachments: mapreduce-1354--2010-03-10.patch, mapreduce-1354--2010-05-13.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, MAPREDUCE-1354_yhadoop20.patch, mr-1354-y20.patch
>
>
> It'd be nice to have the JobTracker object not be locked while accessing the HDFS for reading the jobconf file and while writing the jobinfo file in the submitJob method. We should see if we can avoid taking the lock altogether.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
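The change described in the MAPREDUCE-1354 comment, replacing a lock around the job counter with an AtomicInteger, can be illustrated in isolation. This is a minimal sketch and not the patch itself: the class JobIdCounter, the method nextJobNumber, and the starting value of 1 are hypothetical; only the idea of using java.util.concurrent.atomic.AtomicInteger for nextJobId comes from the comment.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the nextJobId field discussed in the comment; not JobTracker code.
class JobIdCounter {
    // Starts at 1 so the first job number handed out is 1 (an assumption of this sketch).
    private final AtomicInteger nextJobId = new AtomicInteger(1);

    // Returns a unique job number; getAndIncrement is atomic, so no lock is needed.
    int nextJobNumber() {
        return nextJobId.getAndIncrement();
    }

    public static void main(String[] args) throws InterruptedException {
        JobIdCounter counter = new JobIdCounter();
        Thread[] submitters = new Thread[4];
        for (int i = 0; i < submitters.length; i++) {
            submitters[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counter.nextJobNumber();
                }
            });
            submitters[i].start();
        }
        for (Thread t : submitters) {
            t.join();
        }
        // 4 threads x 1000 calls consumed numbers 1..4000, so the next one is 4001.
        System.out.println(counter.nextJobNumber());
    }
}
```

The point of the design is that concurrent submitters never block each other just to obtain an id, so slower work such as HDFS access no longer has to happen while a global lock is held for that purpose.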
[jira] Commented: (MAPREDUCE-1539) authorization checks for inter-server protocol (based on HADOOP-6600)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867258#action_12867258 ] Boris Shkolnik commented on MAPREDUCE-1539: --- ran tests manually all passed. > authorization checks for inter-server protocol (based on HADOOP-6600) > - > > Key: MAPREDUCE-1539 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1539 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1539-1.patch, MAPREDUCE-1539-2.patch, > MAPREDUCE-1539-3.patch > > > authorization checks for inter-server protocol (based on HADOOP-6600) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1788) o.a.h.mapreduce.Job shouldn't make a copy of the JobConf
[ https://issues.apache.org/jira/browse/MAPREDUCE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867256#action_12867256 ] Aaron Kimball commented on MAPREDUCE-1788: -- Related: MAPREDUCE-1486 > o.a.h.mapreduce.Job shouldn't make a copy of the JobConf > > > Key: MAPREDUCE-1788 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1788 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 0.21.0 >Reporter: Arun C Murthy >Assignee: Arun C Murthy >Priority: Blocker > > Having o.a.h.mapreduce.Job make a copy of the passed in JobConf has several > issues: any modifications done by various pieces such as InputSplit etc. are > not reflected back and causes issues for frameworks built on top. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1788) o.a.h.mapreduce.Job shouldn't make a copy of the JobConf
o.a.h.mapreduce.Job shouldn't make a copy of the JobConf Key: MAPREDUCE-1788 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1788 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.21.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Blocker Having o.a.h.mapreduce.Job make a copy of the passed in JobConf has several issues: any modifications done by various pieces such as InputSplit etc. are not reflected back and causes issues for frameworks built on top. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1781) option "-D mapred.tasktracker.map.tasks.maximum=1" does not work when no of mappers is bigger than no of nodes - always spawns 2 mapers/node
[ https://issues.apache.org/jira/browse/MAPREDUCE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867205#action_12867205 ]

Hemanth Yamijala commented on MAPREDUCE-1781:
---------------------------------------------

bq. - is it possible to specify that I want 4 mappers/processors or am I limited to a static value at the startup of Hadoop?

The configuration per tasktracker can be different for each node, in general. However, that makes managing configurations much harder. Does that work for you now though?

bq. which parameters are set at startup and which at job runtime.

OK. Possibly you should file a JIRA asking for this to be explained. But the general rule of thumb is that configurations whose names contain the names of daemons like 'tasktracker' will be start-up only parameters. Configurations whose names contain 'job' or 'task' can be overridden per job.

> option "-D mapred.tasktracker.map.tasks.maximum=1" does not work when no of mappers is bigger than no of nodes - always spawns 2 mapers/node
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1781
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1781
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.2
>         Environment: Debian Lenny x64, and Hadoop 0.20.2, 2GB RAM
>            Reporter: Tudor Vlad
>
> Hello
> I am a new user of Hadoop and I have some trouble using Hadoop Streaming and the "-D mapred.tasktracker.map.tasks.maximum" option.
> I'm experimenting with an unmanaged application (C++) which I want to run over several nodes in 2 scenarios:
> 1) the number of maps (input splits) is equal to the number of nodes
> 2) the number of maps is a multiple of the number of nodes (5, 10, 20, ...)
> Initially, when running the tests in scenario 1, I would sometimes get 2 processes/node on half the nodes. However, I fixed this by adding the option "-D mapred.tasktracker.map.tasks.maximum=1", so everything works fine.
> In the case of scenario 2 (more maps than nodes) this directive no longer works, and I always get 2 processes/node. I even tested with maximum=5 and I still get 2 processes/node.
> The entire command I use is:
> /usr/bin/time --format="-duration:\t%e |\t-MFaults:\t%F |\t-ContxtSwitch:\t%w" \
> /opt/hadoop/bin/hadoop jar /opt/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
> -D mapred.tasktracker.map.tasks.maximum=1 \
> -D mapred.map.tasks=30 \
> -D mapred.reduce.tasks=0 \
> -D io.file.buffer.size=5242880 \
> -libjars "/opt/hadoop/contrib/streaming/hadoop-7debug.jar" \
> -input input/test \
> -output out1 \
> -mapper "/opt/jobdata/script_1k" \
> -inputformat "me.MyInputFormat"
> Why is this happening and how can I make it work properly (i.e. be able to limit exactly how many mappers I can have at 1 time per node)?
> Thank you in advance

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1442) StackOverflowError when JobHistory parses a really long line
[ https://issues.apache.org/jira/browse/MAPREDUCE-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867204#action_12867204 ]

Luke Lu commented on MAPREDUCE-1442:
------------------------------------

Amarsri, my test case for the old parser exploits details of the old file format and weaknesses of Java's backtracking NFA regex implementation. The new implementation in trunk uses the standard json format and a mature json parser (jackson) with about 700 tests. It would be counterproductive for me to add tests there, since they would have no material impact on the test coverage of the new parser.

> StackOverflowError when JobHistory parses a really long line
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-1442
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1442
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: bc Wong
>            Assignee: Luke Lu
>         Attachments: mr-1442-y20s-v1.patch, overflow.history
>
>
> JobHistory.parseLine() fails with StackOverflowError on a really big COUNTER value, triggered via the web interface. See attached file.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1787) Remove verbose logging from the Groups class
[ https://issues.apache.org/jira/browse/MAPREDUCE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867202#action_12867202 ]

Boris Shkolnik commented on MAPREDUCE-1787:
-------------------------------------------

Moved by mistake. Moving back to COMMON.

> Remove verbose logging from the Groups class
> --------------------------------------------
>
>                 Key: MAPREDUCE-1787
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1787
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Owen O'Malley
>            Assignee: Boris Shkolnik
>         Attachments: HADOOP-6598-BP20-Fix.patch, HADOOP-6598-BP20.patch, HADOOP-6598.patch
>
>
> {quote}
> 2010-02-25 08:30:52,269 INFO security.Groups (Groups.java:(60)) - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
> ...
> 2010-02-25 08:30:57,872 INFO security.Groups (Groups.java:getGroups(76)) - Returning cached groups for 'oom'
> {quote}
> should both be demoted to debug level.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Moved: (MAPREDUCE-1787) Remove verbose logging from the Groups class
[ https://issues.apache.org/jira/browse/MAPREDUCE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Shkolnik moved HADOOP-6598 to MAPREDUCE-1787:
--------------------------------------------------

    Project: Hadoop Map/Reduce  (was: Hadoop Common)
        Key: MAPREDUCE-1787  (was: HADOOP-6598)

> Remove verbose logging from the Groups class
> --------------------------------------------
>
>                 Key: MAPREDUCE-1787
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1787
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Owen O'Malley
>            Assignee: Boris Shkolnik
>         Attachments: HADOOP-6598-BP20-Fix.patch, HADOOP-6598-BP20.patch, HADOOP-6598.patch
>
>
> {quote}
> 2010-02-25 08:30:52,269 INFO security.Groups (Groups.java:(60)) - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
> ...
> 2010-02-25 08:30:57,872 INFO security.Groups (Groups.java:getGroups(76)) - Returning cached groups for 'oom'
> {quote}
> should both be demoted to debug level.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867133#action_12867133 ]

Hadoop QA commented on MAPREDUCE-913:
-------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12444388/patch-913-2.txt
  against trunk revision 943372.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/185/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/185/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/185/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/185/console

This message is automatically generated.

> TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-913
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-913
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.1
>            Reporter: Vinod K V
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913-1.txt, patch-913-2.txt, patch-913.txt
>
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1741) Automate the test scenario of job related files are moved from history directory to done directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iyappan Srinivasan updated MAPREDUCE-1741: -- Attachment: TestJobHistoryLocation.patch Patch updated: removed the test scenario that adds 100 files to the done directory and then checks whether the history files are still moved there, because it does not add value to the functionality. This was discussed with Sharad. > Automate the test scenario of job related files are moved from history > directory to done directory > --- > > Key: MAPREDUCE-1741 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1741 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 0.22.0 >Reporter: Iyappan Srinivasan > Fix For: 0.22.0 > > Attachments: TestJobHistoryLocation.patch, > TestJobHistoryLocation.patch, TestJobHistoryLocation.patch > > > Job related files are moved from history directory to done directory, when > 1) Job succeeds > 2) Job is killed > 3) When 100 files are put in the done directory > 4) When multiple jobs are completed at the same time, some successful, some > failed. > Also, two files, conf.xml and job files should be present in the done > directory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1677) Test scenario for a distributed cache file behaviour when the file is private
[ https://issues.apache.org/jira/browse/MAPREDUCE-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iyappan Srinivasan updated MAPREDUCE-1677: -- Attachment: TestDistributedCachePrivateFile.patch This addresses the scenario in which the user who submits the job is different from the user who started the jobtracker/tasktracker daemon. In that case the directory and file permissions will differ. > Test scenario for a distributed cache file behaviour when the file is private > -- > > Key: MAPREDUCE-1677 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1677 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 0.22.0 >Reporter: Iyappan Srinivasan >Assignee: Iyappan Srinivasan > Attachments: > TEST-org.apache.hadoop.mapred.TestDistributedCachePrivateFile.txt, > TestDistributedCachePrivateFile.patch, TestDistributedCachePrivateFile.patch, > TestDistributedCachePrivateFile.patch, TestDistributedCachePrivateFile.patch, > TestDistributedCachePrivateFile.patch > > > Verify the Distributed Cache functionality. > This test scenario is for a distributed cache file behaviour when the file > is private. Once a job uses a distributed > cache file with private permissions that file is stored in the > mapred.local.dir, under the directory which has the same name > as job submitter's username. The directory has 700 permission and the file > under it, should have 777 permissions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
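The permission check described in the issue could be sketched roughly as below. This is not the attached test: it uses plain java.nio.file on a local POSIX filesystem rather than the Hadoop test framework, the modes (700 for the directory, 777 for the file) are taken from the issue text as written, and the class name PrivateCachePermissionSketch, its methods, and the temporary paths are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper; assumes a POSIX filesystem.
class PrivateCachePermissionSketch {

    // True if the path has exactly the given symbolic mode, e.g. "rwx------".
    static boolean hasMode(Path p, String symbolic) throws IOException {
        Set<PosixFilePermission> expected = PosixFilePermissions.fromString(symbolic);
        return Files.getPosixFilePermissions(p).equals(expected);
    }

    // Mirrors the issue text: per-user directory is 700, cached file is 777.
    static boolean looksLikePrivateCacheLayout(Path userDir, Path cachedFile)
            throws IOException {
        return hasMode(userDir, "rwx------") && hasMode(cachedFile, "rwxrwxrwx");
    }

    // Builds a throwaway directory and file with those modes and checks them.
    static boolean selfCheck() {
        try {
            Path userDir = Files.createTempDirectory("submitter");
            Path file = Files.createFile(userDir.resolve("cache.txt"));
            Files.setPosixFilePermissions(userDir,
                    PosixFilePermissions.fromString("rwx------"));
            Files.setPosixFilePermissions(file,
                    PosixFilePermissions.fromString("rwxrwxrwx"));
            boolean ok = looksLikePrivateCacheLayout(userDir, file);
            Files.delete(file);
            Files.delete(userDir);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("layout matches: " + selfCheck());
    }
}
```

In a real system test the two paths would come from mapred.local.dir on the tasktracker host; here they are created locally only to make the sketch runnable.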
[jira] Updated: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-913: -- Status: Patch Available (was: Open) > TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks > and hung TaskTracker > > > Key: MAPREDUCE-913 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-913 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.20.1 >Reporter: Vinod K V >Assignee: Amareshwari Sriramadasu >Priority: Blocker > Fix For: 0.21.0 > > Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, > MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913-1.txt, > patch-913-2.txt, patch-913.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-913: -- Attachment: patch-913-2.txt Patch updated to trunk > TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks > and hung TaskTracker > > > Key: MAPREDUCE-913 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-913 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.20.1 >Reporter: Vinod K V >Assignee: Amareshwari Sriramadasu >Priority: Blocker > Fix For: 0.21.0 > > Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, > MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913-1.txt, > patch-913-2.txt, patch-913.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1713) Utilities for system tests specific.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1713: - Attachment: 1713-ydist-security.patch I now understand your point and have moved those two methods into JTClient. Uploaded the latest patch, which addresses all the comments. > Utilities for system tests specific. > > > Key: MAPREDUCE-1713 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1713 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1713-ydist-security.patch, 1713-ydist-security.patch, > 1713-ydist-security.patch, 1713-ydist-security.patch, > 1713-ydist-security.patch, systemtestutils_MR1713.patch, > utilsforsystemtest_1713.patch > > > 1. A method for restarting the daemon with new configuration. > public static void restartCluster(Hashtable props, String > confFile) throws Exception; > 2. A method for resetting the daemon with default configuration. > public void resetCluster() throws Exception; > 3. A method for waiting until daemon to stop. > public void waitForClusterToStop() throws Exception; > 4. A method for waiting until daemon to start. > public void waitForClusterToStart() throws Exception; > 5. A method for checking the job whether it has started or not. > public boolean isJobStarted(JobID id) throws IOException; > 6. A method for checking the task whether it has started or not. > public boolean isTaskStarted(TaskInfo taskInfo) throws IOException; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
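The waiting utilities listed in the issue (waitForClusterToStop, waitForClusterToStart) are typically built on a poll-until-timeout loop. The sketch below shows one possible shape for such a loop and is not taken from the attached patches; the class WaitSketch, the method waitFor, and its parameters are hypothetical, and the condition is a plain BooleanSupplier rather than a real daemon status call.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not part of the attached patches.
class WaitSketch {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true, false on timeout or interrupt.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "daemon has started": becomes true after about 100 ms.
        boolean started = waitFor(() -> System.currentTimeMillis() - start >= 100, 2000, 10);
        System.out.println("started: " + started);
    }
}
```

A waitForClusterToStart-style method could wrap a call like this around a daemon liveness check and throw if it returns false; how the patch actually does this is not shown in the message.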