[jira] [Commented] (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447568#comment-13447568 ]

Ming Jin commented on MAPREDUCE-577:

Hi everyone, I found the exact same issue in Hadoop v1.0.3 (http://fossies.org/dox/hadoop-1.0.3/StreamXmlRecordReader_8java_source.html). Is there any plan to fix it in v1.0.3?

Duplicate Mapper input when using StreamXmlRecordReader

Key: MAPREDUCE-577
URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Environment: HADOOP 0.17.0, Java 6.0
Reporter: David Campbell
Assignee: Ravi Gummadi
Fix For: 0.22.0
Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 0002-patch-for-HADOOP-3484.patch, 577.20S.patch, 577.patch, 577.v1.patch, 577.v2.patch, 577.v3.patch, 577.v4.patch, HADOOP-3484.combined.patch, HADOOP-3484.try3.patch

I have an XML file with 93626 rows. A row is marked by <row>...</row>. I've confirmed this with grep and the Grep example program included with HADOOP. Here is the Grep example output.

    93626 <row>

I've set up my job configuration as follows:

    conf.set("stream.recordreader.class", "org.apache.hadoop.streaming.StreamXmlRecordReader");
    conf.set("stream.recordreader.begin", "<row>");
    conf.set("stream.recordreader.end", "</row>");
    conf.setInputFormat(StreamInputFormat.class);

I have a fairly simple test Mapper. Here's the map method.

    public void map(Text key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        try {
            // Count every record once under the TOTAL key.
            output.collect(totalWord, one);
            // Count records whose key contains the string 01852.
            if (key != null && key.toString().indexOf("01852") != -1) {
                output.collect(new Text("01852"), one);
            }
        } catch (Exception ex) {
            Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE, null, ex);
            System.out.println(value);
        }
    }

For totalWord (TOTAL), I get:

    TOTAL 140850

and for 01852 I get:

    01852 86

There are 43 instances of 01852 in the file, so this count is exactly doubled.

I have the following setting in my config.

    conf.setNumMapTasks(1);

I have a total of six machines in my cluster. If I run without this setting, the result is 12x the actual value, not 2x. Here's some info from the cluster web page.

    Maps  Reduces  Total Submissions  Nodes  Map Task Capacity  Reduce Task Capacity  Avg. Tasks/Node
    0     0        1                  6      12                 12                    4.00

I've also noticed something really strange in the job's output. It looks like it's starting over or redoing things. This was run using all six nodes and no limitations on map or reduce tasks. I haven't seen this behavior in any other case.

    08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 1
    08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
    08/06/03 10:50:37 INFO mapred.JobClient: map 0% reduce 0%
    08/06/03 10:50:42 INFO mapred.JobClient: map 2% reduce 0%
    08/06/03 10:50:45 INFO mapred.JobClient: map 12% reduce 0%
    08/06/03 10:50:47 INFO mapred.JobClient: map 31% reduce 0%
    08/06/03 10:50:48 INFO mapred.JobClient: map 49% reduce 0%
    08/06/03 10:50:49 INFO mapred.JobClient: map 68% reduce 0%
    08/06/03 10:50:50 INFO mapred.JobClient: map 100% reduce 0%
    08/06/03 10:50:54 INFO mapred.JobClient: map 87% reduce 0%
    08/06/03 10:50:55 INFO mapred.JobClient: map 100% reduce 0%
    08/06/03 10:50:56 INFO mapred.JobClient: map 0% reduce 0%
    08/06/03 10:51:00 INFO mapred.JobClient: map 0% reduce 1%
    08/06/03 10:51:05 INFO mapred.JobClient: map 28% reduce 2%
    08/06/03 10:51:07 INFO mapred.JobClient: map 80% reduce 4%
    08/06/03 10:51:08 INFO mapred.JobClient: map 100% reduce 4%
    08/06/03 10:51:09 INFO mapred.JobClient: map 100% reduce 7%
    08/06/03 10:51:10 INFO mapred.JobClient: map 90% reduce 9%
    08/06/03 10:51:11 INFO mapred.JobClient: map 100% reduce 9%
    08/06/03 10:51:12 INFO mapred.JobClient: map 100% reduce 11%
    08/06/03 10:51:13 INFO mapred.JobClient: map 90% reduce 11%
    08/06/03 10:51:14 INFO mapred.JobClient: map 97% reduce 11%
    08/06/03 10:51:15 INFO mapred.JobClient: map 63% reduce 11%
    08/06/03 10:51:16 INFO mapred.JobClient: map 48% reduce 11%
    08/06/03 10:51:17 INFO mapred.JobClient: map 21% reduce 11%
    08/06/03 10:51:19 INFO mapred.JobClient: map 0% reduce 11%
    08/06/03 10:51:20 INFO mapred.JobClient: map 15% reduce 12%
    08/06/03 10:51:21 INFO mapred.JobClient: map 27% reduce 13%
    08/06/03 10:51:22 INFO mapred.JobClient: map 67% reduce 13%
    08/06/03 10:51:24 INFO mapred.JobClient: map 22% reduce 16%
    08/06/03 10:51:25 INFO mapred.JobClient: map 46% reduce 16%
    08/06/03 10:51:26 INFO mapred.JobClient: map 70% reduce 16%
    08/06/03
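The report above describes each record reaching the mapper more than once. As a rough illustration of how a delimiter-based reader can avoid that, here is a minimal sketch of one common ownership rule: a split emits a record only when the record's begin marker starts inside that split. This is not the StreamXmlRecordReader source and not necessarily the fix attached to this issue; the class and method names are hypothetical, and the input is an in-memory string rather than an HDFS file.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical names, not Hadoop code. */
final class SplitBoundaryDemo {

    /** Returns the records whose begin marker starts inside [start, end). */
    static List<String> readSplit(String data, int start, int end,
                                  String begin, String endTag) {
        List<String> records = new ArrayList<>();
        int pos = data.indexOf(begin, start);
        // Stop as soon as the next begin marker belongs to a later split.
        while (pos != -1 && pos < end) {
            int close = data.indexOf(endTag, pos + begin.length());
            if (close == -1) {
                break; // unterminated record: nothing more to emit
            }
            // A record may run past 'end'; it is still owned by this split.
            int recordEnd = close + endTag.length();
            records.add(data.substring(pos, recordEnd));
            pos = data.indexOf(begin, recordEnd);
        }
        return records;
    }

    /** Cuts the data into fixed-size splits and counts records over all of them. */
    static int countAcrossSplits(String data, int splitSize) {
        int total = 0;
        for (int start = 0; start < data.length(); start += splitSize) {
            int end = Math.min(start + splitSize, data.length());
            total += readSplit(data, start, end, "<row>", "</row>").size();
        }
        return total;
    }

    public static void main(String[] args) {
        String data = "<row>a</row><row>bb</row><row>ccc</row>";
        // The count stays the same whatever the split size is.
        System.out.println(countAcrossSplits(data, 7));
        System.out.println(countAcrossSplits(data, data.length()));
    }
}
```

With this rule the total does not depend on how many splits the file is cut into, which is the property the reporter expected when comparing one map task against twelve.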
[jira] [Created] (MAPREDUCE-4631) Duplicate Mapper input when using StreamXmlRecordReader
Ming Jin created MAPREDUCE-4631:

Summary: Duplicate Mapper input when using StreamXmlRecordReader
Key: MAPREDUCE-4631
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4631
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Affects Versions: 1.0.3
Reporter: Ming Jin

This is the same defect as https://issues.apache.org/jira/browse/MAPREDUCE-577, which was fixed in v0.22.0. So I'm wondering whether there is a plan to fix it in v1.0.3 as well? Or shall I move to v2.0.x?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4631) Duplicate Mapper input when using StreamXmlRecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Jin updated MAPREDUCE-4631:

Environment: Hadoop v1.0.3, JDK 6

Duplicate Mapper input when using StreamXmlRecordReader

Key: MAPREDUCE-4631
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4631
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Affects Versions: 1.0.3
Environment: Hadoop v1.0.3, JDK 6
Reporter: Ming Jin

This is the same defect as https://issues.apache.org/jira/browse/MAPREDUCE-577, which was fixed in v0.22.0. So I'm wondering whether there is a plan to fix it in v1.0.3 as well? Or shall I move to v2.0.x?
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447581#comment-13447581 ]

Avner BenHanoch commented on MAPREDUCE-4049:

_Asokan,_

My design has no conflict with your design. Below is a comment *I wrote you 4 months ago*:

{quote}
_Your patch for the trunk is good enough for my needs. I can write my RDMA shuffle plugin based on either your patch or my patch. Hence, I am not planning to submit an additional patch for the trunk on top of your patch. (I will only submit a patch for 1.x)_
{quote}

(I have only now submitted a patch for the trunk, because of Arun's/Todd's comment on my 1.x patch)

The academic paper I pointed to as *Reference* should not be confused with my plugin (personally, I consider code in academic research a POC and not a product). The two relevant conclusions I take from this academic research are:

1) Hadoop can benefit from RDMA shuffle and shuffle plugin-ability
2) With fast shuffle, Hadoop can benefit from *additional* merge algorithms that are not practical with slow shuffle.

That's all! There is no request for Hadoop to keep its coupling of shuffle with merge. Again, I encourage your decoupling! When your patch is accepted into the trunk, I will adjust future versions of my plugin to follow your decoupling.

*My design should not disturb you in any way!* When reviewing my design from the ReduceTask.java point of view, *if you merely rename ShuffleConsumerPlugin -> ReduceFeederPlugin, then you can easily see that your decoupling design can peacefully come after my design.*

I believe the thing that disturbs you is that Hadoop currently uses 'shuffle', which invokes 'merge', while you want the opposite direction. However, this is outside the scope of my patch. Hence, you are welcome to build your patch on top of mine. It is not really different from building your patch on top of the current trunk.

I will be more than happy to assist you with anything you might need, and I'd appreciate it if you gave me your blessing for my commit :-)

plugin for generic shuffle service

Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch

Support generic shuffle service as a set of two plugins: ShuffleProvider & ShuffleConsumer.
This will satisfy the following needs:
# Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance.
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).

References:
# Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
# I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
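To make the idea of a pluggable shuffle consumer concrete, here is a minimal sketch of selecting an implementation by class name from configuration and instantiating it reflectively. Everything named here is an assumption for illustration: the interface `ShuffleConsumerSketch`, its single method, both implementations, and the property key are hypothetical and are not taken from the attached patches or design documents.

```java
import java.util.Properties;

/** Illustrative only: hypothetical names, not the MAPREDUCE-4049 patch. */
final class PluginLoaderDemo {
    // Hypothetical configuration key; the real key is defined by the patch.
    static final String KEY = "example.shuffle.consumer.class";

    /** Instantiates the configured consumer, defaulting to the HTTP sketch. */
    static ShuffleConsumerSketch load(Properties conf) {
        String className = conf.getProperty(KEY, HttpShuffleSketch.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(ShuffleConsumerSketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load shuffle plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(load(conf).transport());
        conf.setProperty(KEY, RdmaShuffleSketch.class.getName());
        System.out.println(load(conf).transport());
    }
}

/** Hypothetical plugin contract; the real plugin surface is larger. */
interface ShuffleConsumerSketch {
    String transport();
}

/** Stand-in for the built-in HTTP shuffle. */
final class HttpShuffleSketch implements ShuffleConsumerSketch {
    public String transport() {
        return "http";
    }
}

/** Stand-in for an alternative transport such as RDMA. */
final class RdmaShuffleSketch implements ShuffleConsumerSketch {
    public String transport() {
        return "rdma";
    }
}
```

The caller depends only on the interface, so swapping the transport is a configuration change rather than a code change, which is the property both designs discussed in this thread rely on.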
[jira] [Commented] (MAPREDUCE-207) Computing Input Splits on the MR Cluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447584#comment-13447584 ]

Johannes Zillmann commented on MAPREDUCE-207:

Currently in our hadoop applications we calculate the splits before we submit it to the client (then the client simply looks up the existing splits). We do that mainly to influence the reducer count based on the number of splits/map-tasks. In case hadoop does the splitting on the cluster (which makes sense), it would be nice to have a hook to influence configuration! Sometimes it also makes sense for us to decide on the map-reduce assembly after we know the splits (different join strategies for different data constellations).
Just dumping some ideas here...

Computing Input Splits on the MR Cluster

Key: MAPREDUCE-207
URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: applicationmaster, mrv2
Reporter: Philip Zeyliger
Assignee: Arun C Murthy
Attachments: MAPREDUCE-207.patch

Instead of computing the input splits as part of job submission, Hadoop could have a separate job task type that computes the input splits, therefore allowing that computation to happen on the cluster.
[jira] [Commented] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447668#comment-13447668 ]

Hudson commented on MAPREDUCE-2786:

Integrated in Hadoop-Hdfs-trunk #1155 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1155/])
MAPREDUCE-2786. Add compression option for TestDFSIO. Contributed by Plamen Jeliazkov. (Revision 1380310)

Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1380310
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/IOMapperBase.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java

TestDFSIO should also test compression reading/writing from command-line.

Key: MAPREDUCE-2786
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: benchmarks
Affects Versions: 2.0.0-alpha
Reporter: Plamen Jeliazkov
Assignee: Plamen Jeliazkov
Priority: Minor
Labels: newbie
Fix For: 2.2.0-alpha
Attachments: MAPREDUCE_2786.patch, MAPREDUCE_2786.patch, MAPREDUCE_2786.patch, MAPREDUCE-2786.patch
Original Estimate: 36h
Remaining Estimate: 36h

I thought it might be beneficial to simply alter the code of TestDFSIO to accept any compression codec class and allow testing for compression by a command line argument instead of having to change the config file every time. Something like -compression would do.
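The request above is for a `-compression` command-line argument that names a codec class. A minimal sketch of how such a flag could be read from an argument array follows. It is an assumption about the shape of the option, not the code committed in revision 1380310; the class and method names are hypothetical, and the codec name is only handled as a string.

```java
/** Illustrative only: hypothetical names, not the TestDFSIO source. */
final class CompressionArgDemo {

    /**
     * Returns the codec class name that follows "-compression",
     * or null when the flag is absent (meaning: no compression).
     */
    static String parseCompression(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-compression")) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException(
                            "-compression requires a codec class name");
                }
                return args[i + 1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] example = {
            "-write", "-compression", "org.apache.hadoop.io.compress.GzipCodec"
        };
        System.out.println(parseCompression(example));
    }
}
```

Taking the codec as a class name keeps the benchmark open to any codec on the classpath, which matches the "accept any compression codec class" wording in the request.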
[jira] [Commented] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447699#comment-13447699 ]

Hudson commented on MAPREDUCE-2786:

Integrated in Hadoop-Mapreduce-trunk #1186 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1186/])
MAPREDUCE-2786. Add compression option for TestDFSIO. Contributed by Plamen Jeliazkov. (Revision 1380310)

Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1380310
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/IOMapperBase.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java

TestDFSIO should also test compression reading/writing from command-line.

Key: MAPREDUCE-2786
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: benchmarks
Affects Versions: 2.0.0-alpha
Reporter: Plamen Jeliazkov
Assignee: Plamen Jeliazkov
Priority: Minor
Labels: newbie
Fix For: 2.2.0-alpha
Attachments: MAPREDUCE_2786.patch, MAPREDUCE_2786.patch, MAPREDUCE_2786.patch, MAPREDUCE-2786.patch
Original Estimate: 36h
Remaining Estimate: 36h

I thought it might be beneficial to simply alter the code of TestDFSIO to accept any compression codec class and allow testing for compression by a command line argument instead of having to change the config file every time. Something like -compression would do.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447718#comment-13447718 ]

Mariappan Asokan commented on MAPREDUCE-4049:

Hi Avner,
Thanks for the clarification. I am also back-porting MAPREDUCE-2454 to Hadoop 1.1 (please see MAPREDUCE-4482). There, too, I am trying to decouple merge-related code from the {{ReduceCopier}} class in {{ReduceTask.java}}. I started my work initially on the trunk version because Arun asked me to do so. Once I finish refactoring {{ReduceCopier}} I will post a patch in MAPREDUCE-4482. Please take a look at it when I am done. Thanks for your offer to help me.

-- Asokan

plugin for generic shuffle service

Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch

Support generic shuffle service as a set of two plugins: ShuffleProvider & ShuffleConsumer.
This will satisfy the following needs:
# Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance.
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).

References:
# Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
# I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
[jira] [Updated] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated MAPREDUCE-1700:

Attachment: MAPREDUCE-1700.patch

New patch with a unit test. The test isn't integrated into the build yet, so you have to build the class-isolation-example module manually first. I've also removed the fictitious libs and instead used Guava as an example of an incompatibility.

User supplied dependencies may conflict with MapReduce system JARs

Key: MAPREDUCE-1700
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Reporter: Tom White
Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch

If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
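The description suggests a specialized classloader in the Child JVM, similar to what application servers do for WARs. One well-known form of that idea is a child-first (parent-last) loader, sketched below. This is a generic illustration and an assumption about the approach, not the loader in the attached patch; the class name and the rule of always delegating `java.*` names to the parent are choices made for this sketch.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Illustrative only: a generic child-first loader, not the MAPREDUCE-1700 patch. */
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            // Core JDK classes always come from the parent chain.
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name); // look in the user's JARs first
                } catch (ClassNotFoundException e) {
                    // not in the user's JARs: fall through to the parent
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // normal parent-first lookup
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Convenience wrapper that returns null instead of throwing. */
    static Class<?> loadQuietly(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ClassLoader.getSystemClassLoader());
        System.out.println(loadQuietly(loader, "java.lang.String") == String.class);
        System.out.println(loadQuietly(loader, "no.such.Clazz"));
    }
}
```

With user JAR URLs supplied, a class present in both the user's JARs and the system classpath would resolve to the user's copy, which is the isolation property the issue asks for. Classes shared across the boundary (for example framework interfaces) usually need to be excluded from child-first lookup, a detail this sketch leaves out.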
[jira] [Created] (MAPREDUCE-4632) Make sure MapReduce declares correct set of dependencies
Tom White created MAPREDUCE-4632:

Summary: Make sure MapReduce declares correct set of dependencies
Key: MAPREDUCE-4632
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4632
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: build
Affects Versions: 2.0.0-alpha
Reporter: Tom White

This is the equivalent of HADOOP-8278 for MapReduce.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447876#comment-13447876 ]

Avner BenHanoch commented on MAPREDUCE-4049:

Cool. I wish you good luck with your issues. I am watching them for staying in the picture and for any question you may have.
I understand you don't have obligation to my commit any more. Right?
(Please let me know if you want the rename: ShuffleConsumerPlugin -> ReduceFeederPlugin, or any other name)

plugin for generic shuffle service

Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch

Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer.
This will satisfy the following needs:
# Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance.
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).

References:
# Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
# I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
[jira] [Commented] (MAPREDUCE-4582) [MAPREDUCE-3902] ScheduledRequests#remove should remove the elements from mapsHostMapping and mapsRackMapping
[ https://issues.apache.org/jira/browse/MAPREDUCE-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447882#comment-13447882 ]

Siddharth Seth commented on MAPREDUCE-4582:

bq. removing entries from attemptToLaunchRequestMap needs to happen after the ScheduledRequests.remove call.

handleTaStopRequest removes the task attempt from attemptToLaunchRequestMap before it attempts to remove the attempt from ScheduledRequests. By the previous comment, I meant this order of removal needs to be fixed.

[MAPREDUCE-3902] ScheduledRequests#remove should remove the elements from mapsHostMapping and mapsRackMapping

Key: MAPREDUCE-4582
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4582
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: MR-3902
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Priority: Minor
Attachments: MAPREDUCE-4582.2.patch, MAPREDUCE-4582.3.patch, MAPREDUCE-4582.patch

ScheduledRequests#remove only removes the specified TaskAttemptId from maps. It's inefficient, and the method should also remove the elements from mapsHostMapping and mapsRackMapping.
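The issue asks that a removal from the primary map also clean the per-host and per-rack indexes. A minimal sketch of that bookkeeping follows, reduced to a single host index with string ids. The field names `maps` and `mapsHostMapping` echo the issue text, but the types, the class, and the method bodies are assumptions made for illustration and are not the MR-3902 branch code; a rack index would be handled the same way.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Illustrative only: simplified stand-in, not the MR-3902 branch code. */
final class ScheduledRequestsSketch {
    // Primary map: attempt id -> host the request was placed on (simplified).
    private final Map<String, String> maps = new HashMap<>();
    // Secondary index: host -> attempt ids waiting for that host.
    private final Map<String, List<String>> mapsHostMapping = new HashMap<>();

    void add(String attemptId, String host) {
        maps.put(attemptId, host);
        mapsHostMapping.computeIfAbsent(host, h -> new LinkedList<>()).add(attemptId);
    }

    /** Removes the attempt from the primary map and from the host index. */
    boolean remove(String attemptId) {
        String host = maps.remove(attemptId);
        if (host == null) {
            return false; // unknown attempt: nothing to clean up
        }
        List<String> onHost = mapsHostMapping.get(host);
        if (onHost != null) {
            onHost.remove(attemptId);
            if (onHost.isEmpty()) {
                mapsHostMapping.remove(host); // avoid leaving empty lists behind
            }
        }
        return true;
    }

    int pendingOnHost(String host) {
        List<String> onHost = mapsHostMapping.get(host);
        return onHost == null ? 0 : onHost.size();
    }

    public static void main(String[] args) {
        ScheduledRequestsSketch requests = new ScheduledRequestsSketch();
        requests.add("attempt_1", "host-a");
        requests.add("attempt_2", "host-a");
        requests.remove("attempt_1");
        System.out.println(requests.pendingOnHost("host-a"));
    }
}
```

Because `remove` needs the primary entry to find the host, any other structure keyed by the same attempt (such as the attemptToLaunchRequestMap mentioned in the comment) has to be cleaned after this call, which is the ordering concern raised above.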
[jira] [Commented] (MAPREDUCE-4629) In DEBUG_MODE, JobHistory#directoryTime() returns incorrect time
[ https://issues.apache.org/jira/browse/MAPREDUCE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447886#comment-13447886 ]

Karthik Kambatla commented on MAPREDUCE-4629:

From my conversation with Alejandro offline:

More context:
- The regular mode cleans up the history files after a month. The history filenames use /mm/dd.
- The debug mode cleans up the history files after 20 minutes. The history filenames (currently) use /hour/min.

The DEBUG_MODE overloads the regular mode. A better approach seems to be to append to the regular mode instead of overloading it.

Also, mapreduce.jobhistory.debug.mode, the config parameter that turns the DEBUG_MODE on/off, doesn't have a default in mapred-default.xml; one needs to be added.

In DEBUG_MODE, JobHistory#directoryTime() returns incorrect time

Key: MAPREDUCE-4629
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4629
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Attachments: MR-4629.patch

The helper methods in JobHistory - directoryTime() and timestampDirectoryComponent() - adjust the month field for readability (Jan is 0 as per Calendar, 1 for us).
In DEBUG_MODE, JobHistory uses hour instead of month. However, the adjustment is still applied.
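The bug description rests on a property of `java.util.Calendar`: `MONTH` is zero-based, while `HOUR_OF_DAY` is not, so a "+1 for readability" that is right for months is wrong for hours. The sketch below shows that behavior in isolation. It is not the JobHistory code or the attached patch; the helper name `component` is hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

/** Illustrative only: shows the Calendar behavior, not the JobHistory source. */
final class DirectoryTimeDemo {

    /** Formats one calendar field as two digits, adjusting only the month. */
    static String component(Calendar cal, int field) {
        int value = cal.get(field);
        if (field == Calendar.MONTH) {
            value += 1; // Calendar.JANUARY is 0; people expect 01
        }
        return String.format(Locale.ROOT, "%02d", value);
    }

    public static void main(String[] args) {
        // 5 January 2012, 09:30 in the default time zone.
        Calendar cal = new GregorianCalendar(2012, Calendar.JANUARY, 5, 9, 30);
        System.out.println(component(cal, Calendar.MONTH));       // 01
        System.out.println(component(cal, Calendar.HOUR_OF_DAY)); // 09
        // Applying the month adjustment to the hour would give 10 here,
        // which is the kind of off-by-one the issue describes.
    }
}
```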
[jira] [Updated] (MAPREDUCE-4629) In DEBUG_MODE, JobHistory#directoryTime() returns incorrect time
[ https://issues.apache.org/jira/browse/MAPREDUCE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4629:

Status: Open (was: Patch Available)

In DEBUG_MODE, JobHistory#directoryTime() returns incorrect time

Key: MAPREDUCE-4629
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4629
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Attachments: MR-4629.patch

The helper methods in JobHistory - directoryTime() and timestampDirectoryComponent() - adjust the month field for readability (Jan is 0 as per Calendar, 1 for us).
In DEBUG_MODE, JobHistory uses hour instead of month. However, the adjustment is still applied.
[jira] [Commented] (MAPREDUCE-4619) [MAPREDUCE-3902] Change AMContainerMap to extend AbstractService
[ https://issues.apache.org/jira/browse/MAPREDUCE-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447891#comment-13447891 ]

Siddharth Seth commented on MAPREDUCE-4619:

Looks good. +1, committing to branch MR-3902. Thanks Tsuyoshi

[MAPREDUCE-3902] Change AMContainerMap to extend AbstractService

Key: MAPREDUCE-4619
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4619
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: MR-3902
Reporter: Siddharth Seth
Assignee: Tsuyoshi OZAWA
Attachments: MAPREDUCE-4619.patch
[jira] [Resolved] (MAPREDUCE-4619) [MAPREDUCE-3902] Change AMContainerMap to extend AbstractService
[ https://issues.apache.org/jira/browse/MAPREDUCE-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth resolved MAPREDUCE-4619.

Resolution: Fixed
Fix Version/s: MR-3902
Hadoop Flags: Reviewed

[MAPREDUCE-3902] Change AMContainerMap to extend AbstractService

Key: MAPREDUCE-4619
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4619
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: MR-3902
Reporter: Siddharth Seth
Assignee: Tsuyoshi OZAWA
Fix For: MR-3902
Attachments: MAPREDUCE-4619.patch
[jira] [Assigned] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reassigned MAPREDUCE-1700:

Assignee: Arun C Murthy

User supplied dependencies may conflict with MapReduce system JARs

Key: MAPREDUCE-1700
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Reporter: Tom White
Assignee: Arun C Murthy
Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch

If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
[jira] [Assigned] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reassigned MAPREDUCE-1700:

Assignee: Tom White (was: Arun C Murthy)

User supplied dependencies may conflict with MapReduce system JARs

Key: MAPREDUCE-1700
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Reporter: Tom White
Assignee: Tom White
Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch

If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars
[ https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447908#comment-13447908 ]

Arun C Murthy commented on MAPREDUCE-4421:

Tucu - I think we are close, but I don't want MR AM or DistShell AM configs in yarn-site.xml. They belong in mapred-site.xml or distshell-site.xml etc. Makes sense?

Remove dependency on deployed MR jars

Key: MAPREDUCE-4421
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Arun C Murthy

Currently MR AM depends on MR jars being deployed on all nodes via implicit dependency on YARN_APPLICATION_CLASSPATH.
We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, probably, just rely on adding a shaded MR jar along with job.jar to the dist-cache.
[jira] [Commented] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13447911#comment-13447911 ] Arun C Murthy commented on MAPREDUCE-1700: -- Tom, I don't understand the specific advantages of OSGI or Felix, so please pardon some of my questions. However, with MR being an application in YARN (see MAPREDUCE-4421) we can just add user jars in front of the classpath for the tasks (we already allow it). This isn't the same "Map/Reduce child inherits the TT classpath" problem as in MR1 (actually, even in MR1 you have been able to put child jars ahead in the classpath for a long while now). Given this, do we need to bring in OSGI or Felix? What else do they provide? Thanks. User supplied dependencies may conflict with MapReduce system JARs -- Key: MAPREDUCE-1700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852081#action_12852081]. The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13447916#comment-13447916 ] Avner BenHanoch commented on MAPREDUCE-4049: [correcting typo:] Cool. I wish you good luck with your issues. I am watching them to stay in the picture and to help with any question you may have. I understand you no longer have any objection to my commit. Right? (Please let me know if you want the rename: ShuffleConsumerPlugin -> ReduceFeederPlugin, or any other name) plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as a set of two plugins: ShuffleProvider and ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. 
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13447943#comment-13447943 ] Siddharth Seth commented on MAPREDUCE-3902: --- Thanks for the help with this JIRA. bq. because MRAppMaster in container-reuse implementation has the feature to monitor whether the running tasks on the containers are the last task at a machine or not, for the purpose of exiting JVMs on containers, as you know. That will definitely be simpler to achieve with the container-reuse AM, with nodes already tracking container information. Last task on a node can be figured out relatively easily by the scheduler. It is, however, also possible with the current AM, and several bits like the decision on when to run the combiner - should be a straight forward port to the reuse-AM. IAC, it'll be good to get the re-use AM into trunk fast. Looking forward to the updates on 4502 and 4525. MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc. -- Key: MAPREDUCE-3902 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, mrv2 Reporter: Arun C Murthy Assignee: Siddharth Seth Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch The MR AM is now in a great position to reuse containers across (map) tasks. This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner: # Consider data-locality when re-using containers # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) : MAPREDUCE-4525 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13447981#comment-13447981 ] Plamen Jeliazkov commented on MAPREDUCE-4491: - Great work, Benoy! This looks like a very neat feature to add. I am all in support. I like your similarity with the compressor / decompressor interfaces and the ease of the implementation to plug-in any keystores. I am in the midst of applying your patches and doing a small test locally and will reply back with any results I find. Encryption and Key Protection - Key: MAPREDUCE-4491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491 Project: Hadoop Map/Reduce Issue Type: New Feature Components: documentation, security, task-controller, tasktracker Reporter: Benoy Antony Assignee: Benoy Antony Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf When dealing with sensitive data, it is required to keep the data encrypted wherever it is stored. Common use case is to pull encrypted data out of a datasource and store in HDFS for analysis. The keys are stored in an external keystore. The feature adds a customizable framework to integrate different types of keystores, support for Java KeyStore, read keys from keystores, and transport keys from JobClient to Tasks. The feature adds PGP encryption as a codec and additional utilities to perform encryption related steps. The design document is attached. It explains the requirement, design and use cases. Kindly review and comment. Collaboration is very much welcome. I have a tested patch for this for 1.1 and will upload it soon as an initial work for further refinement. Update: The patches are uploaded to subtasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13447996#comment-13447996 ] Steve Loughran commented on MAPREDUCE-1700: --- Arun, I see where Tom is coming from. Irrespective of how the Hadoop services are deployed, you need to be able to do things like submit jobs from OSGi containers (e.g. Spring and others), which is what this patch appears to offer. And if Oracle finally commit to OSGi now that Java 8 is being redefined, it'd be good for all clients. I would like to see a way to support this which doesn't put an OSGi JAR on the classpath of everything. Tom - is there a way to abstract away OSGi support so that it's optional, even if it's a subclass of JobSubmitter? An {{org.apache.hadoop.mapreduce.osgi.OSGiJobSubmitter}} could override some new specific protected methods to enable this. User supplied dependencies may conflict with MapReduce system JARs -- Key: MAPREDUCE-1700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852081#action_12852081]. The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448021#comment-13448021 ] Mariappan Asokan commented on MAPREDUCE-4049: - Hi Avner, You agree that {{ShuffleConsumerPlugin}} should be decoupled from merge. In that case, it should have nothing to do with {{RawKeyValueIterator}}. However, in its current form the {{run()}} method in {{ShuffleConsumerPlugin}} returns {{RawKeyValueIterator}}. From a design point of view, my objection is that the abstraction {{ShuffleConsumerPlugin}} is not capturing the concept it is intended for, namely moving just raw bytes from map hosts to reduce hosts and nothing more. I would ask you to go back to my original suggestion to make {{ShuffleRunner}} pluggable. We can add an {{initialize()}} method there. -- Asokan plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as a set of two plugins: ShuffleProvider and ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance. 
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4633) history server doesn't set permissions on all subdirs
Thomas Graves created MAPREDUCE-4633: Summary: history server doesn't set permissions on all subdirs Key: MAPREDUCE-4633 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4633 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha Reporter: Thomas Graves The job history server creates a bunch of subdirectories under the done directory. They look like 2012/09/03/00. It only sets the permissions on the last one, i.e. 00, to 770. The parent directories (2012, 2012/09 and 2012/09/03) aren't explicitly set, so if the umask is more restrictive they won't end up with the permissions the history server expects.
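To make the report above concrete, here is a minimal sketch of which directories would need an explicit permission call. It is not the history server's code: the class name DoneDirPermissions, the method dirsNeedingExplicitPerms and the root /history/done are hypothetical, and plain java.nio paths stand in for Hadoop paths. It only enumerates every level between the done directory and the leaf, so that each one can be set rather than just the last.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Hadoop).
final class DoneDirPermissions {

    /**
     * Lists every directory strictly below doneRoot down to and including leaf,
     * parents first. Each returned path would need its permissions set explicitly.
     */
    static List<Path> dirsNeedingExplicitPerms(Path doneRoot, Path leaf) {
        if (!leaf.startsWith(doneRoot)) {
            throw new IllegalArgumentException(leaf + " is not under " + doneRoot);
        }
        List<Path> dirs = new ArrayList<>();
        if (leaf.equals(doneRoot)) {
            return dirs; // nothing below the root to fix up
        }
        Path current = doneRoot;
        // Walk the relative part (e.g. 2012/09/03/00) one component at a time.
        for (Path component : doneRoot.relativize(leaf)) {
            current = current.resolve(component);
            dirs.add(current);
        }
        return dirs;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/history/done"); // assumed location
        for (Path dir : dirsNeedingExplicitPerms(root, root.resolve("2012/09/03/00"))) {
            System.out.println("set 770 on " + dir);
        }
    }
}
```

In a Hadoop context the loop body would presumably call FileSystem.setPermission on each path, since a permission passed at directory-creation time is still subject to the umask; that mapping is an assumption here, not something stated in the report.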
[jira] [Commented] (MAPREDUCE-4628) Use Builder to get RPC server in Map/Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448112#comment-13448112 ] Suresh Srinivas commented on MAPREDUCE-4628: +1 for the patch. Use Builder to get RPC server in Map/Reduce --- Key: MAPREDUCE-4628 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4628 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Priority: Minor Attachments: MAPREDUCE-4628.patch In HADOOP-8736, a Builder is introduced to replace all the getServer() variants. This JIRA is the change in Map/Reduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
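As a generic illustration of why a Builder can replace a family of overloaded getServer() variants, here is a minimal sketch. It does not reproduce the Builder introduced in HADOOP-8736: RpcServerSpec and all of its setters and defaults are hypothetical names chosen for the example.

```java
// Hypothetical value object describing a server; not Hadoop's RPC API.
final class RpcServerSpec {
    final String bindAddress;
    final int port;
    final int numHandlers;
    final boolean verbose;

    private RpcServerSpec(Builder b) {
        this.bindAddress = b.bindAddress;
        this.port = b.port;
        this.numHandlers = b.numHandlers;
        this.verbose = b.verbose;
    }

    /** One builder with defaults replaces one overload per parameter combination. */
    static final class Builder {
        private String bindAddress = "0.0.0.0"; // assumed default
        private int port = 0;                   // 0 = let the OS choose a port
        private int numHandlers = 1;
        private boolean verbose = false;

        Builder setBindAddress(String bindAddress) { this.bindAddress = bindAddress; return this; }
        Builder setPort(int port) { this.port = port; return this; }
        Builder setNumHandlers(int numHandlers) { this.numHandlers = numHandlers; return this; }
        Builder setVerbose(boolean verbose) { this.verbose = verbose; return this; }

        RpcServerSpec build() {
            if (port < 0 || port > 65535) {
                throw new IllegalArgumentException("port out of range: " + port);
            }
            if (numHandlers < 1) {
                throw new IllegalArgumentException("numHandlers must be >= 1");
            }
            return new RpcServerSpec(this);
        }
    }

    public static void main(String[] args) {
        RpcServerSpec spec = new RpcServerSpec.Builder().setPort(8020).setNumHandlers(5).build();
        System.out.println(spec.bindAddress + ":" + spec.port + " handlers=" + spec.numHandlers);
    }
}
```

Callers name only the options they care about, so adding a new option later does not require yet another overload or a change to existing call sites.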
[jira] [Assigned] (MAPREDUCE-4414) Add main methods to JobConf and YarnConfiguration, for debug purposes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Plamen Jeliazkov reassigned MAPREDUCE-4414: --- Assignee: Plamen Jeliazkov (was: Linden Hillenbrand) Add main methods to JobConf and YarnConfiguration, for debug purposes - Key: MAPREDUCE-4414 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4414 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Plamen Jeliazkov Labels: newbie Just like Configuration has a main() func that dumps XML out for debug purposes, we should have a similar function under the JobConf and YarnConfiguration classes that do the same. This is useful in testing out app classpath setups at times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4414) Add main methods to JobConf and YarnConfiguration, for debug purposes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Plamen Jeliazkov updated MAPREDUCE-4414: Attachment: MAPREDUCE-4144.patch Add main methods to JobConf and YarnConfiguration, for debug purposes - Key: MAPREDUCE-4414 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4414 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Plamen Jeliazkov Labels: newbie Attachments: MAPREDUCE-4144.patch Just like Configuration has a main() func that dumps XML out for debug purposes, we should have a similar function under the JobConf and YarnConfiguration classes that do the same. This is useful in testing out app classpath setups at times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4414) Add main methods to JobConf and YarnConfiguration, for debug purposes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Plamen Jeliazkov updated MAPREDUCE-4414: Status: Patch Available (was: Open) Add main methods to JobConf and YarnConfiguration, for debug purposes - Key: MAPREDUCE-4414 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4414 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Plamen Jeliazkov Labels: newbie Attachments: MAPREDUCE-4144.patch Just like Configuration has a main() func that dumps XML out for debug purposes, we should have a similar function under the JobConf and YarnConfiguration classes that do the same. This is useful in testing out app classpath setups at times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4634) Change TestUmbilicalProtocolWithJobToken to use RPC builder
Vinod Kumar Vavilapalli created MAPREDUCE-4634: -- Summary: Change TestUmbilicalProtocolWithJobToken to use RPC builder Key: MAPREDUCE-4634 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4634 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Brandon Li Priority: Minor In HADOOP-8736, a Builder is introduced to replace all the getServer() variants. This JIRA is the change in MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448178#comment-13448178 ] Scott Carey commented on MAPREDUCE-1700: Putting user jars before/after the application dependencies doesn't actually solve the problem. * The conflict might require a user jar that is not compatible with one needed by the framework, either order breaks something * The user might override a system jar and alter functionality in a way that breaks the framework, or subverts security. Both the host container and the user code need to be able to be certain of what code they are executing without stepping on each other's toes. This is _not possible_ with one classpath. User supplied dependencies may conflict with MapReduce system JARs -- Key: MAPREDUCE-1700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852081#action_12852081]. The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
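The "specialized classloader" mentioned in the issue description can be sketched roughly as follows. This is a minimal child-first loader under stated assumptions, not the attached patch and not an OSGi solution: the class name ChildFirstClassLoader and the idea of a "system prefix" list are hypothetical. User classes are looked up in the user's JARs first, while classes under the listed prefixes always come from the parent (framework) loader, so user code cannot override them.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of a child-first classloader; names are illustrative only.
final class ChildFirstClassLoader extends URLClassLoader {
    private final String[] systemPrefixes;

    ChildFirstClassLoader(URL[] userJars, ClassLoader parent, String... systemPrefixes) {
        super(userJars, parent);
        this.systemPrefixes = systemPrefixes.clone();
    }

    /** Classes under these prefixes are always delegated to the parent loader. */
    boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    c = findClass(name); // look in the user JARs first
                } catch (ClassNotFoundException e) {
                    // not in the user JARs; fall back to normal delegation below
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // parent-first lookup
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A task could, for example, construct it with prefixes such as "java." and "org.apache.hadoop." so that a user-supplied Avro version is visible only to user code. Whether that split is sufficient for the framework's own dependencies is exactly the open question in this thread.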
[jira] [Commented] (MAPREDUCE-4414) Add main methods to JobConf and YarnConfiguration, for debug purposes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448190#comment-13448190 ] Hadoop QA commented on MAPREDUCE-4414: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543758/MAPREDUCE-4144.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2812//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2812//console This message is automatically generated. 
Add main methods to JobConf and YarnConfiguration, for debug purposes - Key: MAPREDUCE-4414 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4414 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Plamen Jeliazkov Labels: newbie Attachments: MAPREDUCE-4144.patch Just like Configuration has a main() func that dumps XML out for debug purposes, we should have a similar function under the JobConf and YarnConfiguration classes that do the same. This is useful in testing out app classpath setups at times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448205#comment-13448205 ] Scott Carey commented on MAPREDUCE-1700: If we are lucky, Project Jigsaw will be pulled back into Java 8. According to http://mreinhold.org/blog/late-for-the-train-qa, it has not yet been decided. If it is brought back in, then perhaps we can wait until Java has a module system, 1 to 1.5 years from now. If not, I do not think Hadoop can wait until Java 9, sometime around 2015 to 2016. User supplied dependencies may conflict with MapReduce system JARs -- Key: MAPREDUCE-1700 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-1700.patch, MAPREDUCE-1700.patch If user code has a dependency on a version of a JAR that is different to the one that happens to be used by Hadoop, then it may not work correctly. This happened with user code using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852081#action_12852081]. The problem is analogous to the one that application servers have with WAR loading. Using a specialized classloader in the Child JVM is probably the way to solve this.
[jira] [Updated] (MAPREDUCE-4634) Change TestUmbilicalProtocolWithJobToken to use RPC builder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated MAPREDUCE-4634: -- Attachment: MAPREDUCE-4634.patch Change TestUmbilicalProtocolWithJobToken to use RPC builder --- Key: MAPREDUCE-4634 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4634 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Brandon Li Priority: Minor Attachments: MAPREDUCE-4634.patch In HADOOP-8736, a Builder is introduced to replace all the getServer() variants. This JIRA is the change in MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4634) Change TestUmbilicalProtocolWithJobToken to use RPC builder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448223#comment-13448223 ] Brandon Li commented on MAPREDUCE-4634: --- This patch is part of the patch in YARN-84. I still uploaded it here just to keep a record of the MapReduce change in the JIRA system. Change TestUmbilicalProtocolWithJobToken to use RPC builder --- Key: MAPREDUCE-4634 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4634 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Brandon Li Priority: Minor Attachments: MAPREDUCE-4634.patch In HADOOP-8736, a Builder is introduced to replace all the getServer() variants. This JIRA is the change in MapReduce.
[jira] [Resolved] (MAPREDUCE-4634) Change TestUmbilicalProtocolWithJobToken to use RPC builder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved MAPREDUCE-4634. Resolution: Duplicate Thanks Brandon. Closing this as duplicate. Change TestUmbilicalProtocolWithJobToken to use RPC builder --- Key: MAPREDUCE-4634 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4634 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Brandon Li Priority: Minor Attachments: MAPREDUCE-4634.patch In HADOOP-8736, a Builder is introduced to replace all the getServer() variants. This JIRA is the change in MapReduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4635) MR side of YARN-83. Changing package of YarnClient
Bikas Saha created MAPREDUCE-4635: - Summary: MR side of YARN-83. Changing package of YarnClient Key: MAPREDUCE-4635 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4635 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4635) MR side of YARN-83. Changing package of YarnClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-4635: -- Attachment: YARN-83.3.MR.patch Attaching MR patch. MR side of YARN-83. Changing package of YarnClient -- Key: MAPREDUCE-4635 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4635 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Attachments: YARN-83.3.MR.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira