[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-17 Thread Avner BenHanoch (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556000#comment-13556000
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:


Hi Alejandro - thanks for your thorough and fast review!

regarding 
{quote}
ReduceCopier class should be made public static in order to be able to be 
created via ReflectionUtils.newInstance()
{quote}
... cool!  Actually, I went in this direction in my very first patch, and I am 
happy to return to it.  (Note that this will introduce changes in all the places 
where ReduceCopier currently uses members of the encapsulating ReduceTask 
object directly - but I believe this is the correct thing to do.)

regarding:
{quote}
I've just noticed that your ShuffleConsumerPlugin API does not respect the API 
of the ReduceCopier; the createKVIterator() method has a different signature. 
The parameters being passed to it in your patch are already available in the 
Context, except for the FileSystem, but you could create the FileSystem (and 
obtain the raw one) within your plugin impl using the conf received in the 
context.
{quote}
I think this comment is wrong.  Please clarify!

Regarding 
{quote}
I'm not thrilled about the TT loading the default shuffle provider (which does 
not implement the new shuffle provider interface) and, in addition, one extra 
custom shuffle provider.
Instead, I'd say the current shuffle provider logic should be refactored into a 
shuffle provider implementation, and this one loaded by default. And if, as you 
indicated before, you want to load different impls simultaneously, then a 
shuffle plugin multiplexor implementation could be used.
This increases the scope of the changes, which is why I'd like to do this in a 
separate JIRA and keep this JIRA for the consumer (reducer) side.
{quote}
Actually, I wrote above: _my intuition is that supporting 1 external shuffle 
service (in addition to the built-in shuffle service) is the 'keep it simple' 
solution. I feel that the use case of N providers is theoretical. Hence, I 
prefer to keep the conf and code simple_.  This clarifies why I wrote my patch 
this way instead of introducing a big feature with a shuffle plugin 
multiplexor... in hadoop-1.
*Again, this JIRA issue - since its creation - has focused on _Support generic 
shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer_.  It 
has no value for me if it deals with the consumer only.*

*I am fine with all the rest of your comments.  Please let me know if I can 
continue according to this!*
Avner


 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 3.0.0

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch


 Support generic shuffle service as set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example: we are working on a 
 shuffle plugin that performs the shuffle over RDMA in fast networks (10GbE, 
 40GbE, or InfiniBand) instead of using the current HTTP shuffle. Based on the 
 fast RDMA shuffle, the plugin can also utilize a suitable merge approach 
 during the intermediate merges, hence getting much better performance.
 # Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding a hidden 
 dependency of the NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with a suggested Top-Level Design for both 
 plugins (currently based on the 1.0 branch).
 # I am providing a link for downloading UDA - Mellanox's open source plugin 
 that implements a generic shuffle service using RDMA and levitated merge.  
 Note: at this phase, the code is in C++ through JNI and you should consider 
 it beta only.  Still, it can serve anyone who wants to implement or 
 contribute to levitated merge. (Please be advised that levitated merge is 
 mostly suited to very fast networks.) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-4077) Issues while using Hadoop Streaming job

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K resolved MAPREDUCE-4077.
--

Resolution: Not A Problem

 Issues while using Hadoop Streaming job
 ---

 Key: MAPREDUCE-4077
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4077
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Devaraj K
Assignee: Devaraj K

 When we use the -file option, it says it is deprecated and to use -files.
 {code:xml}
 linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./hadoop 
 jar 
 ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop 
 -output /test/output/3 -mapper cat -reducer wc -file hadoop
 02/02/19 10:55:51 WARN streaming.StreamJob: -file option is deprecated, 
 please use generic option -files instead.
 {code}
 But when we use the -files option, it says unrecognized option.
 {code:xml}
 linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./hadoop 
 jar 
 ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop 
 -output 
 /test/output/3 -mapper cat -reducer wc -files hadoop
 02/02/19 10:56:42 ERROR streaming.StreamJob: Unrecognized option: -files
 Usage: $HADOOP_PREFIX/bin/hadoop jar hadoop-streaming.jar [options]
 {code}
 When we use the -archives option, it says unrecognized option.
 {code:xml}
 linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./hadoop 
 jar 
 ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop 
 -output 
 /test/output/3 -mapper cat -reducer wc -archives testarchive.rar
 02/02/19 11:05:43 ERROR streaming.StreamJob: Unrecognized option: -archives
 Usage: $HADOOP_PREFIX/bin/hadoop jar hadoop-streaming.jar [options]
 {code}
 But the usage text still displays the -archives option.
 {code:xml}
 linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./hadoop 
 jar 
 ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop 
 -output 
 /test/output/3 -mapper cat -reducer wc -archives testarchive.rar
 02/02/19 11:05:43 ERROR streaming.StreamJob: Unrecognized option: -archives
 Usage: $HADOOP_PREFIX/bin/hadoop jar hadoop-streaming.jar [options]
 ..
 ..
  -libjars <comma separated list of jars>    specify comma separated jar files 
  to include in the classpath.
  -archives <comma separated list of archives>    specify comma separated 
  archives to be unarchived on the compute machines.
 {code}
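 For context, -files and -archives are generic Hadoop options (handled by the 
 GenericOptionsParser), and generic options must appear before the 
 streaming-specific options - which is presumably why this was resolved as Not 
 A Problem. A sketch of the expected ordering (paths and jar version are 
 illustrative):
 {code:xml}
 # Generic options (-files, -libjars, -archives, -D) must come first,
 # before -input/-output/-mapper/-reducer.
 ./hadoop jar ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar \
     -files hadoop \
     -input /hadoop -output /test/output/3 \
     -mapper cat -reducer wc
 {code}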



[jira] [Updated] (MAPREDUCE-2309) While querying the Job Statics from the command-line, if we give wrong status name then there is no warning or response.

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-2309:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

 While querying the Job Statics from the command-line, if we give wrong status 
 name then there is no warning or response.
 

 Key: MAPREDUCE-2309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Minor
 Fix For: 0.22.1

 Attachments: MAPREDUCE-2309-0.20.patch, MAPREDUCE-2309-trunk.patch


 If we try to get job information by giving a wrong status name from the 
 command-line interface, it gives no warning or response.
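 The fix the attached patches presumably aim at is straightforward input 
 validation; a hedged sketch with illustrative status values (not the exact 
 Hadoop JobStatus constants):

```java
import java.util.Arrays;

// Hedged sketch: validate a user-supplied job-status argument and warn on
// unknown values instead of silently returning nothing.
public class StatusArg {
    enum Status { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Returns null and prints a warning when the name is not a known status.
    public static Status parse(String name) {
        for (Status s : Status.values()) {
            if (s.name().equalsIgnoreCase(name)) {
                return s;
            }
        }
        System.err.println("Unknown job status '" + name
            + "'. Valid values: " + Arrays.toString(Status.values()));
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parse("running"));  // RUNNING
        System.out.println(parse("runnning")); // null, with a warning on stderr
    }
}
```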



[jira] [Updated] (MAPREDUCE-2548) Log improvements in DBOutputFormat.java and CounterGroup.java

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-2548:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

 Log improvements in DBOutputFormat.java and CounterGroup.java
 -

 Key: MAPREDUCE-2548
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2548
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-2548-1.patch, MAPREDUCE-2548.patch


 1. Instead of printing the stack trace on the console, it can be logged. 
 {code:title=DBOutputFormat.java|borderStyle=solid}
 public void write(K key, V value) throws IOException {
   try {
     key.write(statement);
     statement.addBatch();
   } catch (SQLException e) {
     e.printStackTrace();
   }
 }
 {code}
 2. Missing resource information can be logged. 
 {code:title=CounterGroup.java|borderStyle=solid}
 protected CounterGroup(String name) {
   this.name = name;
   try {
     bundle = getResourceBundle(name);
   } catch (MissingResourceException neverMind) {
   }
   displayName = localize(CounterGroupName, name);
 }

 private String localize(String key, String defaultValue) {
   String result = defaultValue;
   if (bundle != null) {
     try {
       result = bundle.getString(key);
     } catch (MissingResourceException mre) {
     }
   }
   return result;
 }
 {code}
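 As a hedged sketch of the proposed improvement (not the actual DBOutputFormat 
 code), routing the exception through a logger preserves the stack trace while 
 adding context and honoring log configuration:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hedged sketch: log the exception with a message instead of calling
// e.printStackTrace(), so it lands in the configured log with context.
// Class and method names are illustrative stand-ins.
public class LoggedWrite {
    private static final Logger LOG = Logger.getLogger(LoggedWrite.class.getName());

    public static boolean tryWrite(Runnable write) {
        try {
            write.run();
            return true;
        } catch (RuntimeException e) {
            // Logged with the full stack trace, rather than printed to the console.
            LOG.log(Level.WARNING, "Failed to write record", e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryWrite(() -> {}));                                      // true
        System.out.println(tryWrite(() -> { throw new RuntimeException("boom"); })); // false
    }
}
```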
 



[jira] [Updated] (MAPREDUCE-2562) NullPointerException in Jobtracker when it is started without Name Node

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-2562:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

 NullPointerException in Jobtracker when it is started without Name Node
 ---

 Key: MAPREDUCE-2562
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2562
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 0.22.1

 Attachments: MAPREDUCE-2562.patch


 The JobTracker throws a NullPointerException in its logs when it is started 
 without the NameNode.
 {code}
 2011-06-03 01:50:04,304 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.18.52.225:9000. Already tried 7 time(s).
 2011-06-03 01:50:05,307 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.18.52.225:9000. Already tried 8 time(s).
 2011-06-03 01:50:06,310 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.18.52.225:9000. Already tried 9 time(s).
 2011-06-03 01:50:21,243 FATAL org.apache.hadoop.mapred.JobTracker: 
 java.lang.NullPointerException
   at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1635)
   at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:287)
   at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:279)
   at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:274)
   at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4312)
 {code} 



[jira] [Assigned] (MAPREDUCE-3207) TestMRCLI failing on trunk

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned MAPREDUCE-3207:


Assignee: (was: Devaraj K)

 TestMRCLI failing on trunk  
 

 Key: MAPREDUCE-3207
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3207
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Priority: Minor
 Fix For: 0.24.0

 Attachments: TEST-org.apache.hadoop.cli.TestMRCLI.txt


 Failing tests:
   7: Archive: Deleting a file in archive
   8: Archive: Renaming a file in archive



[jira] [Assigned] (MAPREDUCE-3222) ant test TestTaskContext failing on trunk

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned MAPREDUCE-3222:


Assignee: (was: Devaraj K)

 ant test TestTaskContext failing on trunk
 -

 Key: MAPREDUCE-3222
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3222
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Hitesh Shah
Priority: Minor
 Fix For: 0.24.0


 Testcase: testContextStatus took 29.977 sec
 FAILED
 null expected:<map[ > sort]> but was:<map[]>
 junit.framework.ComparisonFailure: null expected:<map[ > sort]> but 
 was:<map[]>
 at 
 org.apache.hadoop.mapreduce.TestTaskContext.testContextStatus(TestTaskContext.java:120)
 Testcase: testMapContextProgress took 17.371 sec
 Testcase: testReduceContextProgress took 16.267 sec



[jira] [Updated] (MAPREDUCE-4743) Job is marking as FAILED and also throwing the Transition exception instead of KILLED when issues a KILL command

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4743:
-

Summary: Job is marking as FAILED and also throwing the Transition 
exception instead of KILLED when issues a KILL command  (was: Job is marking as 
FAILED and also throwing thhe Transition exception instead of KILLED when 
issues a KILL command)

 Job is marking as FAILED and also throwing the Transition exception instead 
 of KILLED when issues a KILL command
 

 Key: MAPREDUCE-4743
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4743
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha
Reporter: Devaraj K
Assignee: Devaraj K

 {code:xml}
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 T_KILL at SUCCEEDED
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:605)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:89)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:903)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:897)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
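The exception arises because the task state machine has no transition registered for T_KILL in the SUCCEEDED state. A minimal, illustrative model of that failure mode (states, events, and names are stand-ins for YARN's StateMachineFactory):

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hedged sketch: a transition table with no entry for T_KILL in SUCCEEDED
// mirrors the InvalidStateTransitonException above. The fix direction is to
// either register the transition (as an ignore/no-op) or check legality
// before dispatching. All names here are illustrative.
public class TaskStates {
    enum State { RUNNING, SUCCEEDED, KILLED, FAILED }
    enum Event { T_KILL, T_SUCCEEDED }

    private static final Map<State, EnumSet<Event>> LEGAL = new EnumMap<>(State.class);
    static {
        LEGAL.put(State.RUNNING, EnumSet.of(Event.T_KILL, Event.T_SUCCEEDED));
        // SUCCEEDED deliberately has no T_KILL entry: killing an already
        // finished task should be ignored, not dispatched.
        LEGAL.put(State.SUCCEEDED, EnumSet.noneOf(Event.class));
    }

    public static boolean canHandle(State s, Event e) {
        return LEGAL.getOrDefault(s, EnumSet.noneOf(Event.class)).contains(e);
    }

    public static void main(String[] args) {
        System.out.println(canHandle(State.RUNNING, Event.T_KILL));   // true
        System.out.println(canHandle(State.SUCCEEDED, Event.T_KILL)); // false -> would throw
    }
}
```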



[jira] [Assigned] (MAPREDUCE-3841) Broken Server metrics and Local logs link under the tools menu

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned MAPREDUCE-3841:


Assignee: (was: Devaraj K)

 Broken Server metrics and Local logs link under the tools menu
 --

 Key: MAPREDUCE-3841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3841
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ramya Sunil

 The Local logs link redirects to the cluster page, and Server metrics opens an 
 empty page on the RM/JHS homepage. The same applies to the links from the 
 NodeManager UI.



[jira] [Updated] (MAPREDUCE-3556) Resource Leaks in key flows

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3556:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

 Resource Leaks in key flows
 ---

 Key: MAPREDUCE-3556
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3556
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-3556.patch


 The key flows have potential resource leaks. 
 {code:title=MapTask.java|borderStyle=solid}
 if (combinerRunner == null || numSpills < minSpillsForCombine) {
   Merger.writeFile(kvIter, writer, reporter, job);
 } else {
   combineCollector.setWriter(writer);
   combinerRunner.combine(kvIter, combineCollector);
 }
 //close
 writer.close();
 {code}
 {code:title=InputSampler.java|borderStyle=solid}
 for (int i = 1; i < numPartitions; ++i) {
   int k = Math.round(stepSize * i);
   while (last >= k && comparator.compare(samples[last], samples[k]) == 0) {
     ++k;
   }
   writer.append(samples[k], nullValue);
   last = k;
 }
 writer.close();
 {code}
 {code:title=JobSplitWriter.java|borderStyle=solid}
 SplitMetaInfo[] info = writeNewSplits(conf, splits, out);
 out.close();
 SplitMetaInfo[] info = writeOldSplits(splits, out);
 out.close();
 {code}
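 In each excerpt, the trailing close() is skipped if an earlier call throws. 
 A hedged sketch of the fix pattern, using stand-in types rather than Hadoop's:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hedged sketch of the leak fix: if an exception is thrown between opening
// the writer and the trailing writer.close(), the stream leaks.
// try-with-resources (or try/finally on older JDKs) closes it on every path.
// The writer and records here are stand-ins, not Hadoop types.
public class SafeClose {
    public static String writeAll(String[] records) throws IOException {
        StringWriter sink = new StringWriter();
        try (Writer writer = sink) {          // closed even if append() throws
            for (String r : records) {
                writer.append(r).append('\n');
            }
        }
        return sink.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAll(new String[] {"a", "b"}));
    }
}
```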



[jira] [Resolved] (MAPREDUCE-3232) AM should handle reboot from Resource Manager

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K resolved MAPREDUCE-3232.
--

Resolution: Not A Problem

 AM should handle reboot from Resource Manager
 --

 Key: MAPREDUCE-3232
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3232
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.24.0
Reporter: Devaraj K
Assignee: Devaraj K

 When the RM doesn't have a last response id for the app attempt (or the 
 request's response id is less than the last response id), the RM sends a 
 reboot response, but the AM doesn't handle it.



[jira] [Updated] (MAPREDUCE-4286) TestClientProtocolProviderImpls passes on failure conditions also

2013-01-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4286:
-

Affects Version/s: (was: 2.0.1-alpha)
   (was: 2.0.0-alpha)
   2.0.3-alpha
   0.23.5

 TestClientProtocolProviderImpls passes on failure conditions also
 -

 Key: MAPREDUCE-4286
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4286
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.2-alpha, 2.0.3-alpha, 0.23.5
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4286.patch, MAPREDUCE-4286.patch






[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556213#comment-13556213
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

Regarding "...ShuffleConsumerPlugin API does not respect the API of the 
ReduceCopier..." and "I think this comment is wrong. Please clarify!": you are 
right, please disregard that comment. After integrating my comments into the 
consumer side, I think it (the consumer) is ready to go in.

Regarding the producer changes, I think the default producer implementation 
should implement the producer plugin interface as well. Once we have that, the 
multiplexor plugin would be trivial; I'd be happy to help with that. We can do 
the producer plugin as a subtask of this JIRA.
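A minimal illustration of the multiplexor idea (the interface and method names are hypothetical, not Hadoop's actual provider API):

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the "multiplexor" idea: once the default provider and any
// custom providers share one plugin interface, a multiplexing provider simply
// fans lifecycle calls out to each registered delegate.
public class ShuffleMux {
    public interface ShuffleProvider {
        void initialize();
        void close();
    }

    public static class MultiplexingProvider implements ShuffleProvider {
        private final List<ShuffleProvider> delegates = new ArrayList<>();

        public void add(ShuffleProvider p) { delegates.add(p); }

        @Override public void initialize() { delegates.forEach(ShuffleProvider::initialize); }
        @Override public void close()      { delegates.forEach(ShuffleProvider::close); }

        public int size() { return delegates.size(); }
    }

    public static void main(String[] args) {
        MultiplexingProvider mux = new MultiplexingProvider();
        mux.add(new ShuffleProvider() {
            public void initialize() { System.out.println("default init"); }
            public void close()      { System.out.println("default close"); }
        });
        mux.initialize();
        mux.close();
        System.out.println(mux.size()); // 1
    }
}
```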





[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556247#comment-13556247
 ] 

Arun C Murthy commented on MAPREDUCE-4808:
--

bq. The goal is to be able to write alternate implementations of the Shuffle

Alejandro - it seems like you understand something about the use-case that I 
don't. Maybe you & Asokan have had a private chat? 

What are the use-cases for alternate implementations of the Shuffle? As Chris 
also mentioned, with MAPREDUCE-4049 we already allow alternate implementations 
of the Shuffle - is this redundant, then?

bq. While some of this logic replacement could be done at the Merge level as 
you suggested, other parts, like MapOutput allocation, cannot be done there as 
this is driven by the MergeManager. 

So, a combination of a MapOutput re-factor and a Merger interface should 
suffice?

IAC, what are the use-cases for alternate implementations of MapOutput? Or is 
the MapOutput re-factor merely a code-hygiene issue?



I'm not trying to be difficult here. But I feel like I just don't understand 
the use-case. So, I'd appreciate it if we could focus on concrete use-cases for 
the plugin. I admit I am still having a hard time understanding why we need 
this complexity.

Thanks.

 Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
 implementations
 --

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.2-alpha
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 2.0.3-alpha

 Attachments: COMBO-mapreduce-4809-4812-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, MergeManagerPlugin.pdf


 Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
 alternate implementations to be able to reuse portions of the default 
 implementation. 
 This would come with the strong caveat that these classes are LimitedPrivate 
 and Unstable.



[jira] [Updated] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4808:
-

Fix Version/s: (was: 2.0.3-alpha)



[jira] [Updated] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4808:
-

Affects Version/s: (was: 2.0.2-alpha)



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-17 Thread Avner BenHanoch (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556262#comment-13556262
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:


Thanks.  So, we have agreed upon the consumer details.

Now, for the producer details:
 - Again, throughout the lifetime of this JIRA issue, the consumer & producer 
come together, since they are the 2 sides of the shuffle service.  *This JIRA 
issue has no value if it has one without the other.* Hence, they should be kept 
together!  
Additionally, I want it to be one patch that can be ported to any hadoop-1.x.y 
version at once.
 - The default producer implementation already implements the producer plugin 
interface! (though it is still loaded via the HttpServlet interface)  As I 
said, I went with a keep-it-simple solution, in which I only support 1 extra 
provider (with simple code and simple conf). Please clarify whether this is 
enough, or whether you are asking me to support N providers.  I don't want to 
write a new feature and then have someone say that it is a problem to introduce 
a new feature in hadoop-1.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-17 Thread Avner BenHanoch (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556344#comment-13556344
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:


Additionally, as I wrote before, there is currently no request and no use case 
for N providers.  Hence, do we really want that?



[jira] [Created] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-17 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-4946:
-

 Summary: Type conversion of map completion events leads to 
performance problems with large jobs
 Key: MAPREDUCE-4946
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 0.23.5, 2.0.2-alpha
Reporter: Jason Lowe
Priority: Critical


We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
reducers fail to connect back to the AM after being launched due to connection 
timeout.  Looking at stack traces of the AM during this time we see a lot of 
IPC servers stuck waiting for a lock to get the application ID while type 
converting the map completion events.  What's odd is that normally getting the 
application ID should be very cheap, but in this case we're type-converting 
thousands of map completion events for *each* reducer connecting.  That means 
we end up type-converting the map completion events over 45 million times 
during the lifetime of the example job (13,000 * 3,500).

We either need to make the type conversion much cheaper (i.e.: lockless or at 
least read-write locked) or, even better, store the completion events in a form 
that does not require type conversion when serving them up to reducers.
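A minimal sketch of the second suggested fix: convert each completion event once and cache the converted form, so serving reducers becomes a lock-free read instead of a per-request conversion under a lock. The class and field names here are illustrative stand-ins, not the actual MR AM types.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch (hypothetical names): cache the type-converted event so
// the expensive conversion runs at most once, not once per reducer request.
public class CachedConversion {
    static class YarnEvent { final int mapId; YarnEvent(int id) { mapId = id; } }
    static class MrEvent   { final int mapId; MrEvent(int id)   { mapId = id; } }

    private final YarnEvent source;
    private final AtomicReference<MrEvent> converted = new AtomicReference<>();
    static int conversions = 0; // counts how often the expensive path runs

    CachedConversion(YarnEvent source) { this.source = source; }

    MrEvent get() {
        MrEvent e = converted.get();
        if (e == null) {
            // Expensive conversion; CAS keeps this thread-safe without a lock.
            e = new MrEvent(source.mapId);
            conversions++;
            converted.compareAndSet(null, e);
            e = converted.get();
        }
        return e;
    }

    public static void main(String[] args) {
        CachedConversion c = new CachedConversion(new YarnEvent(7));
        for (int i = 0; i < 3500; i++) c.get(); // 3,500 reducers fetching
        System.out.println(c.get().mapId + " " + conversions);
    }
}
```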



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556391#comment-13556391
 ] 

Jason Lowe commented on MAPREDUCE-4946:
---

This performance problem prevents the AM from reliably supporting very large 
jobs (i.e.: tens of thousands of maps and thousands of reducers) because it can 
take too long to serve up requests, and other clients end up being ignored and 
time out.  If enough attempts of the same task time out, then the whole job fails.



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556394#comment-13556394
 ] 

Jason Lowe commented on MAPREDUCE-4946:
---

Sample stacktrace from one of the many IPC server threads waiting for a lock 
during type-conversion of the map completion events:

{noformat}
"IPC Server handler 9 on 45874" daemon prio=10 tid=0x08f76800 nid=0x1c27 waiting for monitor entry [0x10583000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.getAppId(JobIdPBImpl.java:78)
	- waiting to lock <0x21e729b8> (a org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl)
	at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:65)
	at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:119)
	at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:211)
	at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:185)
	at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:178)
	at org.apache.hadoop.mapred.TaskAttemptListenerImpl.getMapCompletionEvents(TaskAttemptListenerImpl.java:284)
	at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1530)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1526)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1221)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1524)
{noformat}



[jira] [Updated] (MAPREDUCE-4907) TrackerDistributedCacheManager issues too many getFileStatus calls

2013-01-17 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4907:
-

Fix Version/s: 0.23.7

merged this to branch-0.23

 TrackerDistributedCacheManager issues too many getFileStatus calls
 --

 Key: MAPREDUCE-4907
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4907
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1, tasktracker
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.2.0, 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4907.patch, MAPREDUCE-4907-trunk-1.patch, 
 MAPREDUCE-4907-trunk-1.patch, MAPREDUCE-4907-trunk-1.patch, 
 MAPREDUCE-4907-trunk.patch


 TrackerDistributedCacheManager issues a number of redundant getFileStatus 
 calls when determining the timestamps and visibilities of files in the 
 distributed cache.  300 distributed cache files deep in the directory 
 structure can hammer HDFS with a couple thousand requests.
 A couple optimizations can reduce this load:
 1. determineTimestamps and determineCacheVisibilities both call getFileStatus 
 on every file.  We could cache the results of the former and use them for the 
 latter.
 2. determineCacheVisibilities needs to check that all ancestor directories of 
 each file have execute permissions for everyone.  This currently entails a 
 getFileStatus on each ancestor directory for each file.  The results of these 
 getFileStatus calls could be cached as well.
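A hedged sketch of the caching idea behind both optimizations: memoize getFileStatus results per path so the timestamp pass, the visibility pass, and repeated ancestor-directory checks all reuse a single lookup. FakeFs below stands in for the real FileSystem; all names are illustrative, not the actual TrackerDistributedCacheManager code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative memoizing wrapper: each path hits the (fake) filesystem once,
// no matter how many passes re-check it.
public class StatusCache {
    static class FakeFs {
        int calls = 0;  // counts how many real lookups were issued
        long getFileStatus(String path) { calls++; return path.hashCode(); }
    }

    private final FakeFs fs;
    private final Map<String, Long> cache = new HashMap<>();
    StatusCache(FakeFs fs) { this.fs = fs; }

    long status(String path) {
        // Only calls the filesystem on a cache miss.
        return cache.computeIfAbsent(path, fs::getFileStatus);
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        StatusCache c = new StatusCache(fs);
        // Two passes (like determineTimestamps then determineCacheVisibilities)
        // over the same file and its ancestor directories.
        for (int pass = 0; pass < 2; pass++) {
            c.status("/cache/a.jar");
            c.status("/cache");  // ancestor dir checked per file
            c.status("/");
        }
        System.out.println(fs.calls);  // 3 lookups instead of 6
    }
}
```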



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556405#comment-13556405
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Arun,
  MAPREDUCE-4049 expects the plugin implementer to implement the shuffle from 
scratch.  With the default implementation of HTTP shuffle being robust and 
secure, it is possible to reuse it in the majority of situations.

The alternate implementation of MapOutput can be left to the plugin 
implementer.  For example, it can be optimized to use less JVM memory and 
minimize Java garbage collection.

Some of the concrete use cases for the plugin are: hash aggregation, hash join, 
limit-N query, etc.

Thanks.

-- Asokan


 Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
 implementations
 --

 Key: MAPREDUCE-4808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Attachments: COMBO-mapreduce-4809-4812-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
 mapreduce-4808.patch, MergeManagerPlugin.pdf


 Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
 alternate implementations to be able to reuse portions of the default 
 implementation. 
 This would come with the strong caveat that these classes are LimitedPrivate 
 and Unstable.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556524#comment-13556524
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

Regarding 'new features in Hadoop-1'. Small or big, this is a new feature and 
it should be treated as such. I'm all for having this in Hadoop 1. If you want 
I can start the discussion in common-dev@.

Regarding "Again, throughout the lifetime of this JIRA issue, ...", I see 
different ways this can be done:

* Keep everything in the same JIRA (as it is now) and wait till the whole patch 
is ready
* Break the JIRA in 2 subtasks, consumer and producer side
** Do it in branch-1 directly
** Do it in a dev branch (seems an overkill)

I'm OK with any approach, your call.

Regarding "the default implementation already implements the producer API":

Ah, I missed that because the initialize method is not used.

With some minor tweaks to your patch I think we could get things done in a 
simple way:

* Add to the TT a 'public Server getHttpServer()' method
* In the TT constructor, where the MapOutputServlet is added to the HttpServer 
'server', remove that line and instead discover, instantiate and initialize the 
provider plugin.
* Don't make MapOutputServlet extend the provider interface.
* The default provider should be a class that simply adds the MapOutputServlet 
to the server via the TT.getHttpServer() method.
* Remove the logic that instantiates a custom single provider plugin.


A provider multiplexor would be a very simple class, something along the 
following lines:

{code}
public class MultiShuffleProviderPlugin implements ShuffleProviderPlugin {
  public static final String PLUGIN_CLASSES =
      "hadoop.mapreduce.multi.shuffle.provider.classes";

  private ShuffleProviderPlugin[] plugins;

  public void initialize(TaskTracker tt) {
    Configuration conf = tt.getJobConf();
    Class<?>[] klasses = conf.getClasses(PLUGIN_CLASSES,
        DefaultShuffleProviderPlugin.class);
    //LOG INFO list of plugin classes
    plugins = new ShuffleProviderPlugin[klasses.length];
    for (int i = 0; i < klasses.length; i++) {
      plugins[i] = (ShuffleProviderPlugin) ReflectionUtils.newInstance(klasses[i], conf);
    }
    for (ShuffleProviderPlugin plugin : plugins) {
      plugin.initialize(tt);
    }
  }

  public void destroy() {
    if (plugins != null) {
      for (ShuffleProviderPlugin plugin : plugins) {
        try {
          plugin.destroy();
        } catch (Throwable ex) {
          //LOG WARN and ignore exception
        }
      }
    }
  }
}
{code}

And the default provider class would be:

{code}
public static class DefaultShuffleProviderPlugin implements
    ShuffleProviderPlugin {

  public void initialize(TaskTracker tt) {
    tt.getHttpServer().addInternalServlet("mapOutput", "/mapOutput",
        MapOutputServlet.class);
  }

  public void destroy() {
  }
}
{code}



[jira] [Updated] (MAPREDUCE-4278) cannot run two local jobs in parallel from the same gateway.

2013-01-17 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4278:
-

Fix Version/s: 0.23.7

merged to branch-0.23

 cannot run two local jobs in parallel from the same gateway.
 

 Key: MAPREDUCE-4278
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4278
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.205.0
Reporter: Araceli Henley
Assignee: Sandy Ryza
 Fix For: 1.2.0, 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4278-2-branch1.patch, 
 MAPREDUCE-4278-3-branch1.patch, MAPREDUCE-4278-branch1.patch, 
 MAPREDUCE-4278-trunk.patch, MAPREDUCE-4278-trunk.patch


 I cannot run two local mode jobs from Pig in parallel from the same gateway, 
 which is a typical use case. If I re-run the tests sequentially, then the 
 tests pass. This seems to be a problem in Hadoop.
 Additionally, the Pig harness expects to be able to run 
 Pig-version-undertest against Pig-version-stable from the same gateway.
 To replicate the error:
 I have two clusters running from the same gateway.
 If I run the Pig regression suite nightly.conf in local mode in parallel - 
 once on each cluster - conflicts in M/R local mode result in failures in the 
 tests. 
 ERROR1:
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
 output/file.out in any of the configured local directories
 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
 	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
 	at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
 	at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
 	at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
 	at org.apache.hadoop.mapred.Task.done(Task.java:875)
 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
 ---
 ERROR2:
 2012-05-17 20:25:36,762 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
 2012-05-17 20:25:36,778 [Thread-3] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@ffa490e
 2012-05-17 20:25:36,837 [Thread-3] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
 java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
 	at java.util.ArrayList.get(ArrayList.java:322)
 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:153)
 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
 2012-05-17 20:25:41,291 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556536#comment-13556536
 ] 

Chris Douglas commented on MAPREDUCE-4808:
--

Asokan, the concern is that breaking an API, even if it's marked unstable, 
is an incompatible change. Since the pluggable shuffle is particularly useful 
for frameworks, breaking this contract could require 
patching/validation/rewrite of plugin and optimizer code in projects that 
invest in it (Hive, Pig, etc.). Moreover, if we wanted to change the default 
{{Shuffle}} to a different implementation, then user/framework code would 
perform badly - or break - unless we exposed this implementation-specific 
mechanism in the _new_ impl. So it's fair to press for use cases, to ensure 
the API is _sufficient_ and that the abstraction could apply to most {{Shuffle}} 
implementations.

Personally, I'm ambivalent about exposing this as an API and am +1 on the patch 
overall (mostly because I like the {{MapOutput}} refactoring). The user can 
always configure the current {{Shuffle}}, which is exactly how frameworks would 
handle this until they port/specialize their efficient {{MergeManager}} plugin.

As a compromise, would it make sense to just add a protected 
{{createMergeManager}} method to the {{Shuffle}}? The user still needs to 
configure their custom {{Shuffle}} impl now, but that's better than the 
inevitable future where they configure both. It also makes its tie to this 
implementation explicit.
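The compromise above could look roughly like the following sketch (simplified, hypothetical class bodies; not the actual mapreduce code): {{Shuffle}} exposes a protected factory method, and a framework's custom {{Shuffle}} subclass overrides only that method instead of configuring two separate plugins.

```java
// Sketch of a protected createMergeManager() factory hook. MergeManager here
// is a toy stand-in so the shape of the override is visible.
public class MergeFactorySketch {
    interface MergeManager { String name(); }

    static class Shuffle {
        // Subclasses override this single hook to swap the merge strategy.
        protected MergeManager createMergeManager() {
            return () -> "default";  // stock in-memory/on-disk merger
        }
        String run() { return createMergeManager().name(); }
    }

    // A framework's Shuffle subclass, e.g. for hash aggregation / hash join.
    static class HashAggShuffle extends Shuffle {
        @Override protected MergeManager createMergeManager() {
            return () -> "hash-aggregation";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Shuffle().run() + " " + new HashAggShuffle().run());
    }
}
```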



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556580#comment-13556580
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4808:
---

Chris, are you suggesting the following?

* remove the MergeManagerPlugin interface
* introduce a protected createMergeManager() in the Shuffle class to 
instantiate (via new) & initialize the existing MergeManager.






[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556595#comment-13556595
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Arun,
  I will think about your suggestion to make the Merger class pluggable and 
post my findings for different use cases.

-- Asokan




[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556600#comment-13556600
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Chris,
  I will work on creating a real working plugin for the use cases to show that 
the proposed API is sufficient to handle them.

-- Asokan



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556602#comment-13556602
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Alejandro,
  If the MergeManagerPlugin is to be removed, it should be possible for an 
external implementation to extend the framework's MergeManager.

-- Asokan




[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556606#comment-13556606
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Alejandro,
  I meant to ask whether it is okay to make the existing MergeManager 
extendable.

-- Asokan




[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556705#comment-13556705
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Arun,
  I will try to explain a simple use case of an external implementation of 
merge on the reduce side.  Let us say this merge implementation has some fixed 
area of memory (Java byte array) allocated to store the shuffled data.  This 
may be done to avoid frequent garbage collection by JVM or for better processor 
cache efficiency.

Looking at the methods in the {{Merge}} class, they either accept input to the 
merge as disk files (an array of {{Path}} objects) or as memory segments (a 
list of {{Segment}} objects).  The former is not suitable since the merge is 
done in memory first and any intermediate merged output file is under the 
control of the plugin implementation.  The latter is not suitable because 
memory for the shuffled data is not under the control of the plugin 
implementation.

Ideally, if an {{InputStream}} object is available, the external implementation 
can read shuffled data from the stream to the fixed area of memory at a 
specific offset in the byte array.

With the {{MergeManagerPlugin}}, the external implementation will get the HTTP 
connection's {{InputStream}} object via the {{shuffle()}} method of the 
{{MapOutput}} object.  In addition, if the merge goes through multiple passes 
because the memory area is limited in size, there should be some way for the 
{{Shuffle}} to wait until memory is released by a merge pass.  There is no 
method in {{Merge}} for that either.

I find that it is possible to define the interaction points between current 
{{Shuffle}} and {{MergeManager}} using the {{MergeManagerPlugin}} interface.  
The plugin interface has only three methods and it allows the external plugin 
to have a lot of freedom in its implementation.  As a side effect, the 
{{MapOutput}} is also refactored.

Hope I explained this well.  If you have any questions, please let me know.

-- Asokan
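The fixed-memory reading described above can be sketched in plain Java. This is an illustrative sketch only; {{FixedAreaReader}} and {{readFully}} are hypothetical names, not part of the patch:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: an external merge implementation with a fixed,
// pre-allocated memory area copies shuffled data from the HTTP
// connection's InputStream into its byte array at a chosen offset.
class FixedAreaReader {
    // Read exactly len bytes from in into buf, starting at offset.
    static void readFully(InputStream in, byte[] buf, int offset, int len)
            throws IOException {
        int read = 0;
        while (read < len) {
            int n = in.read(buf, offset + read, len - read);
            if (n < 0) {
                throw new IOException("unexpected end of stream");
            }
            read += n;
        }
    }
}
```

With a method like this, each merge pass reads map outputs into the fixed area and frees the region when the pass completes, which is also where the "wait until memory is released" hook discussed above would attach.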





[jira] [Updated] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4808:
--

Attachment: MR-4808.patch

I've taken the liberty to tweak the patch a bit based on the last comments in 
the JIRA.

* removed pluggability via config of the MergeManager
* Shuffle has a protected createMergeManager() method
* MergeManager is annotated as Private
* Kept MergeManagerPlugin interface
* Removed MergeManagerPlugin.Context
* MergeManagerPlugin interface annotated as Private

These changes avoid having an extra knob (the MergeManager class) in the 
config and keep the MergeManager owned by the Shuffle class. The interface 
still allows alternate implementations such as Jerry's and Asokan's.

Asokan, Arun, Chris?
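The protected {{createMergeManager()}} method follows the factory-method pattern. A minimal self-contained sketch of that pattern (all type names here are simplified illustrations, not the actual Hadoop classes):

```java
// Simplified sketch of the factory-method approach: Shuffle owns its
// MergeManager, but a subclass can substitute an alternate implementation
// by overriding the protected creator. All types here are illustrative.
interface MergeManagerSketch {
    String name();
}

class ShuffleSketch {
    private final MergeManagerSketch mergeManager = createMergeManager();

    // Subclasses override this to plug in their own merge implementation.
    protected MergeManagerSketch createMergeManager() {
        return new MergeManagerSketch() {
            public String name() { return "default"; }
        };
    }

    public String mergeManagerName() {
        return mergeManager.name();
    }
}

class CustomShuffleSketch extends ShuffleSketch {
    @Override
    protected MergeManagerSketch createMergeManager() {
        return new MergeManagerSketch() {
            public String name() { return "custom"; }
        };
    }
}
```

This keeps the merge implementation out of the config entirely: substituting it requires subclassing the shuffle, which is exactly the "owned by the Shuffle class" property described above.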




[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556738#comment-13556738
 ] 

Chris Douglas commented on MAPREDUCE-4808:
--

+1 Looked through it; the latest patch lgtm. Asokan, is that sufficient for 
your use cases? Arun?

_Very_ minor, optional nit: {{s/MergeManager/MergeManagerImpl/}} and 
{{s/MergeManagerPlugin/MergeManager/}}. There's an argument to be made for 
doing the same with the {{ShuffleScheduler}} while we're at it, but neither of 
these are blocking, IMO.




[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-17 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556749#comment-13556749
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
-

Hi Chris,
  Thanks for your quick feedback.  I looked at the patch.  It has one minor 
nit: the {{createMergeManager}} method should take a 
{{ShuffleConsumerPlugin.Context}} object.  I will go over it one more time, 
work out the change, run tests, and post the patch shortly.

Thanks.

-- Asokan





[jira] [Commented] (MAPREDUCE-4923) Add toString method to TaggedInputSplit

2013-01-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556772#comment-13556772
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4923:
---

+1. Please open another jira for the annotation changes, thx

 Add toString method to TaggedInputSplit
 ---

 Key: MAPREDUCE-4923
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4923
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, task
Affects Versions: 1.1.1, 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor
 Attachments: MAPREDUCE-4923-branch-1.patch, MAPREDUCE-4923.patch


 Per MAPREDUCE-3678, map task logs now contain information about the input 
 split being processed.  Because TaggedInputSplit has no overridden toString 
 method, nothing useful gets printed out.
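A sketch of the kind of override in question (the class and field names here are illustrative, not the actual TaggedInputSplit internals): delegate {{toString}} to the wrapped split so the log line carries useful detail.

```java
// Illustrative sketch: a wrapping split whose toString delegates to the
// wrapped split, so map task logs print something useful instead of the
// default Object representation. Names are hypothetical.
class WrappedSplitSketch {
    private final Object inputSplit;          // the wrapped InputSplit
    private final Class<?> inputFormatClass;  // format used to read it

    WrappedSplitSketch(Object inputSplit, Class<?> inputFormatClass) {
        this.inputSplit = inputSplit;
        this.inputFormatClass = inputFormatClass;
    }

    @Override
    public String toString() {
        return "WrappedSplitSketch{inputSplit=" + inputSplit
                + ", inputFormatClass=" + inputFormatClass.getName() + "}";
    }
}
```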



[jira] [Commented] (MAPREDUCE-4929) mapreduce.task.timeout is ignored

2013-01-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556774#comment-13556774
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4929:
---

It looks good to me, but before committing it, what is the precedence behavior 
in trunk?  I want to make sure we have the same behavior.

 mapreduce.task.timeout is ignored
 -

 Key: MAPREDUCE-4929
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4929
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4929-branch-1.patch


 In MR1, only mapred.task.timeout works.  Both should be made to work.



[jira] [Commented] (MAPREDUCE-4929) mapreduce.task.timeout is ignored

2013-01-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556795#comment-13556795
 ] 

Sandy Ryza commented on MAPREDUCE-4929:
---

It doesn't exactly appear that there is precedence behavior in trunk.  When 
Configuration#set() is called for a config with deprecations, all the 
corresponding configs are set.  So if we came across mapred.task.timeout first 
in a config file, both mapred.task.timeout and mapreduce.task.timeout would get 
set.  Then if we came across mapreduce.task.timeout afterwards, both would get 
overridden.
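The write-through behavior described above can be illustrated with a toy config class (this mimics the described behavior only; it is not Hadoop's actual Configuration code):

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of deprecated-key write-through: setting either the old
// or the new key updates both, so whichever appears last in a config file
// wins. Not Hadoop's actual Configuration implementation.
class ToyConf {
    private final Map<String, String> props = new HashMap<>();
    private final Map<String, String> aliases = new HashMap<>();

    // Declare oldKey as a deprecated alias of newKey (both directions).
    void addDeprecation(String oldKey, String newKey) {
        aliases.put(oldKey, newKey);
        aliases.put(newKey, oldKey);
    }

    void set(String key, String value) {
        props.put(key, value);
        String other = aliases.get(key);
        if (other != null) {
            props.put(other, value);   // write through to the alias
        }
    }

    String get(String key) { return props.get(key); }
}
```

Reading a file that sets mapred.task.timeout and then mapreduce.task.timeout therefore leaves both keys with the second value, matching the last-one-wins behavior Sandy describes.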




[jira] [Commented] (MAPREDUCE-4923) Add toString method to TaggedInputSplit

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556822#comment-13556822
 ] 

Hudson commented on MAPREDUCE-4923:
---

Integrated in Hadoop-trunk-Commit #3258 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3258/])
MAPREDUCE-4923. Add toString method to TaggedInputSplit. (sandyr via tucu) 
(Revision 1434993)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434993
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TaggedInputSplit.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/TaggedInputSplit.java





[jira] [Updated] (MAPREDUCE-4923) Add toString method to TaggedInputSplit

2013-01-17 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4923:
--

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
   1.2.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Sandy. Committed to trunk, branch-1 and branch-2.




[jira] [Updated] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4911:
--

Attachment: MAPREDUCE-4911.patch

Added JobConf configuration for node-level aggregation.
This patch also includes tests for the changes.

 Add node-level aggregation flag feature(setLocalAggregation(boolean)) to 
 JobConf
 

 Key: MAPREDUCE-4911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4911.patch


 This JIRA adds a node-level aggregation flag feature 
 (setLocalAggregation(boolean)) to JobConf.
 This task is a subtask of MAPREDUCE-4502.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4911:
--

 Target Version/s: trunk
Affects Version/s: trunk




[jira] [Work started] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4911 started by Tsuyoshi OZAWA.

 This task is subtask of MAPREDUCE-4502.



[jira] [Updated] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4911:
--

Status: Patch Available  (was: In Progress)




[jira] [Work started] (MAPREDUCE-4863) Adding aggregationWaitMap for node-level combiner.

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4863 started by Tsuyoshi OZAWA.

 Adding aggregationWaitMap for node-level combiner.
 --

 Key: MAPREDUCE-4863
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4863
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: 
 0002-Adding-aggregationWaitMap-for-node-level-combiner.patch


 To manage node/rack-level combining, MRAppMaster needs to keep management 
 information about the outputs of completed MapTasks to be aggregated.  
 AggregationWaitMap is used so that MRAppMaster can decide whether or not 
 MapTasks should start to combine local MapOutputFiles.
 AggregationWaitMap is an abstraction over a ConcurrentHashMap<String, 
 ArrayList<TaskAttemptCompletionEvent>>.  These events identify the candidate 
 files to be aggregated.
 When MapTasks are completed, MRAppMaster buffers each 
 TaskAttemptCompletionEvent into the AggregationWaitMap to delay reducers' 
 fetching of outputs from mappers until node-level aggregation is finished.  
 After node-level aggregation, MRAppMaster writes back 
 mapAttemptCompletionEvents to restart reducers' fetching of outputs from 
 mappers.
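The per-node buffering described above can be sketched in plain Java (the event type is simplified to a String here; the real map would hold TaskAttemptCompletionEvent objects, and the class name carries a Sketch suffix to mark it as an illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the described AggregationWaitMap: completion events are
// buffered per node until node-level aggregation finishes, then drained
// so the AM can write them back and reducers resume fetching.
class AggregationWaitMapSketch {
    private final ConcurrentHashMap<String, ArrayList<String>> waiting =
            new ConcurrentHashMap<>();

    // Buffer a completed map's event under its node until aggregation runs.
    void buffer(String node, String completionEvent) {
        waiting.computeIfAbsent(node, n -> new ArrayList<>()).add(completionEvent);
    }

    // After node-level aggregation, drain the node's events so reducers
    // can restart fetching outputs from mappers.
    List<String> drain(String node) {
        List<String> events = waiting.remove(node);
        return events != null ? events : new ArrayList<>();
    }
}
```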



[jira] [Work started] (MAPREDUCE-4864) Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4864 started by Tsuyoshi OZAWA.

 Adding new umbilical protocol RPC, getAggregationTargets(), for node-level 
 combiner.
 --

 Key: MAPREDUCE-4864
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4864
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mrv2, tasktracker
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: 
 0001-Adding-new-umbilical-protocol-RPC-getAggregationTarg-20130116.patch, 
 0001-Adding-new-umbilical-protocol-RPC-getAggregationTarg.patch


 MapTasks need to know whether or not they should start a node-level combiner 
 against the outputs of the mappers on their node. 
 The new umbilical RPC, getAggregationTargets(), is used to get the outputs to 
 be aggregated on the node. The definition is as follows:
 AggregationTarget getAggregationTargets(TaskAttemptID aggregator) throws 
 IOException;
 AggregationTarget is an abstraction over an array of TaskAttemptIDs to be 
 aggregated.
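A minimal shape for the proposed RPC (IDs simplified to strings; the quoted signature above uses TaskAttemptID and AggregationTarget, and the names here are illustrative):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed umbilical call: given the aggregator attempt, the
// AM returns the attempts whose local outputs it should combine. IDs are
// simplified to strings for illustration.
interface UmbilicalSketch {
    List<String> getAggregationTargets(String aggregatorAttemptId) throws IOException;
}

class StubUmbilical implements UmbilicalSketch {
    @Override
    public List<String> getAggregationTargets(String aggregatorAttemptId)
            throws IOException {
        // A real AM would look up completed map attempts on the same node.
        return Arrays.asList("attempt_m_000001", "attempt_m_000002");
    }
}
```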



[jira] [Updated] (MAPREDUCE-4864) Adding new umbilical protocol RPC, getAggregationTargets(), for node-level combiner.

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4864:
--

Affects Version/s: trunk




[jira] [Work started] (MAPREDUCE-4910) Adding AggregationWaitMap to some components(MRAppMaster, TaskAttemptListener, JobImpl, MapTaskImpl).

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4910 started by Tsuyoshi OZAWA.

 Adding AggregationWaitMap to some components(MRAppMaster, 
 TaskAttemptListener, JobImpl, MapTaskImpl).
 -

 Key: MAPREDUCE-4910
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4910
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mrv2, task
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: 
 0003-Adding-AggregationWaitMap-to-some-components-MRAppMa.patch, 
 0004-Add-AggregationWaitMap-to-some-components-MRAppMaste.patch


 To implement MR-4502, AggregationWaitMap needs to be used by several 
 components (MRAppMaster, TaskAttemptListener, JobImpl, MapTaskImpl).



[jira] [Work started] (MAPREDUCE-4865) Launching node-level combiner at the end stage of MapTask and ignoring aggregated inputs at ReduceTask

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4865 started by Tsuyoshi OZAWA.

 Launching node-level combiner at the end stage of MapTask and ignoring 
 aggregated inputs at ReduceTask
 --

 Key: MAPREDUCE-4865
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4865
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: 
 0003-Changed-Mappers-and-Reducers-to-support-Node-level-aggregation-20130116.patch,
  0004-Changed-Mappers-and-Reducers-to-support-Node-level-a.patch


 MapTask needs to start node-level aggregation against its local outputs at 
 the end stage of the MapTask, after calling getAggregationTargets().
 This feature is implemented with Merger and CombinerRunner.



[jira] [Work started] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2013-01-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4502 started by Tsuyoshi OZAWA.

 Multi-level aggregation with combining the result of maps per node/rack
 ---

 Key: MAPREDUCE-4502
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: design_v2.pdf, MAPREDUCE-4525-pof.diff, 
 speculative_draft.pdf


 The shuffle cost is expensive in Hadoop in spite of the combiner, because the 
 scope of combining is limited to a single MapTask. 
 To solve this problem, a good approach is to aggregate the results of maps 
 per node/rack by launching a combiner.
 This JIRA is to implement the multi-level aggregation infrastructure, 
 including combining per container (MAPREDUCE-3902 is related) and 
 coordinating containers through the application master without breaking the 
 fault tolerance of jobs.



[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2013-01-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556986#comment-13556986
 ] 

Andrew Purtell commented on MAPREDUCE-4495:
---

For the benefit of those coming to this issue now, was this moved to the Oozie 
project? Was the yapp proposal submitted to the incubator? What is the 
current status of this? Is the code/design on this issue orphaned/dead?

 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang
 Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
 MapReduceWorkflowAM.pdf, yapp_proposal.txt


 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer consumed resources, since only one application master will be 
 spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems 
 like Pig and Hive to provide an optimized way of running their workflows.



[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2013-01-17 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557002#comment-13557002
 ] 

Bo Wang commented on MAPREDUCE-4495:


Hi Andrew,

Thanks for looking at this issue. Currently this issue hasn't been moved to 
Oozie, and I don't think the yapp proposal has been submitted to the incubator 
either. In terms of implementation, a prototype based on the v2 design in 
the document is finished. 




[jira] [Commented] (MAPREDUCE-4911) Add node-level aggregation flag feature(setLocalAggregation(boolean)) to JobConf

2013-01-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13557003#comment-13557003
 ] 

Hadoop QA commented on MAPREDUCE-4911:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12565437/MAPREDUCE-4911.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
12 warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapred.TestYARNRunner

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3247//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3247//console

This message is automatically generated.

 Add node-level aggregation flag feature(setLocalAggregation(boolean)) to 
 JobConf
 

 Key: MAPREDUCE-4911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: client
Affects Versions: trunk
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4911.patch


 This JIRA adds a node-level aggregation flag, setLocalAggregation(boolean), 
 to JobConf.
 This task is a subtask of MAPREDUCE-4502.
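For context, the effect such a flag would enable can be illustrated outside Hadoop: combining map outputs per node before the shuffle shrinks the data sent to reducers. This is a plain-Java toy of that idea, not code from the patch:

```java
import java.util.*;

/** Illustrative sketch (not Hadoop code) of node-level aggregation:
 *  combining map-output records locally reduces what the shuffle moves. */
public class LocalAggSketch {

    /** Collapse a node's map-output keys into per-key counts. */
    static Map<String, Integer> aggregate(List<String> mapOutputKeys) {
        Map<String, Integer> combined = new TreeMap<>();
        for (String k : mapOutputKeys)
            combined.merge(k, 1, Integer::sum);  // sum counts locally
        return combined;
    }

    public static void main(String[] args) {
        // 6 map-output records collapse to 3 shuffled records
        System.out.println(aggregate(Arrays.asList("a", "b", "a", "c", "b", "a")));
        // {a=3, b=2, c=1}
    }
}
```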



[jira] [Updated] (MAPREDUCE-4944) Backport YARN-40 to support listClusterNodes and printNodeStatus in command line tool

2013-01-17 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated MAPREDUCE-4944:
-

 Target Version/s: 1.2.0
Affects Version/s: 1.1.1
 Hadoop Flags: Incompatible change

 Backport YARN-40 to support listClusterNodes and printNodeStatus in command 
 line tool
 -

 Key: MAPREDUCE-4944
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4944
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.1.1
Reporter: Binglin Chang
Priority: Minor

  Supporting listClusterNodes and printNodeStatus in the command line tool is 
  useful for admins creating automation tools; it can also be used by 
  MAPREDUCE-4900 to get TaskTracker names so TT slots can be set dynamically.



[jira] [Commented] (MAPREDUCE-4944) Backport YARN-40 to support listClusterNodes and printNodeStatus in command line tool

2013-01-17 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13557014#comment-13557014
 ] 

Binglin Chang commented on MAPREDUCE-4944:
--

I looked into the code and found some issues with this backport:
hadoop-1.x has similar commands, -list-active-trackers and 
-list-blacklisted-trackers, which just print tracker names. They use 
JobSubmissionProtocol to talk to the JobTracker, and there is no more information 
we can expose except by adding another protocol method to JobSubmissionProtocol; 
this will break compatibility, which I think is unacceptable for 
JobSubmissionProtocol, since it is used by normal clients. 
Another option is to add this to AdminOperationsProtocol (like MAPREDUCE-4900); it 
is an admin protocol, which I think has less strict compatibility requirements, but 
I still don't think it's worth breaking compatibility.
Another option is JMX. I haven't found the proper way to write a command 
line tool using JMX, which needs to connect to a specific port on the JobTracker 
host; that information is hard to get, because it is not in mapred-site.xml but 
depends on some JVM config in hadoop-env.sh.
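The JMX option mentioned above boils down to reading MBean attributes over an MBeanServerConnection. The sketch below queries the in-process platform server so it is self-contained; a real CLI tool would instead build a JMXServiceURL such as service:jmx:rmi:///jndi/rmi://&lt;jobtracker-host&gt;:&lt;jmx-port&gt;/jmxrmi and open it with JMXConnectorFactory.connect(), which is exactly where the hard-to-discover port from hadoop-env.sh comes in:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

/** Sketch of the JMX approach: read MBean attributes generically by name.
 *  Uses the local platform MBean server; a remote tool would obtain the
 *  connection via JMXConnectorFactory instead. */
public class JmxQuerySketch {
    public static void main(String[] args) throws Exception {
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        ObjectName runtime = new ObjectName("java.lang:type=Runtime");
        // Any exposed attribute can be read without compile-time knowledge of it:
        Object vmName = conn.getAttribute(runtime, "VmName");
        System.out.println("VmName = " + vmName);
    }
}
```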




[jira] [Commented] (MAPREDUCE-4944) Backport YARN-40 to support listClusterNodes and printNodeStatus in command line tool

2013-01-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13557031#comment-13557031
 ] 

Junping Du commented on MAPREDUCE-4944:
---

I prefer to break AdminOperationsProtocol if we have to break compatibility. But 
another option could be adding a separate protocol for slot get/set. I think 
manipulating map/reduce slots on a TT is a very useful feature, especially in a 
shared environment (like an HBase region server living alongside a TT), so it may 
deserve its own protocol.
