[jira] Commented: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.

2010-10-18 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922037#action_12922037
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1809:
--

I ran test-patch manually to make sure that the patch is OK.
{noformat} 
 +1 overall.

 +1 @author.  The patch does not contain any @author tags.

 +1 tests included.  The patch appears to include 19 new or modified tests.

 +1 javadoc.  The javadoc tool did not generate any warning messages.

 +1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

 +1 findbugs.  The patch does not introduce any new Findbugs warnings.

 +1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 +1 system tests framework.  The patch passed system tests framework 
compile.
{noformat} 

 Ant build changes for Streaming system tests in contrib projects.
 -

 Key: MAPREDUCE-1809
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: build
Affects Versions: 0.21.0
Reporter: Vinay Kumar Thota
Assignee: Vinay Kumar Thota
 Fix For: 0.21.1

 Attachments: 1809-ydist-security.patch, 1809-ydist-security.patch, 
 MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, 
 MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, 
 MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch


 Implementing a new target (test-system) in the build-contrib.xml file for executing 
 the system tests that are in contrib projects. Also adding a 'subant' target in 
 aop.xml that calls the build-contrib.xml file for system tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-2138) Gridmix tests with different time interval mr traces (1min, 3min and 5min).

2010-10-18 Thread Vinay Kumar Thota (JIRA)
Gridmix tests with different time interval mr traces (1min, 3min and 5min).
---

 Key: MAPREDUCE-2138
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2138
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: test
Reporter: Vinay Kumar Thota
Assignee: Vinay Kumar Thota


1. Generate input data based on cluster size and create the synthetic jobs by 
using the 1 min folded MR trace and
submit the jobs with below arguments.

GRIDMIX_JOB_TYPE = LoadJob
GRIDMIX_USER_RESOLVER = SubmitterUserResolver
GRIDMIX_SUBMISSION_POLICY = STRESS
Input Size = 400 MB * No. of nodes in cluster.
TRACE_FILE = 1 min folded trace.
Verify each job status and summary (QueueName, UserName, StartTime, FinishTime, 
maps, reducers, counters, etc.) after
completion of execution.

2. Generate input data based on cluster size and create the synthetic jobs by 
using the 3 min folded MR trace and
submit the jobs with below arguments.

GRIDMIX_JOB_TYPE = LoadJob
GRIDMIX_USER_RESOLVER = RoundRobinUserResolver
GRIDMIX_SUBMISSION_POLICY = Replay
Input Size = 200 MB * No. of nodes in cluster.
TRACE_FILE = 3 min folded trace.
PROXY_USERS = proxy users file path.
Verify each job status, submitted user and summary (QueueName, UserName, 
StartTime, FinishTime, maps, reducers,
counters, etc.) after completion of execution.

3. Generate input data based on cluster size and create the synthetic jobs by 
using the 5 min folded MR trace and
submit the jobs with below arguments.

GRIDMIX_JOB_TYPE = SleepJob
GRIDMIX_USER_RESOLVER = EchoUserResolver
GRIDMIX_MIN_FILE = 100 MB
GRIDMIX_SUBMISSION_POLICY = Serial
Input Size = 300 MB * No. of nodes in cluster.
TRACE_FILE = 5 min folded trace.
Verify each job status, file size and summary (QueueName, UserName, StartTime, 
FinishTime, maps, reducers, counters,
etc.) after completion of execution.
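
The three scenarios above differ only in a handful of parameters, so they can be tabulated. The following is a minimal Java sketch, not Gridmix code: the class GridmixScenario and its field names are hypothetical, and 1 MB is assumed to mean 1024 * 1024 bytes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for one scenario; not part of Gridmix itself. */
final class GridmixScenario {
    final String jobType;
    final String userResolver;
    final String submissionPolicy;
    final long inputMbPerNode;
    final String trace;

    GridmixScenario(String jobType, String userResolver, String submissionPolicy,
                    long inputMbPerNode, String trace) {
        this.jobType = jobType;
        this.userResolver = userResolver;
        this.submissionPolicy = submissionPolicy;
        this.inputMbPerNode = inputMbPerNode;
        this.trace = trace;
    }

    /** Input Size = per-node MB * number of nodes (assumes 1 MB = 1024 * 1024 bytes). */
    long inputSizeBytes(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        return inputMbPerNode * 1024L * 1024L * nodeCount;
    }

    /** The three scenarios listed in this issue, in order. */
    static List<GridmixScenario> scenarios() {
        return Arrays.asList(
            new GridmixScenario("LoadJob", "SubmitterUserResolver", "STRESS", 400,
                                "1 min folded trace"),
            new GridmixScenario("LoadJob", "RoundRobinUserResolver", "Replay", 200,
                                "3 min folded trace"),
            new GridmixScenario("SleepJob", "EchoUserResolver", "Serial", 300,
                                "5 min folded trace"));
    }
}
```

For a 10-node cluster, scenarios().get(0).inputSizeBytes(10) gives the total input size for the first scenario in bytes.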






[jira] Created: (MAPREDUCE-2139) Gridmix test for submitting jobs with different traces and runtime options.

2010-10-18 Thread Vinay Kumar Thota (JIRA)
Gridmix test for submitting jobs with different traces and runtime options.
---

 Key: MAPREDUCE-2139
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2139
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: test
Reporter: Vinay Kumar Thota
Assignee: Vinay Kumar Thota


1. Generate input data based on cluster size and create the synthetic jobs by 
using the 7 min folded MR trace and submit
the jobs with the arguments below.

GRIDMIX_JOB_TYPE = SleepJob
GRIDMIX_USER_RESOLVER = SubmitterUserResolver
GRIDMIX_MIN_FILE = 200 MB
GRIDMIX_SUBMISSION_POLICY = STRESS
GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = false
Input Size = 400 MB * No. of nodes in cluster.
TRACE_FILE = 7 min folded trace.
Verify each job status and summary (QueueName, UserName, StartTime, FinishTime, 
maps, reducers, counters, etc.) after
completion of execution. Make sure the queue name is the default queue name.

2. Generate input data based on cluster size and create the synthetic jobs by 
using the 10 min folded MR trace and
submit the jobs with below arguments.

GRIDMIX_JOB_TYPE = SleepJob
GRIDMIX_USER_RESOLVER = SubmitterUserResolver
GRIDMIX_MIN_FILE = 200 MB
GRIDMIX_SUBMISSION_POLICY = STRESS
GRIDMIX_JOB_SUBMISSION_QUEUE_IN_TRACE = false
SLEEPJOB_MAPTASK_ONLY = true
GRIDMIX_SLEEP_MAX_MAP_TIME = 10 sec
Input Size = 250 MB * No. of nodes in cluster.
TRACE_FILE = 10 min folded trace.
Verify each job status and summary (QueueName, UserName, StartTime, FinishTime, maps, 
counters, etc.) after completion of
execution. Make sure the number of reducers is zero.





[jira] Commented: (MAPREDUCE-2084) Deprecated org.apache.hadoop.util package in MapReduce produces deprecations in Common classes in Eclipse

2010-10-18 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922161#action_12922161
 ] 

Tom White commented on MAPREDUCE-2084:
--

Nigel, the individual classes in MapReduce's org.apache.hadoop.util package 
(LinuxMemoryCalculatorPlugin, MemoryCalculatorPlugin, ProcessTree, 
ProcfsBasedProcessTree) are already marked as deprecated. The reason that I 
added a package-info.java class originally was so that I could mark the whole 
package as @InterfaceAudience.Private, and not have the package show up at all 
in the MapReduce Javadoc. Without this file the package appears in the Javadoc 
but is empty, since all the classes are Private themselves. Thinking about it 
more, the better fix is to keep the package-info.java file, but remove the 
@Deprecated annotation, since this is the part that is tripping up Eclipse.
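
To illustrate the distinction discussed above (deprecating individual classes rather than annotating the whole package in package-info.java), here is a small self-contained Java sketch. It does not use Hadoop: PrivateAudience is a hypothetical stand-in for @InterfaceAudience.Private, and ProcessTreeSketch and ToolSketch are hypothetical stand-ins for a deprecated MapReduce class and a non-deprecated Common class.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Hypothetical stand-in for Hadoop's @InterfaceAudience.Private. */
@Retention(RetentionPolicy.RUNTIME)
@interface PrivateAudience {}

/** Stand-in for a MapReduce util class: deprecated on the class itself. */
@Deprecated
@PrivateAudience
class ProcessTreeSketch {}

/** Stand-in for a Common class such as Tool: carries no deprecation. */
class ToolSketch {}

final class DeprecationCheck {
    /** True only if the class itself carries @Deprecated. */
    static boolean isDeprecated(Class<?> c) {
        return c.isAnnotationPresent(Deprecated.class);
    }

    /** True if the class is marked with the stand-in audience annotation. */
    static boolean isPrivate(Class<?> c) {
        return c.isAnnotationPresent(PrivateAudience.class);
    }
}
```

With per-class annotations, only ProcessTreeSketch reports as deprecated; ToolSketch is unaffected, which is the behaviour wanted for the Common classes.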

 Deprecated org.apache.hadoop.util package in MapReduce produces deprecations 
 in Common classes in Eclipse
 -

 Key: MAPREDUCE-2084
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2084
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.21.1

 Attachments: MAPREDUCE-2084.patch


 As reported in [this 
 thread|http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201009.mbox/%3c4c9a0a08.3030...@web.de%3e]
  the classes in org.apache.hadoop.util from the Common JAR, like Tool, are 
 marked as deprecated by Eclipse, even though they are not deprecated. The fix 
 is to mark the individual classes in the MapReduce org.apache.hadoop.util 
 package as deprecated, not the whole package.




[jira] Commented: (MAPREDUCE-2084) Deprecated org.apache.hadoop.util package in MapReduce produces deprecations in Common classes in Eclipse

2010-10-18 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922167#action_12922167
 ] 

Nigel Daley commented on MAPREDUCE-2084:


+1 to just removing @Deprecated


 Deprecated org.apache.hadoop.util package in MapReduce produces deprecations 
 in Common classes in Eclipse
 -

 Key: MAPREDUCE-2084
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2084
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.21.1

 Attachments: MAPREDUCE-2084.patch


 As reported in [this 
 thread|http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201009.mbox/%3c4c9a0a08.3030...@web.de%3e]
  the classes in org.apache.hadoop.util from the Common JAR, like Tool, are 
 marked as deprecated by Eclipse, even though they are not deprecated. The fix 
 is to mark the individual classes in the MapReduce org.apache.hadoop.util 
 package as deprecated, not the whole package.




[jira] Commented: (MAPREDUCE-1592) Generate Eclipse's .classpath file from Ivy config

2010-10-18 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922181#action_12922181
 ] 

Tom White commented on MAPREDUCE-1592:
--

Sorry that this patch got forgotten. I just tried it out and it still works for 
me ({{ant clean compile eclipse}}). The compile step is needed to generate the 
Avro classes in org.apache.hadoop.mapreduce.jobhistory. So I think we should 
commit this, probably to the 0.21 branch and trunk. Similarly for HDFS-1035.

 Generate Eclipse's .classpath file from Ivy config
 --

 Key: MAPREDUCE-1592
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1592
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-1592.patch, MAPREDUCE-1592.patch, 
 MAPREDUCE-1592.patch


 MapReduce companion issue for HADOOP-6407.




[jira] Updated: (MAPREDUCE-2084) Deprecated org.apache.hadoop.util package in MapReduce produces deprecations in Common classes in Eclipse

2010-10-18 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-2084:
-

Attachment: MAPREDUCE-2084.patch

Here's a patch that removes @Deprecated from package-info.java.

 Deprecated org.apache.hadoop.util package in MapReduce produces deprecations 
 in Common classes in Eclipse
 -

 Key: MAPREDUCE-2084
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2084
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.21.1

 Attachments: MAPREDUCE-2084.patch, MAPREDUCE-2084.patch


 As reported in [this 
 thread|http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201009.mbox/%3c4c9a0a08.3030...@web.de%3e]
  the classes in org.apache.hadoop.util from the Common JAR, like Tool, are 
 marked as deprecated by Eclipse, even though they are not deprecated. The fix 
 is to mark the individual classes in the MapReduce org.apache.hadoop.util 
 package as deprecated, not the whole package.




[jira] Created: (MAPREDUCE-2140) Re-generate fair scheduler design doc PDF

2010-10-18 Thread Matei Zaharia (JIRA)
Re-generate fair scheduler design doc PDF
-

 Key: MAPREDUCE-2140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2140
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Trivial
 Fix For: 0.21.1, 0.22.0


The Fair Scheduler contains a design document in 
src/contrib/fairscheduler/designdoc that is included both as a Latex file and 
as a PDF. However, the PDF that's currently there is not generated properly and 
has some question marks for section references. I'd like to regenerate it and 
commit the new one. There is no patch to attach because this just requires 
running pdflatex and committing a binary file.




[jira] Created: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Matei Zaharia (JIRA)
Add an extra data field to Task for use by Mesos
--

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor


In order to support running Hadoop on the Mesos cluster manager 
(http://mesos.berkeley.edu/), I'd like to add an extra String field to the Task 
class to allow extra data (a Mesos task ID) to be associated with each task. 
This should have no impact on normal operation other than making the serialized 
form of Task a few bytes longer. In the Mesos support patch for Hadoop, this 
field is set by a pluggable Hadoop scheduler implementation to allow code on 
the TaskTracker side to see which Mesos task corresponds to each Hadoop task. 




[jira] Commented: (MAPREDUCE-2140) Re-generate fair scheduler design doc PDF

2010-10-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922397#action_12922397
 ] 

dhruba borthakur commented on MAPREDUCE-2140:
-

Sounds good to me. Please attach the newly generated pdf file to this jira for 
reference.

 Re-generate fair scheduler design doc PDF
 -

 Key: MAPREDUCE-2140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2140
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Trivial
 Fix For: 0.21.1, 0.22.0


 The Fair Scheduler contains a design document in 
 src/contrib/fairscheduler/designdoc that is included both as a Latex file and 
 as a PDF. However, the PDF that's currently there is not generated properly 
 and has some question marks for section references. I'd like to regenerate it 
 and commit the new one. There is no patch to attach because this just 
 requires running pdflatex and committing a binary file.




[jira] Updated: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-2141:
-

Attachment: mapreduce-2141-v1.patch

Here is a patch for this issue. It gives the new field package-level access 
through get and set methods, which should be enough for our purposes.

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Updated: (MAPREDUCE-2140) Re-generate fair scheduler design doc PDF

2010-10-18 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-2140:
-

Attachment: fair_scheduler_design_doc.pdf

Sure, here it is.

 Re-generate fair scheduler design doc PDF
 -

 Key: MAPREDUCE-2140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2140
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Trivial
 Fix For: 0.21.1, 0.22.0

 Attachments: fair_scheduler_design_doc.pdf


 The Fair Scheduler contains a design document in 
 src/contrib/fairscheduler/designdoc that is included both as a Latex file and 
 as a PDF. However, the PDF that's currently there is not generated properly 
 and has some question marks for section references. I'd like to regenerate it 
 and commit the new one. There is no patch to attach because this just 
 requires running pdflatex and committing a binary file.




[jira] Updated: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-2141:
-

Status: Patch Available  (was: Open)

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Resolved: (MAPREDUCE-2140) Re-generate fair scheduler design doc PDF

2010-10-18 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia resolved MAPREDUCE-2140.
--

Resolution: Fixed

I've committed the regenerated design doc to trunk and 0.21.1. Thanks for 
taking a look at this, Dhruba.

 Re-generate fair scheduler design doc PDF
 -

 Key: MAPREDUCE-2140
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2140
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Trivial
 Fix For: 0.21.1, 0.22.0

 Attachments: fair_scheduler_design_doc.pdf


 The Fair Scheduler contains a design document in 
 src/contrib/fairscheduler/designdoc that is included both as a Latex file and 
 as a PDF. However, the PDF that's currently there is not generated properly 
 and has some question marks for section references. I'd like to regenerate it 
 and commit the new one. There is no patch to attach because this just 
 requires running pdflatex and committing a binary file.




[jira] Commented: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922409#action_12922409
 ] 

Owen O'Malley commented on MAPREDUCE-2141:
--

I'd suggest using either BytesWritable or Text rather than String. Other than 
that it looks good. Some additional JavaDoc that explained that the field was 
for the scheduler would be helpful too, I think.
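
As a rough illustration of the suggestion above (an opaque byte field instead of a String), here is a minimal Java sketch. It is not the attached patch and does not use Hadoop: TaskSketch is a hypothetical stand-in for Task, and plain java.io streams with a length-prefixed byte array stand in for BytesWritable so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for Task with an opaque extra-data field. */
final class TaskSketch {
    private int partition;                   // stand-in for Task's existing state
    private byte[] extraData = new byte[0];  // set by the scheduler, e.g. a Mesos task ID

    int getPartition() { return partition; }
    void setPartition(int partition) { this.partition = partition; }

    // Package-level accessors; defensive copies keep the field opaque.
    byte[] getExtraData() { return extraData.clone(); }
    void setExtraData(byte[] data) {
        extraData = (data == null) ? new byte[0] : data.clone();
    }

    /** Serialize existing state, then the extra field as length + raw bytes. */
    void write(DataOutput out) throws IOException {
        out.writeInt(partition);
        out.writeInt(extraData.length);
        out.write(extraData);
    }

    /** Read fields back in the same order they were written. */
    void readFields(DataInput in) throws IOException {
        partition = in.readInt();
        byte[] buf = new byte[in.readInt()];
        in.readFully(buf);
        extraData = buf;
    }

    /** Serialize and deserialize in memory, as would happen over the wire. */
    static TaskSketch roundTrip(TaskSketch task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            task.write(new DataOutputStream(bytes));
            TaskSketch copy = new TaskSketch();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the field is raw bytes, no character encoding is involved when it is written or read; an empty field costs only the four-byte length prefix in this sketch.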

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Commented: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922416#action_12922416
 ] 

Todd Lipcon commented on MAPREDUCE-2141:


+1 for BytesWritable instead of String. Encoding is a pain in the butt.

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Commented: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922420#action_12922420
 ] 

Matei Zaharia commented on MAPREDUCE-2141:
--

Thanks for the comments, Owen and Arun. We actually plan to submit Mesos to the 
Apache incubator soon, so maybe it will become Apache software eventually! I'll 
get back to you with an updated patch.

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Commented: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922423#action_12922423
 ] 

Tom White commented on MAPREDUCE-2141:
--

Arun, I'm happy with this (using BytesWritable) too.

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Commented: (MAPREDUCE-2141) Add an extra data field to Task for use by Mesos

2010-10-18 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922433#action_12922433
 ] 

dhruba borthakur commented on MAPREDUCE-2141:
-

+1 on getting this into Apache Hadoop!

 Add an extra data field to Task for use by Mesos
 --

 Key: MAPREDUCE-2141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Matei Zaharia
Assignee: Matei Zaharia
Priority: Minor
 Attachments: mapreduce-2141-v1.patch


 In order to support running Hadoop on the Mesos cluster manager 
 (http://mesos.berkeley.edu/), I'd like to add an extra String field to the 
 Task class to allow extra data (a Mesos task ID) to be associated with each 
 task. This should have no impact on normal operation other than making the 
 serialized form of Task a few bytes longer. In the Mesos support patch for 
 Hadoop, this field is set by a pluggable Hadoop scheduler implementation to 
 allow code on the TaskTracker side to see which Mesos task corresponds to 
 each Hadoop task. 




[jira] Created: (MAPREDUCE-2142) Refactor RaidNode to remove dependence on map reduce

2010-10-18 Thread Patrick Kling (JIRA)
Refactor RaidNode to remove dependence on map reduce


 Key: MAPREDUCE-2142
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2142
 Project: Hadoop Map/Reduce
  Issue Type: Task
Reporter: Patrick Kling


I am refactoring the RaidNode code as follows: The base class RaidNode will 
contain the common functionality needed for raiding files. The derived class 
LocalRaidNode contains an implementation of RaidNode that performs raiding 
locally. The derived class DistRaidNode performs raiding using map reduce jobs. 
This way, only DistRaidNode has a dependency on map reduce code and RaidNode 
and LocalRaidNode can be moved to HDFS.
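
The class layout described above can be sketched roughly as follows. This is an illustration of the structure only, not the actual RaidNode code: the class names carry a Sketch suffix, the method names are hypothetical, and the file-selection rule and return strings are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class: shared raiding logic, no MapReduce dependency. */
abstract class RaidNodeSketch {
    /** Common step shared by all subclasses (placeholder rule: skip temp files). */
    List<String> selectFiles(List<String> candidates) {
        List<String> selected = new ArrayList<>();
        for (String path : candidates) {
            if (!path.endsWith(".tmp")) {
                selected.add(path);
            }
        }
        return selected;
    }

    /** How the selected files are raided is left to the subclass. */
    abstract String raidFiles(List<String> paths);

    /** Template method: common selection, then subclass-specific raiding. */
    String run(List<String> candidates) {
        return raidFiles(selectFiles(candidates));
    }
}

/** Raids in-process; could live alongside the base class in HDFS. */
class LocalRaidNodeSketch extends RaidNodeSketch {
    @Override
    String raidFiles(List<String> paths) {
        return "local:" + paths.size();
    }
}

/** Only this subclass would need MapReduce, to submit a raiding job. */
class DistRaidNodeSketch extends RaidNodeSketch {
    @Override
    String raidFiles(List<String> paths) {
        return "mapreduce-job:" + paths.size();
    }
}
```

With this split, code that only needs the base class or the local variant never references the distributed subclass, which is what would allow the two to move out of the MapReduce project.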




[jira] Commented: (MAPREDUCE-2132) Need a command line option in RaidShell to fix blocks using raid

2010-10-18 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922444#action_12922444
 ] 

Scott Chen commented on MAPREDUCE-2132:
---

Thanks for the good work, Ram. This is really nice.
I only have some minor comments.

{code}
+if (pathStr.contains(RaidNode.HAR_SUFFIX)) {
{code}
Can we use endsWith() to make it more specific?
{code}
+Path indexFile = new Path(harDirectory + "/_index");
{code}
Can we use some constant like HAR_INDEX_FILENAME here?

It seems the current test fixes only the source block.
Could you add a test case that fixes the parity file?
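
To make the two review points concrete, here is a small Java sketch using plain strings. The constant values are assumptions for illustration (the real HAR_SUFFIX lives in RaidNode, and HAR_INDEX_FILENAME is only a proposed name), and HarPathCheck is a hypothetical class.

```java
/** Hypothetical helper contrasting contains() with endsWith() on HAR paths. */
final class HarPathCheck {
    // Assumed values, for illustration only.
    static final String HAR_SUFFIX = "_raid.har";
    static final String HAR_INDEX_FILENAME = "_index";

    /** Matches the suffix anywhere in the path, including in the middle. */
    static boolean containsSuffix(String pathStr) {
        return pathStr.contains(HAR_SUFFIX);
    }

    /** Matches only when the path itself ends with the suffix. */
    static boolean endsWithSuffix(String pathStr) {
        return pathStr.endsWith(HAR_SUFFIX);
    }

    /** Builds the index file path from a named constant, not a string literal. */
    static String indexFile(String harDirectory) {
        return harDirectory + "/" + HAR_INDEX_FILENAME;
    }
}
```

A path such as /raid/dir_raid.har.bak passes the contains() check but not the endsWith() check, which is the extra specificity asked about. Whether endsWith() fits depends on whether pathStr is the archive directory itself or a file inside it.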

 Need a command line option in RaidShell to fix blocks using raid
 

 Key: MAPREDUCE-2132
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2132
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/raid
Reporter: Ramkumar Vadali
Assignee: Ramkumar Vadali
 Fix For: 0.22.0

 Attachments: MAPREDUCE-2132.patch


 RaidShell currently has an option to recover a file and return the path to 
 the recovered file. The administrator can then rename the recovered file to 
 the damaged file.
 The problem with this is that the file metadata is altered, specifically the 
 modification time. Instead we need a way to just repair the damaged blocks 
 and send the fixed blocks to a data node.
 Once this is done, we can put automation around it.
