[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039150#comment-13039150 ]

Harsh J Chouraria commented on MAPREDUCE-2384:
----------------------------------------------

Justification: existing test cases already cover submissions, so the change does not require a new one, IMO.

> Can MR make error response Immediately?
> ---------------------------------------
>
>                 Key: MAPREDUCE-2384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: job submission
>    Affects Versions: 0.21.0
>            Reporter: Denny Ye
>            Assignee: Harsh J Chouraria
>             Fix For: 0.23.0
>         Attachments: MAPREDUCE-2384.r1.diff
>
> While reading the MapReduce source code in Hadoop 0.21.0, I was sometimes confused by how errors are reported. For example:
>
> 1. JobSubmitter checks the output for each job. MapReduce enforces the rule that a job's output must not already exist, to avoid overwriting it by accident. In my opinion, MR should verify the output at the moment the client submits the job. In reality, it first copies the related files to the specified target and only then performs the verification.
>
> 2. JobTracker. Once a job has been submitted to the JobTracker, the JT first creates a JIP (JobInProgress) object, which is very large. Only then does the JT verify the job queue authorization and memory requirements. Normally one would validate the client input first and respond immediately on any failure, performing the regular logic only once all inputs have passed.
>
> This code seems hard to follow. Is it only my personal opinion? I would appreciate it if someone could explain the details. Thanks!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-1347:
-----------------------------------------

    Attachment: MAPREDUCE-1347.r2.diff

Patch that uses Guava's MapMaker.makeComputingMap.

> Missing synchronization in MultipleOutputFormat
> -----------------------------------------------
>
>                 Key: MAPREDUCE-1347
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Harsh J Chouraria
>         Attachments: MAPREDUCE-1347.r2.diff, mapreduce.1347.r1.diff
>
> MultipleOutputFormat's RecordWriter implementation doesn't use synchronization when accessing the recordWriters member. When using multithreaded mappers or reducers, this can result in two threads both trying to create the same file, causing AlreadyBeingCreatedException. Doing this at a finer grain than synchronizing the whole method is probably a good idea, so that multithreaded mappers can actually achieve parallelism when writing to separate output streams. From what I can tell, the new API's MultipleOutputs does not have this issue.
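The idea behind the patch is to build each RecordWriter inside a computing map, which guarantees the writer for a given output name is created exactly once even under concurrent access. A minimal sketch of the same idiom using only the JDK (`ConcurrentHashMap.computeIfAbsent` is the modern equivalent of Guava's `MapMaker.makeComputingMap`); `Writer` and `getWriter` below are illustrative stand-ins, not Hadoop or patch API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the computing-map idiom: the cache itself guarantees that the
// value for a given key is created exactly once, even when several threads
// ask for it concurrently.
public class WriterCache {
    static final class Writer {          // stand-in for a RecordWriter
        final String path;
        Writer(String path) { this.path = path; }
    }

    private final ConcurrentMap<String, Writer> writers = new ConcurrentHashMap<>();

    // computeIfAbsent runs the factory at most once per key; concurrent
    // callers for the same key block until the single created value is ready.
    Writer getWriter(String name) {
        return writers.computeIfAbsent(name, Writer::new);
    }
}
```

Pre-JDK-8, Guava's makeComputingMap provided this same once-per-key guarantee, which is presumably why the patch pulls it in.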
[jira] [Commented] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038623#comment-13038623 ]

Harsh J Chouraria commented on MAPREDUCE-1347:
----------------------------------------------

Note: Guava has been added as a dependency. The thread where this was agreed upon can be found here: http://search-hadoop.com/m/NrnW72tdHRD1/guavasubj=Add+Guava+as+a+dependency+
[jira] [Updated] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-1347:
-----------------------------------------

    Release Note: Fix missing synchronization in MultipleOutputFormat.
          Status: Patch Available  (was: Open)

Tests added for multi-threaded execution of MultipleTextOutputFormat.
[jira] [Updated] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-1347:
-----------------------------------------

    Attachment: MAPREDUCE-1347.r3.diff

The last patch's test case data may not have been sufficient to exercise the changes. Only the test data contents have been changed in this new patch.
[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-----------------------------------------

    Attachment: MAPREDUCE-2384.r1.diff
[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-----------------------------------------

    Fix Version/s: 0.23.0
     Release Note: Submitter should fail on errors early, before transferring files.
           Status: Patch Available  (was: Open)

As before, I do not think refactoring (2) is a good idea maintenance-wise. Here's a patch for just the reordering of (1). Some simple job submissions pass with the change -- I believe existing test cases already cover it, but let me know if not.
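The reordering in the patch amounts to running the output-specification check before any files are staged. A rough sketch of that ordering, assuming hypothetical `checkOutputSpec`/`stageJobFiles` stand-ins for the real JobSubmitter steps (not Hadoop API):

```java
import java.io.File;
import java.io.IOException;

// Illustration of the reordering: validate the job's output specification
// *before* staging any files, so a bad submission fails fast instead of
// copying files to the staging area first and only then being rejected.
public class EarlySubmitCheck {
    static void checkOutputSpec(File outDir) throws IOException {
        if (outDir.exists()) {
            throw new IOException("Output directory " + outDir + " already exists");
        }
    }

    static void stageJobFiles(File stagingDir) {
        stagingDir.mkdirs();  // stands in for the expensive copy step
    }

    static void submit(File outDir, File stagingDir) throws IOException {
        checkOutputSpec(outDir);   // fail fast, before any copying
        stageJobFiles(stagingDir);
    }
}
```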
[jira] [Updated] (MAPREDUCE-2328) memory-related configurations missing from mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2328:
-----------------------------------------

    Status: Open  (was: Patch Available)

Thanks Todd. I'll put up a fixed patch shortly.

> memory-related configurations missing from mapred-default.xml
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-2328
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2328
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Harsh J Chouraria
>              Labels: newbie
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-2328.r1.diff
>
> HADOOP-5881 added new configuration parameters for memory-based scheduling, but they weren't added to mapred-default.xml.
[jira] [Updated] (MAPREDUCE-2328) memory-related configurations missing from mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2328:
-----------------------------------------

    Attachment: MAPREDUCE-2328.r1.diff
[jira] [Assigned] (MAPREDUCE-2328) memory-related configurations missing from mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria reassigned MAPREDUCE-2328:
--------------------------------------------

    Assignee: Harsh J Chouraria
[jira] [Updated] (MAPREDUCE-2328) memory-related configurations missing from mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2328:
-----------------------------------------

    Status: Patch Available  (was: Open)

Patch with a brief description of the options, in mapred-default.xml.
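For illustration, the kind of entries such a patch adds would look like the following. The property names are the memory-based scheduling keys of the HADOOP-5881 era, but the authoritative names, defaults, and wording are those in the attached diff:

```xml
<!-- Illustrative mapred-default.xml entries; see the attached patch for the
     exact set. A value of -1 disables memory-based scheduling. -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>-1</value>
  <description>The size, in megabytes, of the memory allotted to each map
  slot on a tasktracker. -1 indicates that memory-based scheduling is
  disabled.</description>
</property>

<property>
  <name>mapred.job.map.memory.mb</name>
  <value>-1</value>
  <description>The amount of memory, in megabytes, that a single map task
  of the job may request. -1 indicates no limit is enforced.</description>
</property>
```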
[jira] [Assigned] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria reassigned MAPREDUCE-2410:
--------------------------------------------

    Assignee: Harsh J Chouraria

> document multiple keys per reducer oddity in hadoop streaming FAQ
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2410
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2410
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming, documentation
>    Affects Versions: 0.20.2
>            Reporter: Dieter Plaetinck
>            Assignee: Harsh J Chouraria
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.23.0
>         Attachments: MAPREDUCE-2410.r1.diff
>   Original Estimate: 40m
>  Remaining Estimate: 40m
>
> Hi, for a newcomer to Hadoop streaming it comes as a surprise that the reducer receives arbitrary keys, unlike the regular Hadoop API, where a reducer invocation works on a single key. An explanation for this is at http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201103.mbox/browser -- I suggest adding this to the FAQ of Hadoop streaming.
[jira] [Updated] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2410:
-----------------------------------------

    Fix Version/s: 0.23.0
Affects Version/s: 0.20.2
           Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-2383) Improve documentation of DistributedCache methods
[ https://issues.apache.org/jira/browse/MAPREDUCE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2383:
-----------------------------------------

    Attachment: MAPREDUCE-2383.r1.diff

Patch that updates Job's javadocs to reflect the difference.

> Improve documentation of DistributedCache methods
> -------------------------------------------------
>
>                 Key: MAPREDUCE-2383
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2383
>             Project: Hadoop Map/Reduce
>          Issue Type: Task
>          Components: distributed-cache, documentation
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>              Labels: newbie
>             Fix For: 0.23.0
>         Attachments: MAPREDUCE-2383.r1.diff
>
> Users find the various methods in DistributedCache confusing - it's not clearly documented what the difference is between addArchiveToClassPath and addFileToClassPath. We should improve the docs to clarify this and perhaps add an example that uses the DistributedCache.
[jira] [Assigned] (MAPREDUCE-2383) Improve documentation of DistributedCache methods
[ https://issues.apache.org/jira/browse/MAPREDUCE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria reassigned MAPREDUCE-2383:
--------------------------------------------

    Assignee: Harsh J Chouraria
[jira] [Updated] (MAPREDUCE-2383) Improve documentation of DistributedCache methods
[ https://issues.apache.org/jira/browse/MAPREDUCE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2383:
-----------------------------------------

    Fix Version/s: 0.23.0
           Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2410:
-----------------------------------------

    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2410:
-----------------------------------------

    Attachment: MAPREDUCE-2410.r2.diff

Incorporated suggestion (#1) from Dieter. Would adding a link to an external site be a good idea from the ASF POV? I think it's better if that goes to the wiki pages instead. I may be being too paranoid, so let me know :P
[jira] [Commented] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031845#comment-13031845 ]

Harsh J Chouraria commented on MAPREDUCE-2410:
----------------------------------------------

Dieter - I've added some info here: http://wiki.apache.org/hadoop/HadoopStreaming/AlternativeInterfaces
[jira] [Updated] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2410:
-----------------------------------------

    Attachment: MAPREDUCE-2410.r3.diff

Add a wiki page link.
[jira] [Updated] (MAPREDUCE-2410) document multiple keys per reducer oddity in hadoop streaming FAQ
[ https://issues.apache.org/jira/browse/MAPREDUCE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2410:
-----------------------------------------

    Release Note: Add an FAQ entry regarding the differences between Java API and Streaming development of MR programs.
          Status: Patch Available  (was: Open)
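The oddity being documented: the streaming framework still sorts map output by key, but the reducer script receives every sorted `key<TAB>value` line on stdin and must detect key boundaries itself, instead of being handed one key per `reduce()` call as in the Java API. A sketch of that grouping contract (in Java for illustration; real streaming reducers are usually scripts, but the logic is the same):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// What a streaming reducer must do by hand: its input is the sorted stream
// of "key<TAB>value" lines for *all* keys assigned to it, so the script
// detects each key change itself and processes the accumulated group.
public class StreamingGrouper {
    static Map<String, List<String>> group(List<String> sortedLines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : sortedLines) {
            int tab = line.indexOf('\t');
            String key = tab < 0 ? line : line.substring(0, tab);
            String value = tab < 0 ? "" : line.substring(tab + 1);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return groups;
    }
}
```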
[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030471#comment-13030471 ]

Harsh J Chouraria commented on MAPREDUCE-2384:
----------------------------------------------

1. is an easy one to fix (basically moving the job spec check a step up). I have a patch for this in the pipeline.

2. as per the OP, is to not build the JIP until after the config checks. I think it is alright the way it is now, since checking before the JIP is constructed would still require one extra lookup (the config properties being checked are also used elsewhere later). Besides, it's easier to read and maintain with the JIP methods, and I don't think the construction time (a few property loads, some array declarations) amounts to much.

What are your thoughts on 2.? Would we benefit enough to refactor those parts to not use JIP (and construct it only after validity is verified)?
[jira] [Commented] (MAPREDUCE-2423) Monitoring the job tracker ui of hadoop using other open source monitoring tools like Nagios
[ https://issues.apache.org/jira/browse/MAPREDUCE-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030473#comment-13030473 ]

Harsh J Chouraria commented on MAPREDUCE-2423:
----------------------------------------------

Saurabh - You can monitor the JMX metrics Hadoop pushes out via a plugin like check_jmx in Nagios, as Allen pointed out: http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx/details

Would that work for you?

> Monitoring the job tracker ui of hadoop using other open source monitoring tools like Nagios
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2423
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2423
>             Project: Hadoop Map/Reduce
>          Issue Type: Wish
>          Components: jobtracker
>            Reporter: Saurabh Mishra
>            Priority: Trivial
>
> I just wish there were a way I could write monitors to check my Hadoop JobTracker UI using my existing Nagios infrastructure, as this would help me keep everything centrally located and hence within manageable limits.
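As a sketch of how this wiring might look, a Nagios command definition polling the JobTracker JVM's heap over JMX could be written as below. The command name and thresholds are hypothetical, and the `-U`/`-O`/`-A` flags follow the check_jmx plugin's typical documented usage but should be verified against the installed version:

```
# Hypothetical Nagios command definition using the check_jmx plugin.
# $ARG1$ = JMX port of the JobTracker JVM, $ARG2$/$ARG3$ = warn/crit thresholds.
define command {
    command_name    check_jobtracker_heap
    command_line    $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:$ARG1$/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -w $ARG2$ -c $ARG3$
}
```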
[jira] [Commented] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030378#comment-13030378 ]

Harsh J Chouraria commented on MAPREDUCE-1347:
----------------------------------------------

Hrm, I agree and apologize, that was _silly_. Would using a synchronized map solve this?
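On its own, a synchronized map would only make the individual get/put calls atomic; the check-then-create-then-put sequence remains a race, which is why the eventual fix uses a computing map instead. A small pure-JDK demonstration (the barrier deterministically forces the interleaving that MultipleOutputFormat could hit by chance):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// A synchronized map makes each get/put atomic, but not the compound
// "absent? then create then put" sequence: two threads can both pass the
// check and both create a writer for the same key. computeIfAbsent makes
// the whole sequence atomic.
public class CheckThenActRace {
    static int naiveCreations() throws Exception {
        Map<String, Object> map = Collections.synchronizedMap(new HashMap<>());
        AtomicInteger created = new AtomicInteger();
        CyclicBarrier bothChecked = new CyclicBarrier(2);
        Runnable task = () -> {
            try {
                if (!map.containsKey("k")) {   // atomic on its own...
                    bothChecked.await();       // ...but another thread interleaves here
                    created.incrementAndGet();
                    map.put("k", new Object());
                }
            } catch (Exception e) { throw new RuntimeException(e); }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start(); a.join(); b.join();
        return created.get();   // both threads created a writer
    }

    static int atomicCreations() throws Exception {
        ConcurrentMap<String, Object> map = new ConcurrentHashMap<>();
        AtomicInteger created = new AtomicInteger();
        Runnable task = () -> map.computeIfAbsent("k", k -> {
            created.incrementAndGet();
            return new Object();
        });
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start(); a.join(); b.join();
        return created.get();   // factory runs once per key
    }
}
```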
[jira] [Assigned] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria reassigned MAPREDUCE-2384:
--------------------------------------------

    Assignee: Harsh J Chouraria
[jira] [Created] (MAPREDUCE-2474) Add docs to the new API Partitioner on how to access Job Configuration data
Add docs to the new API Partitioner on how to access Job Configuration data --- Key: MAPREDUCE-2474 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2474 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.22.0, 0.23.0 The new API's Partitioner class, unlike the old one, does not extend a Configurable type, and thus needs to carry a tip on how to implement a custom partitioner that must read Job Configuration data to work. Attaching a patch that adds the javadoc fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
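The pattern the javadoc tip describes can be sketched without Hadoop on the classpath. Everything below is a hypothetical, minimal mirror of the real types (`Configuration`, `Configurable`, `Partitioner`), and the `my.partition.salt` key is invented for the example; in real Hadoop it is `ReflectionUtils.newInstance` that calls `setConf` right after instantiating a class that implements `Configurable`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal mirrors of the Hadoop types, illustration only.
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    int getInt(String key, int def) {
        String v = props.get(key);
        return v == null ? def : Integer.parseInt(v);
    }
}

interface Configurable {
    void setConf(Configuration conf);
    Configuration getConf();
}

abstract class Partitioner<K, V> {
    abstract int getPartition(K key, V value, int numPartitions);
}

// A partitioner that needs a setting from the job configuration: it
// implements Configurable so the framework hands it the Configuration
// right after construction.
public class SaltedPartitioner extends Partitioner<String, String> implements Configurable {
    private Configuration conf;
    private int salt;  // read from the (invented) "my.partition.salt" key

    public void setConf(Configuration c) {
        conf = c;
        salt = c.getInt("my.partition.salt", 0);
    }

    public Configuration getConf() { return conf; }

    int getPartition(String key, String value, int numPartitions) {
        // Mask the sign bit so the modulo result is never negative.
        return ((key.hashCode() + salt) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("my.partition.salt", "7");
        SaltedPartitioner p = new SaltedPartitioner();
        p.setConf(conf);  // what the framework would do for us
        int part = p.getPartition("some-key", "some-value", 10);
        System.out.println(part >= 0 && part < 10); // prints true
    }
}
```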
[jira] [Updated] (MAPREDUCE-2474) Add docs to the new API Partitioner on how to access Job Configuration data
[ https://issues.apache.org/jira/browse/MAPREDUCE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2474: - Status: Patch Available (was: Open) Marking as P-A. Add docs to the new API Partitioner on how to access Job Configuration data --- Key: MAPREDUCE-2474 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2474 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Minor Labels: documentation, partitioners Fix For: 0.22.0, 0.23.0 Attachments: MAPREDUCE-2474.r1.diff Original Estimate: 1m Remaining Estimate: 1m The new API's Partitioner class, unlike the old one, does not extend a Configurable type, and thus needs to carry a tip on how to implement a custom partitioner that must read Job Configuration data to work. Attaching a patch that adds the javadoc fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2474) Add docs to the new API Partitioner on how to access Job Configuration data
[ https://issues.apache.org/jira/browse/MAPREDUCE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2474: - Attachment: MAPREDUCE-2474.r1.diff Doc-fix patch. Add docs to the new API Partitioner on how to access Job Configuration data --- Key: MAPREDUCE-2474 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2474 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Minor Labels: documentation, partitioners Fix For: 0.22.0, 0.23.0 Attachments: MAPREDUCE-2474.r1.diff Original Estimate: 1m Remaining Estimate: 1m The new API's Partitioner class, unlike the old one, does not extend a Configurable type, and thus needs to carry a tip on how to implement a custom partitioner that must read Job Configuration data to work. Attaching a patch that adds the javadoc fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2170) Send out last-minute load averages in TaskTrackerStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2170: - Fix Version/s: 0.23.0 Issue Type: New Feature (was: Improvement) Send out last-minute load averages in TaskTrackerStatus --- Key: MAPREDUCE-2170 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2170 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Affects Versions: 0.22.0 Environment: GNU/Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 Attachments: mapreduce.loadaverage.r3.diff, mapreduce.loadaverage.r4.diff, mapreduce.loadaverage.r5.diff Original Estimate: 20m Remaining Estimate: 20m Load averages could be useful in scheduling. This patch looks to extend the existing Linux resource plugin (via /proc/loadavg file) to allow transmitting load averages of the last one minute via the TaskTrackerStatus. Patch is up for review, with test cases added, at: https://reviews.apache.org/r/20/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
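The one-minute figure the patch transmits is the first field of Linux's /proc/loadavg. A self-contained sketch of reading it (this is not the patch's plugin code; `OneMinuteLoad` is an invented name, and the MX-bean call is a portable approximation that happens to be backed by the same source on Linux):

```java
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OneMinuteLoad {
    public static void main(String[] args) throws Exception {
        // Portable JVM view of the 1-minute load average; the spec says a
        // negative value is returned where it is unavailable.
        double viaMxBean = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        System.out.println("1-min load (MX bean): " + viaMxBean);

        // On Linux, the first whitespace-separated field of /proc/loadavg is
        // the same 1-minute figure a resource plugin would parse.
        Path p = Paths.get("/proc/loadavg");
        if (Files.exists(p)) {
            String first = new String(Files.readAllBytes(p)).trim().split("\\s+")[0];
            System.out.println("1-min load (/proc): " + Double.parseDouble(first));
        }
    }
}
```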
[jira] [Commented] (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed separately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027582#comment-13027582 ] Harsh J Chouraria commented on MAPREDUCE-1720: -- Amar, it would seem that what you ask for may be present in NG's MAPREDUCE-2399. I'm not sure how that ticket affects this and all other pending UI issues, but it looks like a welcome change from the usual JSP pages. 'Killed' jobs and 'Failed' jobs should be displayed separately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: mapred.failed.killed.difference.png, mapreduce.unsuccessfuljobs.ui.r1.diff The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2236: - Status: Patch Available (was: Open) No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 Attachments: MAPREDUCE-2236.r1.diff If the maximum attempts value is configured as Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, whereby no task is attempted by the cluster and the map tasks stay in the pending state forever. For example, here's a job driver that causes this:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NullOutputFormat;

@SuppressWarnings("deprecation")
public class IntegerOverflow {
  /**
   * @param args
   * @throws IOException
   */
  @SuppressWarnings("deprecation")
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf();
    Path inputPath = new Path("ignore");
    FileSystem fs = FileSystem.get(conf);
    if (!fs.exists(inputPath)) {
      FSDataOutputStream out = fs.create(inputPath);
      out.writeChars("Test");
      out.close();
    }
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(NullOutputFormat.class);
    FileInputFormat.addInputPath(conf, inputPath);
    conf.setMapperClass(IdentityMapper.class);
    conf.setNumMapTasks(1);
    // Problem inducing line follows.
    conf.setMaxMapAttempts(Integer.MAX_VALUE);
    // No reducer in this test, although setMaxReduceAttempts leads to the same problem.
    conf.setNumReduceTasks(0);
    JobClient.runJob(conf);
  }
}
{code}
The above code will not let any map task run. Additionally, an entry would appear in the JobTracker logs that clearly shows the overflow:
{code}
2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00'
{code}
The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method:
{code}
public Task getTaskToRun(String taskTracker) throws IOException {
  // Create the 'taskid'; do not count the 'killed' tasks against the job!
  TaskAttemptID taskid = null;
  /* THIS LINE v== */
  if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) {
  /* THIS LINE ^== */
    // Make sure that the attempts are unique across restarts
    int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId;
    taskid = new TaskAttemptID(id, attemptId);
    ++nextTaskId;
  } else {
    LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) +
             " (plus " + numKilledTasks + " killed) attempts for the tip '" +
             getTIPId() + "'");
    return null;
  }
{code}
Since all three variables being added are ints, setting one of them to Integer.MAX_VALUE makes the sum overflow to a negative value, so the condition always fails, logging the warning above and returning null. One solution would be to make one of these variables a long, so the addition does not overflow. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
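The overflow and the long-widening fix can be reproduced without a cluster. This is a sketch, not the actual TaskInProgress code: the method names are invented, and `MAX_TASK_EXECS = 1` is assumed only to mirror the shape of the guard (its exact value is immaterial to the bug).

```java
public class OverflowDemo {
    // Assumed small constant, mirroring the shape of the guard above.
    static final int MAX_TASK_EXECS = 1;

    // The guard as written: the int addition wraps to a negative number
    // when maxTaskAttempts is Integer.MAX_VALUE.
    static boolean mayAttemptInt(int nextTaskId, int maxTaskAttempts, int numKilledTasks) {
        return nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks);
    }

    // Widening one operand to long promotes the whole sum to long, so it
    // cannot overflow for int inputs.
    static boolean mayAttemptLong(int nextTaskId, int maxTaskAttempts, int numKilledTasks) {
        return nextTaskId < ((long) MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks);
    }

    public static void main(String[] args) {
        // Even the very first attempt (nextTaskId == 0) is rejected by the
        // int version, which is exactly the "no task may execute" symptom:
        System.out.println(mayAttemptInt(0, Integer.MAX_VALUE, 0));  // prints false
        System.out.println(mayAttemptLong(0, Integer.MAX_VALUE, 0)); // prints true
    }
}
```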
[jira] [Updated] (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2236: - Attachment: MAPREDUCE-2236.r1.diff Patch that caps maximum attempts at 100. No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 Attachments: MAPREDUCE-2236.r1.diff If the maximum attempts value is configured as Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, whereby no task is attempted by the cluster and the map tasks stay in the pending state forever. For example, here's a job driver that causes this:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NullOutputFormat;

@SuppressWarnings("deprecation")
public class IntegerOverflow {
  /**
   * @param args
   * @throws IOException
   */
  @SuppressWarnings("deprecation")
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf();
    Path inputPath = new Path("ignore");
    FileSystem fs = FileSystem.get(conf);
    if (!fs.exists(inputPath)) {
      FSDataOutputStream out = fs.create(inputPath);
      out.writeChars("Test");
      out.close();
    }
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(NullOutputFormat.class);
    FileInputFormat.addInputPath(conf, inputPath);
    conf.setMapperClass(IdentityMapper.class);
    conf.setNumMapTasks(1);
    // Problem inducing line follows.
    conf.setMaxMapAttempts(Integer.MAX_VALUE);
    // No reducer in this test, although setMaxReduceAttempts leads to the same problem.
    conf.setNumReduceTasks(0);
    JobClient.runJob(conf);
  }
}
{code}
The above code will not let any map task run. Additionally, an entry would appear in the JobTracker logs that clearly shows the overflow:
{code}
2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00'
{code}
The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method:
{code}
public Task getTaskToRun(String taskTracker) throws IOException {
  // Create the 'taskid'; do not count the 'killed' tasks against the job!
  TaskAttemptID taskid = null;
  /* THIS LINE v== */
  if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) {
  /* THIS LINE ^== */
    // Make sure that the attempts are unique across restarts
    int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId;
    taskid = new TaskAttemptID(id, attemptId);
    ++nextTaskId;
  } else {
    LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) +
             " (plus " + numKilledTasks + " killed) attempts for the tip '" +
             getTIPId() + "'");
    return null;
  }
{code}
Since all three variables being added are ints, setting one of them to Integer.MAX_VALUE makes the sum overflow to a negative value, so the condition always fails, logging the warning above and returning null. One solution would be to make one of these variables a long, so the addition does not overflow. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2397) Allow user to sort jobs in different sections (Completed, Failed, etc.) by the various columns available
[ https://issues.apache.org/jira/browse/MAPREDUCE-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13017906#comment-13017906 ] Harsh J Chouraria commented on MAPREDUCE-2397: -- MAPREDUCE-2399 (Part of the MAPREDUCE-279 tree) appears to have this as part of its revamped interface. Allow user to sort jobs in different sections (Completed, Failed, etc.) by the various columns available Key: MAPREDUCE-2397 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2397 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Stephen Tunney Priority: Trivial Labels: interface, jsp, page, user, web It would be nice (IMHO) to be able to sort the tables on the jobtracker.jsp page by any column (jobID would be most logical at first) so that one could eliminate scrolling all of the time. Perhaps also have the page save the user's sorting preferences per table too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (MAPREDUCE-486) JobTracker web UI counts COMMIT_PENDING tasks as Running
[ https://issues.apache.org/jira/browse/MAPREDUCE-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-486: --- Assignee: Harsh J Chouraria JobTracker web UI counts COMMIT_PENDING tasks as Running Key: MAPREDUCE-486 URL: https://issues.apache.org/jira/browse/MAPREDUCE-486 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Todd Lipcon Assignee: Harsh J Chouraria Priority: Minor In jobdetails.jsp, tasks in COMMIT_PENDING state are listed as Running. I propose creating another column in this table for COMMIT_PENDING tasks, since users find it confusing that a given job can have more tasks Running than their total cluster capacity. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (MAPREDUCE-2390) JobTracker and TaskTrackers fail with a misleading error if one of the mapreduce.cluster.dir has unusable permissions / is unavailable.
JobTracker and TaskTrackers fail with a misleading error if one of the mapreduce.cluster.dir has unusable permissions / is unavailable. --- Key: MAPREDUCE-2390 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2390 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker, tasktracker Affects Versions: 0.20.2 Environment: CDH3 and Apache 0.20 || Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria To reproduce, have a mapred.local.dir property set to a few directories. Before starting up the JT, set one of these directories' permission as 'd-', and then start the JT/TT. The JT, although it tries to ignore this directory, fails with an odd and misleading message claiming that its configured address is in use. Fixing the permission clears this issue! This was also reported in the mailing lists by Ted Yu, quite a few months ago. But I had forgotten about filing a bug for it here. Still seems to happen. A log is attached below. {code} 2011-03-17 00:40:32,321 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: java.io.IOException: Cannot create toBeDeleted in /home/hack/.tmplocalz/2 at org.apache.hadoop.util.MRAsyncDiskService.init(MRAsyncDiskService.java:86) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2189) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2022) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:276) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:268) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4712) 2011-03-17 00:40:33,322 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens 2011-03-17 00:40:33,322 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s) 2011-03-17 00:40:33,322 INFO 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens 2011-03-17 00:40:33,322 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1) 2011-03-17 00:40:33,322 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list 2011-03-17 00:40:33,350 INFO org.apache.hadoop.mapred.JobTracker: Starting jobtracker with owner as hack 2011-03-17 00:40:33,351 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to localhost/127.0.0.1:8021 : Address already in use at org.apache.hadoop.ipc.Server.bind(Server.java:227) at org.apache.hadoop.ipc.Server$Listener.init(Server.java:314) at org.apache.hadoop.ipc.Server.init(Server.java:1411) at org.apache.hadoop.ipc.RPC$Server.init(RPC.java:510) at org.apache.hadoop.ipc.RPC.getServer(RPC.java:471) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2112) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2022) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:276) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:268) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4712) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.apache.hadoop.ipc.Server.bind(Server.java:225) ... 
9 more 2011-03-17 00:40:33,352 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down JobTracker at QDuo/127.0.0.1 / {code} The list conversation in context, at {{search-hadoop.com}}: http://search-hadoop.com/m/FzN7iqreL/problem+starting+cdh3b2+jobtrackersubj=problem+starting+cdh3b2+jobtracker I'll try to investigate and post the exact problem / solution soon. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007091#comment-13007091 ] Harsh J Chouraria commented on MAPREDUCE-2236: -- I'm wondering how to cap this. Would it be best capped at the set level, or checked and capped at the get level? I'm thinking 'get' is better. No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 If the maximum attempts value is configured as Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, whereby no task is attempted by the cluster and the map tasks stay in the pending state forever. For example, here's a job driver that causes this:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NullOutputFormat;

@SuppressWarnings("deprecation")
public class IntegerOverflow {
  /**
   * @param args
   * @throws IOException
   */
  @SuppressWarnings("deprecation")
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf();
    Path inputPath = new Path("ignore");
    FileSystem fs = FileSystem.get(conf);
    if (!fs.exists(inputPath)) {
      FSDataOutputStream out = fs.create(inputPath);
      out.writeChars("Test");
      out.close();
    }
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(NullOutputFormat.class);
    FileInputFormat.addInputPath(conf, inputPath);
    conf.setMapperClass(IdentityMapper.class);
    conf.setNumMapTasks(1);
    // Problem inducing line follows.
    conf.setMaxMapAttempts(Integer.MAX_VALUE);
    // No reducer in this test, although setMaxReduceAttempts leads to the same problem.
    conf.setNumReduceTasks(0);
    JobClient.runJob(conf);
  }
}
{code}
The above code will not let any map task run. Additionally, an entry would appear in the JobTracker logs that clearly shows the overflow:
{code}
2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00'
{code}
The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method:
{code}
public Task getTaskToRun(String taskTracker) throws IOException {
  // Create the 'taskid'; do not count the 'killed' tasks against the job!
  TaskAttemptID taskid = null;
  /* THIS LINE v== */
  if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) {
  /* THIS LINE ^== */
    // Make sure that the attempts are unique across restarts
    int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId;
    taskid = new TaskAttemptID(id, attemptId);
    ++nextTaskId;
  } else {
    LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) +
             " (plus " + numKilledTasks + " killed) attempts for the tip '" +
             getTIPId() + "'");
    return null;
  }
{code}
Since all three variables being added are ints, setting one of them to Integer.MAX_VALUE makes the sum overflow to a negative value, so the condition always fails, logging the warning above and returning null. One solution would be to make one of these variables a long, so the addition does not overflow. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
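The "cap at the get level" option floated in the comment above can be sketched as a small clamping helper. This is hypothetical illustration code, not the attached patch: the method name is invented, and the cap of 100 is taken only from the note accompanying the r1 attachment.

```java
public class AttemptCap {
    // Hypothetical get-level cap: whatever the user configured, the value
    // handed to the scheduler never exceeds the cap and never drops below
    // one attempt, so the later int addition cannot overflow.
    static int cappedMaxAttempts(int configured) {
        final int CAP = 100;  // value assumed from the r1 patch note
        if (configured < 1) {
            return 1;  // a nonsensical setting still allows one attempt
        }
        return Math.min(configured, CAP);
    }

    public static void main(String[] args) {
        System.out.println(cappedMaxAttempts(4));                 // prints 4
        System.out.println(cappedMaxAttempts(Integer.MAX_VALUE)); // prints 100
        System.out.println(cappedMaxAttempts(0));                 // prints 1
    }
}
```

Capping in the getter has the advantage that values already persisted in old job configurations are neutralized too, not only those set through the API going forward.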
[jira] Updated: (MAPREDUCE-2272) Job ACL file should not be executable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2272: - Tags: acls, job Fix Version/s: 0.23.0 Release Note: Job ACL files now have permissions set to 600 (previously 700). Status: Patch Available (was: Open) Job ACL file should not be executable - Key: MAPREDUCE-2272 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2272 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.23.0 Attachments: mapreduce.2272.r1.diff For some reason the job ACL file is localized with permissions 700. This doesn't make sense, since it's not executable. It should be 600. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2272) Job ACL file should not be executable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2272: - Attachment: mapreduce.2272.r2.diff Ah, I did not know about that test case (Tried finding one earlier, but apparently this one is testing it rather differently and I did not go through these classes carefully enough). Attaching a new patch with a test case change. Job ACL file should not be executable - Key: MAPREDUCE-2272 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2272 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.23.0 Attachments: mapreduce.2272.r1.diff, mapreduce.2272.r2.diff For some reason the job ACL file is localized with permissions 700. This doesn't make sense, since it's not executable. It should be 600. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
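The 700-versus-600 distinction is easy to demonstrate with the standard library. This sketch is not the patch itself (which goes through Hadoop's localization code); it just shows, on a throwaway temp file, that "rw-------" carries no execute bit. POSIX systems only.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class AclFilePerms {
    public static void main(String[] args) throws Exception {
        // Stand-in for the localized job ACL file.
        Path aclFile = Files.createTempFile("job-acls", ".xml");
        // 600: owner read/write, and crucially no execute bit anywhere.
        Files.setPosixFilePermissions(aclFile, PosixFilePermissions.fromString("rw-------"));
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(aclFile);
        System.out.println(perms.contains(PosixFilePermission.OWNER_EXECUTE)); // prints false
        System.out.println(perms.contains(PosixFilePermission.OWNER_READ));    // prints true
        Files.delete(aclFile);
    }
}
```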
[jira] Commented: (MAPREDUCE-993) bin/hadoop job -events jobid from-event-# #-of-events help message is confusing
[ https://issues.apache.org/jira/browse/MAPREDUCE-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000919#comment-13000919 ] Harsh J Chouraria commented on MAPREDUCE-993: - A change in a help doc-string does not require a new/modified test case. bin/hadoop job -events jobid from-event-# #-of-events help message is confusing - Key: MAPREDUCE-993 URL: https://issues.apache.org/jira/browse/MAPREDUCE-993 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.21.0 Reporter: Iyappan Srinivasan Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.23.0 Attachments: mapreduce.993.r1.diff More explanation needs to be there, like: a) events always start from 1; b) the message could read from-event-number to-event-number, where from-event-number starts from 1. This will give the end user an idea of what to enter. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-1242) Chain APIs error misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000405#comment-13000405 ] Harsh J Chouraria commented on MAPREDUCE-1242: -- Just a change of strings in an exception message. No test cases should be required for that. Chain APIs error misleading --- Key: MAPREDUCE-1242 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1242 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Amogh Vasekar Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1242.patch, MAPREDUCE-1242.r2.patch Hi, I was using the Chain[Mapper/Reducer] APIs, and in class Chain, line 207, the error thrown is: The Mapper output key class does not match the previous Mapper input key class. Shouldn't this be: The Mapper *input* key class does not match the previous Mapper *output* key class? Sort of misleads :) -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2225) MultipleOutputs should not require the use of 'Writable'
[ https://issues.apache.org/jira/browse/MAPREDUCE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2225: - Attachment: multipleoutputs.nowritables.r2.diff Sure, it should avoid a raw types warning I think? Here's a patch for that update. MultipleOutputs should not require the use of 'Writable' Key: MAPREDUCE-2225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2225 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission Affects Versions: 0.20.1 Environment: Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0 Attachments: multipleoutputs.nowritables.r1.diff, multipleoutputs.nowritables.r2.diff, multipleoutputs.nowritables.r2.diff Original Estimate: 1m Remaining Estimate: 1m MultipleOutputs right now requires for Key/Value classes to utilize the Writable and WritableComparable interfaces, and fails if the associated key/value classes aren't doing so. With support for alternates like Avro serialization, using Writables isn't necessary and thus the MO class must not strictly check for them. And since comparators may be given separately, key class doesn't need to be checked for implementing a comparable (although it is good design if the key class does implement Comparable at least). Am not sure if this brings about an incompatible change (does Java have BIC? No idea). -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1932) record skipping doesn't work with the new map/reduce api
[ https://issues.apache.org/jira/browse/MAPREDUCE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1932: - Attachment: mapreduce.1932.skippingreader.r1.diff Here's the first attempt at this to get it rolling (smells like a regression!). Will add a test case for this soon and up a fresh patch post-verification. record skipping doesn't work with the new map/reduce api Key: MAPREDUCE-1932 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1932 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.20.1 Reporter: Owen O'Malley Assignee: Harsh J Chouraria Attachments: mapreduce.1932.skippingreader.r1.diff The new HADOOP-1230 map/reduce api doesn't support the record skipping features. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (MAPREDUCE-1932) record skipping doesn't work with the new map/reduce api
[ https://issues.apache.org/jira/browse/MAPREDUCE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1932: Assignee: Harsh J Chouraria record skipping doesn't work with the new map/reduce api Key: MAPREDUCE-1932 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1932 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.20.1 Reporter: Owen O'Malley Assignee: Harsh J Chouraria Attachments: mapreduce.1932.skippingreader.r1.diff The new HADOOP-1230 map/reduce api doesn't support the record skipping features. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2251) Remove mapreduce.job.userhistorylocation config
[ https://issues.apache.org/jira/browse/MAPREDUCE-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2251: - Attachment: mapreduce.2251.jobhistorylocremove.r1.diff Patch that removes all refs of the said property in Map/Reduce. HadoopArchives was using it to disable generation of history, but that's not possible now and hence removed from that class as well. Remove mapreduce.job.userhistorylocation config --- Key: MAPREDUCE-2251 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2251 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Attachments: mapreduce.2251.jobhistorylocremove.r1.diff Best I can tell, this config parameter is no longer used as of MAPREDUCE-157 but still exists in the code and in mapred-default.xml. We should remove it to avoid user confusion. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (MAPREDUCE-2251) Remove mapreduce.job.userhistorylocation config
[ https://issues.apache.org/jira/browse/MAPREDUCE-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-2251: Assignee: Harsh J Chouraria Remove mapreduce.job.userhistorylocation config --- Key: MAPREDUCE-2251 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2251 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Attachments: mapreduce.2251.jobhistorylocremove.r1.diff Best I can tell, this config parameter is no longer used as of MAPREDUCE-157 but still exists in the code and in mapred-default.xml. We should remove it to avoid user confusion. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2251) Remove mapreduce.job.userhistorylocation config
[ https://issues.apache.org/jira/browse/MAPREDUCE-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2251: - Tags: jobhistory Fix Version/s: 0.23.0 Release Note: Remove the now defunct property `mapreduce.job.userhistorylocation`. Status: Patch Available (was: Open) Marking as PA. Remove mapreduce.job.userhistorylocation config --- Key: MAPREDUCE-2251 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2251 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: mapreduce.2251.jobhistorylocremove.r1.diff Best I can tell, this config parameter is no longer used as of MAPREDUCE-157 but still exists in the code and in mapred-default.xml. We should remove it to avoid user confusion. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: mapreduce.trunk.findbugs.r5.diff Oops, looks like my diff failed to contain the Holder file. Forgot to add it to svn before diff. Uploading a proper patch and setting to 'In Progress'. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff, mapreduce.trunk.findbugs.r5.diff, mapreduce.trunk.findbugs.r5.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Status: In Progress (was: Patch Available) 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff, mapreduce.trunk.findbugs.r5.diff, mapreduce.trunk.findbugs.r5.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: mapreduce.trunk.findbugs.r6.diff 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff, mapreduce.trunk.findbugs.r5.diff, mapreduce.trunk.findbugs.r5.diff, mapreduce.trunk.findbugs.r6.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: mapreduce.trunk.findbugs.r5.diff New patch (it still leaves a couple of sync warnings, as introduced by MAPREDUCE-2026). @Chris - I've made the changes you suggested. Though even with those changes, is it required for {{getTasks(TaskType)}} to be sync'd? I've sync'd it in this patch, but let me know if it can be removed and the warning ignored (possibly inaccurate, as admitted by Findbugs). @Todd - Shall I open a new JIRA for the other places {{Holder<T>}} may be used? The two additional findbugs warnings generated by the issue Priyo pointed out (MAPREDUCE-2026 - interesting that it does not show up in test-patch?) need to be ignored/re-reviewed as well ({{getMapCounters}}/{{getReduceCounters}} are left synchronized - why?). 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff, mapreduce.trunk.findbugs.r5.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: hadoop-findbugs-report.html Fresh Findbugs report for previous patch. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff, mapreduce.trunk.findbugs.r5.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1811) Job.monitorAndPrintJob() should print status of the job at completion
[ https://issues.apache.org/jira/browse/MAPREDUCE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1811: - Tags: job Release Note: Print the resultant status of a Job on completion instead of simply saying 'Complete'. Status: Patch Available (was: Open) Setting as available. Job.monitorAndPrintJob() should print status of the job at completion - Key: MAPREDUCE-1811 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1811 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Assignee: Harsh J Chouraria Priority: Minor Attachments: mapreduce.job.monitor.status.r1.diff Job.monitorAndPrintJob() just prints Job Complete at the end of the job. It should print the state, i.e. whether the job SUCCEEDED/FAILED/KILLED. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
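The gist of the fix can be sketched in plain Java. This is only an illustration of the behaviour the issue asks for, not Hadoop's actual `Job` class; the `State` enum and `buildCompletionMessage` name are stand-ins invented for this sketch.

```java
// Minimal sketch: report the job's resultant state on completion instead
// of a bare "Job Complete" line. State and buildCompletionMessage are
// illustrative stand-ins, not the real Hadoop API.
public class JobCompletionMessage {
    public enum State { SUCCEEDED, FAILED, KILLED }

    // The line a monitorAndPrintJob-style loop would print at the end.
    public static String buildCompletionMessage(String jobId, State state) {
        return "Job " + jobId + " completed with status: " + state;
    }

    public static void main(String[] args) {
        System.out.println(buildCompletionMessage("job_201101010000_0001", State.SUCCEEDED));
    }
}
```

With this, a caller immediately sees whether the job SUCCEEDED, FAILED, or was KILLED.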
[jira] Updated: (MAPREDUCE-993) bin/hadoop job -events jobid from-event-# #-of-events help message is confusing
[ https://issues.apache.org/jira/browse/MAPREDUCE-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-993: Tags: events Fix Version/s: 0.23.0 Release Note: Added a helpful description message to the `mapred job -events` command. Status: Patch Available (was: Open) Marking as patch-available. bin/hadoop job -events jobid from-event-# #-of-events help message is confusing - Key: MAPREDUCE-993 URL: https://issues.apache.org/jira/browse/MAPREDUCE-993 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.21.0 Reporter: Iyappan Srinivasan Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.23.0 Attachments: mapreduce.993.r1.diff More explanation needs to be there, like: a) events always start from 1; b) the message could be like from-event-number to-event-number, where from-event-number starts from 1. This will give the end user an idea of what to enter. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1347: Assignee: Harsh J Chouraria Missing synchronization in MultipleOutputFormat --- Key: MAPREDUCE-1347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Attachments: mapreduce.1347.r1.diff MultipleOutputFormat's RecordWriter implementation doesn't use synchronization when accessing the recordWriters member. When using multithreaded mappers or reducers, this can result in problems where two threads will both try to create the same file, causing AlreadyBeingCreatedException. Doing this more fine-grained than just synchronizing the whole method is probably a good idea, so that multithreaded mappers can actually achieve parallelism writing into separate output streams. From what I can tell, the new API's MultipleOutputs seems not to have this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1347: - Attachment: mapreduce.1347.r1.diff Here's a patch that attempts to solve this issue. Missing synchronization in MultipleOutputFormat --- Key: MAPREDUCE-1347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Attachments: mapreduce.1347.r1.diff MultipleOutputFormat's RecordWriter implementation doesn't use synchronization when accessing the recordWriters member. When using multithreaded mappers or reducers, this can result in problems where two threads will both try to create the same file, causing AlreadyBeingCreatedException. Doing this more fine-grained than just synchronizing the whole method is probably a good idea, so that multithreaded mappers can actually achieve parallelism writing into separate output streams. From what I can tell, the new API's MultipleOutputs seems not to have this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
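The race described above, and the simplest way to close it, can be sketched in plain Java. The `WriterCache`/`StubWriter` names are stand-ins for MultipleOutputFormat's `recordWriters` map and its per-file RecordWriter; per the r2 patch note, the committed fix instead uses Guava's `MapMaker.makeComputingMap`.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix: guard the lookup-or-create step on the writer cache
// so two threads asking for the same output name can no longer both try
// to create the same file (the AlreadyBeingCreatedException scenario).
// WriterCache and StubWriter are illustrative stand-ins, not Hadoop APIs.
public class WriterCache {
    public static class StubWriter { /* stands in for a RecordWriter */ }

    private final Map<String, StubWriter> writers = new HashMap<>();

    // Coarse-grained: one lock for the whole cache. A finer-grained scheme
    // (as the issue suggests) would lock per output name, so multithreaded
    // mappers writing to distinct streams stay parallel.
    public synchronized StubWriter getWriter(String name) {
        return writers.computeIfAbsent(name, n -> new StubWriter());
    }

    public static void main(String[] args) {
        WriterCache cache = new WriterCache();
        // The same output name always maps to the same writer instance.
        System.out.println(cache.getWriter("even") == cache.getWriter("even"));
    }
}
```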
[jira] Commented: (MAPREDUCE-955) CombineFileRecordReader should pass a InputSplit in the constructor instead of CombineFileSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999726#comment-12999726 ] Harsh J Chouraria commented on MAPREDUCE-955: - Why does it need to be so in the Old API? The new API uses the default {{initialize()}} signature with an InputSplit. CombineFileRecordReader should pass a InputSplit in the constructor instead of CombineFileSplit --- Key: MAPREDUCE-955 URL: https://issues.apache.org/jira/browse/MAPREDUCE-955 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Namit Jain The specific reader can always cast the class as needed. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-1360) Record skipping should work with more serializations
[ https://issues.apache.org/jira/browse/MAPREDUCE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999728#comment-12999728 ] Harsh J Chouraria commented on MAPREDUCE-1360: -- This sounded interesting to work on, so here are some questions after a brief investigation: I looked at what goes on inside the SkippingRecordReader but couldn't find any area that is Writable-limited. The sequence files for skipped records are created with the Key/Value classes, which are then used to load their acceptable serialization classes. Which part of the skipping framework is Writable-limited now? Record skipping should work with more serializations Key: MAPREDUCE-1360 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1360 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Record skipping currently supports WritableSerialization, but cannot handle non-class-based serialization systems (e.g., AvroSerialization). The record skipping mechanism should be made compatible with the metadata-based serialization configuration. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2272) Job ACL file should not be executable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2272: - Attachment: mapreduce.2272.r1.diff Patch that performs this trivial change. ant test-patch results: {code} [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system test framework. The patch passed system test framework compile. {code} Job ACL file should not be executable - Key: MAPREDUCE-2272 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2272 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Trivial Attachments: mapreduce.2272.r1.diff For some reason the job ACL file is localized with permissions 700. This doesn't make sense, since it's not executable. It should be 600. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
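The 700-vs-600 distinction the issue turns on can be shown with plain `java.nio` permission sets rather than Hadoop's `FsPermission` (which is what the real patch would touch); the class name here is invented for illustration.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch of the permission change: a data file that is only read back
// (like the localized job ACL file) needs rw for the owner, not rwx.
// Uses java.nio stand-ins, not Hadoop's FsPermission.
public class AclFilePerms {
    // 700: what the ACL file is localized with today - includes execute.
    public static final Set<PosixFilePermission> CURRENT =
            PosixFilePermissions.fromString("rwx------");
    // 600: what a non-executable, owner-private data file should get.
    public static final Set<PosixFilePermission> PROPOSED =
            PosixFilePermissions.fromString("rw-------");

    public static void main(String[] args) {
        System.out.println(PosixFilePermissions.toString(PROPOSED));
    }
}
```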
[jira] Assigned: (MAPREDUCE-2272) Job ACL file should not be executable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-2272: Assignee: Harsh J Chouraria Job ACL file should not be executable - Key: MAPREDUCE-2272 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2272 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Priority: Trivial Attachments: mapreduce.2272.r1.diff For some reason the job ACL file is localized with permissions 700. This doesn't make sense, since it's not executable. It should be 600. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-1159) Limit Job name on jobtracker.jsp to be 80 char long
[ https://issues.apache.org/jira/browse/MAPREDUCE-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12993583#comment-12993583 ] Harsh J Chouraria commented on MAPREDUCE-1159: -- I can add in a test case for the shortener, but I'm stumped as to why the TestClusterMapReduceTestCase (Age 1) would fail for this moderately trivial patch. Limit Job name on jobtracker.jsp to be 80 char long --- Key: MAPREDUCE-1159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1159 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1159.r1.patch, MAPREDUCE-1159.r2.patch, MAPREDUCE-1159.trunk.patch Sometimes a user submits a job with a very long job name. That made jobtracker.jsp very hard to read. We should limit the size of the job name. User can see the full name when they click on the job. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12993622#comment-12993622 ] Harsh J Chouraria commented on MAPREDUCE-2193: -- It may be ignored if it is acceptable to do so, but via the findbugs filter XML, there's no way of doing selective-ignore without ignoring the whole field itself (no line number sub-filter either). I don't think ignoring the fields forever would be a good idea for changes that are yet to come in that class. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
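For context on the filter limitation described in that comment, a findbugs exclude-filter entry is scoped by class, field, and bug pattern - there is no line-number sub-filter. A sketch of such an entry (the class and field names below are made up for illustration):

```xml
<FindBugsFilter>
  <!-- Hypothetical example: this silences every IS2_INCONSISTENT_SYNC
       warning on the one field, now and for any future code touching
       it - which is the drawback described in the comment above. -->
  <Match>
    <Class name="org.apache.hadoop.mapred.SomeJobClass" />
    <Field name="someSharedField" />
    <Bug pattern="IS2_INCONSISTENT_SYNC" />
  </Match>
</FindBugsFilter>
```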
[jira] Commented: (MAPREDUCE-2309) While querying the Job Statics from the command-line, if we give wrong status name then there is no warning or response.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992563#comment-12992563 ] Harsh J Chouraria commented on MAPREDUCE-2309: -- Which sub-command of {{mapred}} is this exactly? :) While querying the Job Statics from the command-line, if we give wrong status name then there is no warning or response. Key: MAPREDUCE-2309 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2309 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.23.0 Reporter: Devaraj K Priority: Minor If we try to get the jobs information by giving the wrong status name from the command line interface, it is not giving any warning or response. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-2310) If we stop Job Tracker, Task Tracker is also getting stopped.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992634#comment-12992634 ] Harsh J Chouraria commented on MAPREDUCE-2310: -- Do we really have a stop-jobtracker.sh in 0.20? I couldn't see one in the 0.20 branch under common nor in Y!'s 0.20.100; which release has it? If we stop Job Tracker, Task Tracker is also getting stopped. - Key: MAPREDUCE-2310 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2310 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.20.2 Reporter: Devaraj K Priority: Minor If we execute stop-jobtracker.sh for stopping Job Tracker, Task Tracker is also stopping. This is not applicable for the latest (trunk) code because stop-jobtracker.sh file is not coming. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-2310) If we stop Job Tracker, Task Tracker is also getting stopped.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992653#comment-12992653 ] Harsh J Chouraria commented on MAPREDUCE-2310: -- None in CDH2/3 either, nor in Yahoo's one at GitHub. There's a stop-jobhistoryserver.sh available in 0.20.100, however. Is that probably it? If we stop Job Tracker, Task Tracker is also getting stopped. - Key: MAPREDUCE-2310 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2310 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.20.2 Reporter: Devaraj K Priority: Minor If we execute stop-jobtracker.sh for stopping Job Tracker, Task Tracker is also stopping. This is not applicable for the latest (trunk) code because stop-jobtracker.sh file is not coming. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name
[ https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2293: - Attachment: mapreduce.mo.removecheck.r1.diff For extensions to be applied, I think one needs to make a change in the OutputFormat class chosen. MultipleOutputs can only handle the part before the '-X' (partition) numbering in Map/Reduce outputs, not after it. I've posted a patch that removes the check (from both old and new API). I can't post the result of an {{ant test-patch}} since that isn't working for me right now (Mumak build is failing for some reason, in MR trunk). I'll post that when I get it working. This should be marked as an Incompatible Change in my opinion, as it is a removal of a strong validation. People may also be relying on the MultiName_OutputName-Partition syntax via string splits, etc. in the Stable API MO class. Also, I'm curious to see if allowing any character to go in is a good idea 'Path' wise. Does HDFS have any restrictions on Filenames? I've not seen a documentation on it (although I think it is pretty POSIX compliant), but HDFS-13 points out that there may be some trouble, any thoughts on that? Enhance MultipleOutputs to allow additional characters in the named output name --- Key: MAPREDUCE-2293 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.21.0 Reporter: David Rosenstrauch Assignee: Harsh J Chouraria Priority: Minor Attachments: mapreduce.mo.removecheck.r1.diff Currently you are only allowed to use alpha-numeric characters in a named output name in the MultipleOutputs class. This is a bit of an onerous restriction, as it would be extremely convenient to be able to use non alpha-numerics in the name too. (E.g., a '.' character would be very helpful, so that you can use the named output name for holding a file name/extension. Perhaps '-' and a '_' characters as well.) 
The restriction seems to be somewhat arbitrary - it appears to be only enforced in the checkTokenName method. (Though I don't know if there's any downstream impact by loosening this restriction.) Would be extremely helpful/useful to have this fixed though! -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
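The validation being discussed, and the relaxed form the reporter asks for, can be sketched in plain Java. `checkTokenName` is the method the issue names; the method names and the relaxed character set ('.', '-', '_') below follow the reporter's examples and are assumptions for illustration, not the committed behaviour.

```java
// Sketch of the alphanumeric-only named-output check and a relaxed variant.
// Names and the extra character set are illustrative, per the reporter's
// examples, not MultipleOutputs' actual committed code.
public class TokenNameCheck {
    // Current behaviour: named outputs may contain only letters and digits.
    public static boolean isValidStrict(String name) {
        return !name.isEmpty() && name.chars().allMatch(Character::isLetterOrDigit);
    }

    // Relaxed variant: also admit '.', '-' and '_' in named output names.
    public static boolean isValidRelaxed(String name) {
        return !name.isEmpty() && name.chars()
                .allMatch(c -> Character.isLetterOrDigit(c) || c == '.' || c == '-' || c == '_');
    }

    public static void main(String[] args) {
        System.out.println(isValidStrict("part.ext") + " " + isValidRelaxed("part.ext"));
    }
}
```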
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: mapreduce.trunk.findbugs.r4.diff Another new patch (Rev 4.) * Moved Holder from o.a.h.mapreduce to o.a.h.mapreduce.util * Added the forgotten-about ASF license and API interface annotations, as per Todd's comments ant clean findbugs -Dfindbugs.home=/opt/findbugs - 0, as before. Waiting for another review for the 'synchronized' change made in the {{TIP[] JobInProgress::getTasks(TaskType)}} method. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff, mapreduce.trunk.findbugs.r4.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1013) MapReduce Project page does not show 0.20.1 documentation/release information.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987138#action_12987138 ] Harsh J Chouraria commented on MAPREDUCE-1013: -- I believe this issue doesn't apply anymore, ever since 0.20.2 went current and stable. Could this be marked Invalid (Invalid now, after 0.20.2)? MapReduce Project page does not show 0.20.1 documentation/release information. --- Key: MAPREDUCE-1013 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1013 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Affects Versions: 0.20.1 Reporter: Andy Sautins Attachments: MAPREDUCE-1013.patch The MapReduce Project page shows the documentation for 0.20.0 even though the latest stable release version is 0.20.1. The releases page also shows all the pre 0.20.1 releases, but does not show 0.20.1 even though if you click on the Download a release now! link the mirror links are for hadoop/core. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1159) Limit Job name on jobtracker.jsp to be 80 char long
[ https://issues.apache.org/jira/browse/MAPREDUCE-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1159: - Fix Version/s: 0.22.0 Release Note: Job names on jobtracker.jsp should be 80 characters long at most. Status: Patch Available (was: Open) A patch is available. Forgot to mark as so. Limit Job name on jobtracker.jsp to be 80 char long --- Key: MAPREDUCE-1159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1159 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1159.r1.patch, MAPREDUCE-1159.r2.patch, MAPREDUCE-1159.trunk.patch Sometimes a user submits a job with a very long job name. That made jobtracker.jsp very hard to read. We should limit the size of the job name. User can see the full name when they click on the job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1242) Chain APIs error misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1242: Assignee: Harsh J Chouraria Chain APIs error misleading --- Key: MAPREDUCE-1242 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1242 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amogh Vasekar Assignee: Harsh J Chouraria Priority: Trivial Attachments: MAPREDUCE-1242.patch, MAPREDUCE-1242.r2.patch Hi, I was using the Chain[Mapper/Reducer] APIs , and in Class Chain line 207 the error thrown : The Mapper output key class does not match the previous Mapper input key class Shouldn't this be The Mapper *input* key class does not match the previous Mapper *Output* key class ? Sort of misleads :) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1242) Chain APIs error misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1242: - Attachment: MAPREDUCE-1242.r2.patch New patch that adds in the other pointer as requested by Amareshwari. Hope this clears up the doc issue. Chain APIs error misleading --- Key: MAPREDUCE-1242 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1242 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amogh Vasekar Assignee: Harsh J Chouraria Priority: Trivial Attachments: MAPREDUCE-1242.patch, MAPREDUCE-1242.r2.patch Hi, I was using the Chain[Mapper/Reducer] APIs , and in Class Chain line 207 the error thrown : The Mapper output key class does not match the previous Mapper input key class Shouldn't this be The Mapper *input* key class does not match the previous Mapper *Output* key class ? Sort of misleads :) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1242) Chain APIs error misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1242: - Fix Version/s: 0.22.0 Affects Version/s: 0.20.2 Release Note: Fix a misleading exception message in case the Chained Mappers have mismatch in input/output Key/Value pairs between them. Status: Patch Available (was: Open) Patch available that fixes this minor docs issue. Chain APIs error misleading --- Key: MAPREDUCE-1242 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1242 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Amogh Vasekar Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1242.patch, MAPREDUCE-1242.r2.patch Hi, I was using the Chain[Mapper/Reducer] APIs , and in Class Chain line 207 the error thrown : The Mapper output key class does not match the previous Mapper input key class Shouldn't this be The Mapper *input* key class does not match the previous Mapper *Output* key class ? Sort of misleads :) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
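The corrected wording can be illustrated with a hedged sketch of the comparison (class and method names here are hypothetical, not the actual Chain internals): the check compares the current Mapper's *input* key class against the previous Mapper's *output* key class, so the message should name them in that order.

```java
public class ChainKeyCheck {
    // Hypothetical stand-in for the validation done in the Chain class,
    // with the message fixed as the issue suggests.
    public static void validateKeyClasses(Class<?> previousOutputKeyClass,
                                          Class<?> inputKeyClass) {
        if (!previousOutputKeyClass.equals(inputKeyClass)) {
            throw new IllegalArgumentException(
                "The Mapper input key class does not match the previous "
                + "Mapper output key class");
        }
    }
}
```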
[jira] Updated: (MAPREDUCE-1996) API: Reducer.reduce() method detail misstatement
[ https://issues.apache.org/jira/browse/MAPREDUCE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1996: - Fix Version/s: (was: 0.20.2) 0.22.0 Release Note: Fix a misleading documentation note about the usage of Reporter objects in Reducers. Status: Patch Available (was: Open) Patch is available for this trivial doc-fix. API: Reducer.reduce() method detail misstatement Key: MAPREDUCE-1996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1996 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Environment: Seen in Hadoop 0.20.2 API and Hadoop 0.19.x API Reporter: Glynn Durham Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1996.r1.diff Original Estimate: 0.08h Remaining Estimate: 0.08h method detail for Reducer.reduce() method has paragraph starting: Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. s/an insignificant amount of time/a significant amount of time/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1996) API: Reducer.reduce() method detail misstatement
[ https://issues.apache.org/jira/browse/MAPREDUCE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1996: - Affects Version/s: 0.20.1 API: Reducer.reduce() method detail misstatement Key: MAPREDUCE-1996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1996 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.20.1 Environment: Seen in Hadoop 0.20.2 API and Hadoop 0.19.x API Reporter: Glynn Durham Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1996.r1.diff Original Estimate: 0.08h Remaining Estimate: 0.08h method detail for Reducer.reduce() method has paragraph starting: Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. s/an insignificant amount of time/a significant amount of time/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1996) API: Reducer.reduce() method detail misstatement
[ https://issues.apache.org/jira/browse/MAPREDUCE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1996: Assignee: Harsh J Chouraria API: Reducer.reduce() method detail misstatement Key: MAPREDUCE-1996 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1996 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.20.1 Environment: Seen in Hadoop 0.20.2 API and Hadoop 0.19.x API Reporter: Glynn Durham Assignee: Harsh J Chouraria Priority: Trivial Fix For: 0.22.0 Attachments: MAPREDUCE-1996.r1.diff Original Estimate: 0.08h Remaining Estimate: 0.08h method detail for Reducer.reduce() method has paragraph starting: Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. s/an insignificant amount of time/a significant amount of time/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Status: In Progress (was: Patch Available) Thanks for the review Todd. I've made the getTasks synchronized now and also have removed the ignores. About the Holder, must I write the class myself? And if it is going to be used in other places as well, where must I put it (package/project wise)? 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Status: Patch Available (was: In Progress) 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: mapreduce.trunk.findbugs.r3.diff New patch that tries to resolve Todd's additional comments. * Adds a new class 'Holder&lt;T&gt;' in the o.a.h.mapreduce package, for use right now only in Localizer.java. Re-added the synchronized block in Localizer.java. * Removed ignores on maps/reduces/etc TIP arrays in JobInProgress.java and made the getTasks() method synchronized. Checkstyle passes on Holder.java with 0 errors/warnings (via the report of `ant checkstyle`). Findbugs reports 0 errors, like the previous attachment. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff, mapreduce.trunk.findbugs.r3.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
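For reference, the small generic holder class mentioned above can be sketched like this (a hypothetical minimal version, not necessarily the committed Holder class): a trivially simple wrapper whose single field is published from inside a synchronized block, which sidesteps the inconsistent-synchronization pattern Findbugs flags.

```java
// Minimal generic value holder; a thread publishes a value into it while
// holding a lock, and readers take the whole holder reference.
public class Holder<T> {
    public T value;

    public Holder() {}

    public Holder(T value) {
        this.value = value;
    }
}
```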
[jira] Commented: (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980227#action_12980227 ] Harsh J Chouraria commented on MAPREDUCE-1720: -- I actually like the concept of having all (retained) jobs listed in one page. This way, if you are monitoring one, and it actually fails or is killed, it still remains on the same page; not requiring an inquisitive search for where it really went. With browser search, one can also lookup the ID on a single page, without having to switch any context for the resultant state (be it tab or page). But yes, the representation could use a little work as the pages tend to get longer and longer over load/time. About job history/details, since the current job details page gets pretty large thanks to counters and charts, it wouldn't look good even if it were expanded inline, I think. Although we can have a short summary (defn.?) which can be shown inline when clicked/hovered. 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: mapred.failed.killed.difference.png, mapreduce.unsuccessfuljobs.ui.r1.diff The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: hadoop-findbugs-report.html mapreduce.trunk.findbugs.r2.diff I had a look at JIP's use of those TIP arrays again, and I guess that it is alright to ignore non sync access to them via getTasks() as getTasks() is thread safe. But I may be wrong, so please review. Marking as PA. 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Tags: findbugs Status: Patch Available (was: Open) 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff, mapreduce.trunk.findbugs.r2.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978400#action_12978400 ] Harsh J Chouraria commented on MAPREDUCE-2236: -- 100 sounds reasonable to me; would be a good cap for smaller clusters too. No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 If the max attempts value is configured to Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, and thereby no task is attempted by the cluster and the map tasks stay in pending state forever. For example, here's a job driver that causes this: {code} import java.io.IOException; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.mapred.FileInputFormat; import org.apache.hadoop.mapred.JobClient; import org.apache.hadoop.mapred.JobConf; import org.apache.hadoop.mapred.TextInputFormat; import org.apache.hadoop.mapred.lib.IdentityMapper; import org.apache.hadoop.mapred.lib.NullOutputFormat; @SuppressWarnings("deprecation") public class IntegerOverflow { /** * @param args * @throws IOException */ @SuppressWarnings("deprecation") public static void main(String[] args) throws IOException { JobConf conf = new JobConf(); Path inputPath = new Path("ignore"); FileSystem fs = FileSystem.get(conf); if (!fs.exists(inputPath)) { FSDataOutputStream out = fs.create(inputPath); out.writeChars("Test"); out.close(); } conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(NullOutputFormat.class); FileInputFormat.addInputPath(conf, inputPath); conf.setMapperClass(IdentityMapper.class); conf.setNumMapTasks(1); // Problem inducing line follows.
conf.setMaxMapAttempts(Integer.MAX_VALUE); // No reducer in this test, although setMaxReduceAttempts leads to the same problem. conf.setNumReduceTasks(0); JobClient.runJob(conf); } } {code} The above code will not let any map task run. Additionally, a log would be created inside JobTracker logs with the following information that clearly shows the overflow: {code} 2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00' {code} The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method. {code} public Task getTaskToRun(String taskTracker) throws IOException { // Create the 'taskid'; do not count the 'killed' tasks against the job! TaskAttemptID taskid = null; /* THIS LINE v == */ if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) { /* THIS LINE ^== */ // Make sure that the attempts are unique across restarts int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId; taskid = new TaskAttemptID(id, attemptId); ++nextTaskId; } else { LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) + " (plus " + numKilledTasks + " killed) attempts for the tip '" + getTIPId() + "'"); return null; } {code} Since all three variables being added are integers, one of them being Integer.MAX_VALUE makes the condition fail with an overflow, thereby logging and returning null as the result is negative. One solution would be to make one of these variables a long, so the addition does not overflow. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
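The overflowing check and the proposed long-widening fix can be sketched standalone (the constants and their values here are illustrative stand-ins, not the actual TaskInProgress fields):

```java
public class OverflowDemo {
    // Illustrative values mirroring the scenario in the report: a max-attempts
    // setting of Integer.MAX_VALUE makes the int sum wrap around.
    static final int MAX_TASK_EXECS = 1;
    static final int maxTaskAttempts = Integer.MAX_VALUE;
    static final int numKilledTasks = 0;

    // Broken check: the int addition overflows to a negative number,
    // so the comparison is false even for nextTaskId == 0.
    static boolean canScheduleInt(int nextTaskId) {
        return nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks);
    }

    // Proposed fix: widening one operand to long keeps the sum exact.
    static boolean canScheduleLong(int nextTaskId) {
        return nextTaskId < ((long) MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks);
    }

    public static void main(String[] args) {
        System.out.println(canScheduleInt(0));   // false: sum wrapped to -2147483648
        System.out.println(canScheduleLong(0));  // true: sum is 2147483648L
    }
}
```

Widening just one operand is enough: Java's binary numeric promotion then evaluates the whole sum in 64-bit arithmetic.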
[jira] Updated: (MAPREDUCE-2225) MultipleOutputs should not require the use of 'Writable'
[ https://issues.apache.org/jira/browse/MAPREDUCE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2225: - Attachment: multipleoutputs.nowritables.r2.diff New patch, with test cases for testing JavaSerialization with MO. MultipleOutputs should not require the use of 'Writable' Key: MAPREDUCE-2225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2225 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission Affects Versions: 0.20.1 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: multipleoutputs.nowritables.r1.diff, multipleoutputs.nowritables.r2.diff Original Estimate: 0.02h Remaining Estimate: 0.02h MultipleOutputs right now requires for Key/Value classes to utilize the Writable and WritableComparable interfaces, and fails if the associated key/value classes aren't doing so. With support for alternates like Avro serialization, using Writables isn't necessary and thus the MO class must not strictly check for them. And since comparators may be given separately, key class doesn't need to be checked for implementing a comparable (although it is good design if the key class does implement Comparable at least). Am not sure if this brings about an incompatible change (does Java have BIC? No idea). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2225) MultipleOutputs should not require the use of 'Writable'
[ https://issues.apache.org/jira/browse/MAPREDUCE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2225: - Release Note: MultipleOutputs should not require the use/check of 'Writable' interfaces in key and value classes. (was: MultipleOutputs do not require the use of 'Writable' interfaces.) Status: Patch Available (was: Open) Setting state to PA. 'ant test-patch' passes with: {code} [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 6 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system test framework. The patch passed system test framework compile. [exec] [exec] [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] BUILD SUCCESSFUL Total time: 23 minutes 14 seconds {code} MultipleOutputs should not require the use of 'Writable' Key: MAPREDUCE-2225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2225 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission Affects Versions: 0.20.1 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: multipleoutputs.nowritables.r1.diff, multipleoutputs.nowritables.r2.diff Original Estimate: 0.02h Remaining Estimate: 0.02h MultipleOutputs right now requires for Key/Value classes to utilize the Writable and WritableComparable interfaces, and fails if the associated key/value classes aren't doing so. 
With support for alternates like Avro serialization, using Writables isn't necessary and thus the MO class must not strictly check for them. And since comparators may be given separately, key class doesn't need to be checked for implementing a comparable (although it is good design if the key class does implement Comparable at least). Am not sure if this brings about an incompatible change (does Java have BIC? No idea). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2225) MultipleOutputs should not require the use of 'Writable'
[ https://issues.apache.org/jira/browse/MAPREDUCE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2225: - Environment: Linux Fix Version/s: (was: 0.23.0) 0.22.0 Perhaps this could go in 0.22 itself, if improvements are still accepted? MultipleOutputs should not require the use of 'Writable' Key: MAPREDUCE-2225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2225 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission Affects Versions: 0.20.1 Environment: Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Fix For: 0.22.0 Attachments: multipleoutputs.nowritables.r1.diff, multipleoutputs.nowritables.r2.diff Original Estimate: 0.02h Remaining Estimate: 0.02h MultipleOutputs right now requires for Key/Value classes to utilize the Writable and WritableComparable interfaces, and fails if the associated key/value classes aren't doing so. With support for alternates like Avro serialization, using Writables isn't necessary and thus the MO class must not strictly check for them. And since comparators may be given separately, key class doesn't need to be checked for implementing a comparable (although it is good design if the key class does implement Comparable at least). Am not sure if this brings about an incompatible change (does Java have BIC? No idea). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2225) MultipleOutputs should not require the use of 'Writable'
[ https://issues.apache.org/jira/browse/MAPREDUCE-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2225: - Priority: Blocker (was: Major) Setting to BLOCKER as per Nigel's mail. MultipleOutputs should not require the use of 'Writable' Key: MAPREDUCE-2225 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2225 Project: Hadoop Map/Reduce Issue Type: Improvement Components: job submission Affects Versions: 0.20.1 Environment: Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0 Attachments: multipleoutputs.nowritables.r1.diff, multipleoutputs.nowritables.r2.diff Original Estimate: 0.02h Remaining Estimate: 0.02h MultipleOutputs right now requires for Key/Value classes to utilize the Writable and WritableComparable interfaces, and fails if the associated key/value classes aren't doing so. With support for alternates like Avro serialization, using Writables isn't necessary and thus the MO class must not strictly check for them. And since comparators may be given separately, key class doesn't need to be checked for implementing a comparable (although it is good design if the key class does implement Comparable at least). Am not sure if this brings about an incompatible change (does Java have BIC? No idea). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2170) Send out last-minute load averages in TaskTrackerStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2170: - Priority: Blocker (was: Minor) Setting to BLOCKER as per Nigel's mail. Send out last-minute load averages in TaskTrackerStatus --- Key: MAPREDUCE-2170 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2170 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Environment: GNU/Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0 Attachments: mapreduce.loadaverage.r3.diff, mapreduce.loadaverage.r4.diff, mapreduce.loadaverage.r5.diff Original Estimate: 0.33h Remaining Estimate: 0.33h Load averages could be useful in scheduling. This patch looks to extend the existing Linux resource plugin (via /proc/loadavg file) to allow transmitting load averages of the last one minute via the TaskTrackerStatus. Patch is up for review, with test cases added, at: https://reviews.apache.org/r/20/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976656#action_12976656 ] Harsh J Chouraria commented on MAPREDUCE-1720: -- I think that's a good idea. Jobs can be killed for a reason (say, hadoop job -kill JobID reason why we're killing it?), which can be included into the JobStatus data, same with the whois of the killer; but when it comes to failure, how do we deduce the 'reason' of failure -- just task numbers as explained in the MAPREDUCE-343 ticket? I also think that for Unsuccessful Jobs, displaying map and reduce progress percentages is not a good thing, as it is not very indicative of the actual progress (Always shows 100 or 0). We could remove this and claim some good real estate to display reasons (limited characters of it). 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: mapred.failed.killed.difference.png, mapreduce.unsuccessfuljobs.ui.r1.diff The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-2193: Assignee: Harsh J Chouraria 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Harsh J Chouraria Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2193) 13 Findbugs warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/MAPREDUCE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2193: - Attachment: hadoop-findbugs-report.html mapreduce.trunk.findbugs.r1.diff Attempting this cause it looks like an interesting task for a newcomer. Attached a patch that filters and fixes some of these warnings. [Also attached the resultant warning HTML page]. Please review :) Regarding the four remaining IS2_INCONSISTENT_SYNC warnings, I'm unable to think of a proper way to exclude them (as I think getTasks() doesn't require to be synchronized). I tried performing a Match on the Method, but findbugs is only interested in the Field as a whole. Ignoring the Fields (maps, reduces, setup, cleanup) for the entire JIP class doesn't look like a good idea to me. Any tips on how to resolve this filtering? 13 Findbugs warnings on trunk and branch-0.22 - Key: MAPREDUCE-2193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2193 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: findbugsWarnings.html, hadoop-findbugs-report.html, mapreduce.trunk.findbugs.r1.diff There are 13 findbugs warnings on trunk. See attached html file. These must be fixed or filtered out to get back to 0 warnings. The OK_FINDBUGS_WARNINGS property in src/test/test-patch.properties should also be set to 0 in the patch that fixes this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1811) Job.monitorAndPrintJob() should print status of the job at completion
[ https://issues.apache.org/jira/browse/MAPREDUCE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1811: - Attachment: mapreduce.job.monitor.status.r1.diff This can be a very helpful addition. Attaching a trivial patch that can spit out the job state in the final logs. I've moved the log call after the counters display; which I think makes more sense as a place for state + ID. Job.monitorAndPrintJob() should print status of the job at completion - Key: MAPREDUCE-1811 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1811 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Minor Fix For: 0.22.0 Attachments: mapreduce.job.monitor.status.r1.diff Job.monitorAndPrintJob() just prints Job Complete at the end of the job. It should print the state whether the job SUCCEEDED/FAILED/KILLED. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1811) Job.monitorAndPrintJob() should print status of the job at completion
[ https://issues.apache.org/jira/browse/MAPREDUCE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1811: Assignee: Harsh J Chouraria Job.monitorAndPrintJob() should print status of the job at completion - Key: MAPREDUCE-1811 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1811 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.22.0 Attachments: mapreduce.job.monitor.status.r1.diff Job.monitorAndPrintJob() just prints Job Complete at the end of the job. It should print the state whether the job SUCCEEDED/FAILED/KILLED. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1720: Assignee: Harsh J Chouraria 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1720: - Fix Version/s: (was: 0.20.3) 0.23.0 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1720) 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1720: - Attachment: mapred.failed.killed.difference.png mapreduce.unsuccessfuljobs.ui.r1.diff Attaching a patch that changes the Failed jobs section in the UI to Unsuccessful jobs and displays a reason column that clearly indicates whether a job failed on its own or was killed. The attached PNG image shows a screenshot of the same while executing mapred/TestJobKillAndFail !mapred.failed.killed.difference.png|thumbnail! [Also cleaned up the JSPUtil.generateJobTable(...) method as I was modifying it.] 'Killed' jobs and 'Failed' jobs should be displayed seperately in JobTracker UI Key: MAPREDUCE-1720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1720 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Environment: all Reporter: Subramaniam Krishnan Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: mapred.failed.killed.difference.png, mapreduce.unsuccessfuljobs.ui.r1.diff The JobTracker UI shows both Failed/Killed Jobs as Failed. The Killed job status has been separated from Failed as part of HADOOP-3924, so the UI needs to be updated to reflect the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 If the maximum attempts value is configured to Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, whereby no task is attempted by the cluster and the map tasks stay in the pending state forever. For example, here's a job driver that causes this: {code} import java.io.IOException; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.mapred.FileInputFormat; import org.apache.hadoop.mapred.JobClient; import org.apache.hadoop.mapred.JobConf; import org.apache.hadoop.mapred.TextInputFormat; import org.apache.hadoop.mapred.lib.IdentityMapper; import org.apache.hadoop.mapred.lib.NullOutputFormat; @SuppressWarnings("deprecation") public class IntegerOverflow { /** * @param args * @throws IOException */ @SuppressWarnings("deprecation") public static void main(String[] args) throws IOException { JobConf conf = new JobConf(); Path inputPath = new Path("ignore"); FileSystem fs = FileSystem.get(conf); if (!fs.exists(inputPath)) { FSDataOutputStream out = fs.create(inputPath); out.writeChars("Test"); out.close(); } conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(NullOutputFormat.class); FileInputFormat.addInputPath(conf, inputPath); conf.setMapperClass(IdentityMapper.class); conf.setNumMapTasks(1); // Problem inducing line follows. conf.setMaxMapAttempts(Integer.MAX_VALUE); // No reducer in this test, although setMaxReduceAttempts leads to the same problem. conf.setNumReduceTasks(0); JobClient.runJob(conf); } } {code} The above code will not let any map task run. 
Additionally, a log message would be written to the JobTracker logs, clearly showing the overflow: {code} 2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00' {code} The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method. {code} public Task getTaskToRun(String taskTracker) throws IOException { // Create the 'taskid'; do not count the 'killed' tasks against the job! TaskAttemptID taskid = null; /* THIS LINE v == */ if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) { /* THIS LINE ^ == */ // Make sure that the attempts are unique across restarts int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId; taskid = new TaskAttemptID(id, attemptId); ++nextTaskId; } else { LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) + " (plus " + numKilledTasks + " killed)" + " attempts for the tip '" + getTIPId() + "'"); return null; } {code} Since all three variables being added are of type int, one of them being Integer.MAX_VALUE makes the condition fail with an overflow, thereby logging and returning null because the result is negative. One solution would be to make one of these variables a long, so the addition does not overflow? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
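The wraparound described in the report is easy to reproduce outside Hadoop. The sketch below mirrors the failing condition with stand-in constants (MAX_TASK_EXECS = 1 and a killed count of 0 are hypothetical values for illustration, not Hadoop's actual fields) and shows how widening one operand to long keeps the sum in range:

```java
public class OverflowDemo {
    static final int MAX_TASK_EXECS = 1; // stand-in value for illustration
    static final int NUM_KILLED = 0;     // stand-in killed-task count

    // Mirrors the failing condition: all-int arithmetic wraps around
    // when maxTaskAttempts is Integer.MAX_VALUE.
    static boolean canRunInt(int nextTaskId, int maxTaskAttempts) {
        return nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + NUM_KILLED);
    }

    // Widening one operand to long promotes the whole sum to long,
    // so no overflow occurs.
    static boolean canRunLong(int nextTaskId, int maxTaskAttempts) {
        return nextTaskId < (MAX_TASK_EXECS + (long) maxTaskAttempts + NUM_KILLED);
    }

    public static void main(String[] args) {
        // 1 + Integer.MAX_VALUE wraps to Integer.MIN_VALUE (-2147483648),
        // matching the negative limit seen in the JobTracker log above.
        System.out.println(canRunInt(0, Integer.MAX_VALUE));  // false
        System.out.println(canRunLong(0, Integer.MAX_VALUE)); // true
    }
}
```

With the int version, the computed limit is -2147483648, so even taskId 0 fails the check and no attempt is ever scheduled; the long version behaves as intended.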
[jira] Commented: (MAPREDUCE-2236) No task may execute due to an Integer overflow possibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12975886#action_12975886 ] Harsh J Chouraria commented on MAPREDUCE-2236: -- We can also set a hard cap on the attempt count when it's being set, although how high that cap should be must be discussed. Integer.MAX_VALUE / 2 should do, I guess? No task may execute due to an Integer overflow possibility -- Key: MAPREDUCE-2236 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2236 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Environment: Linux, Hadoop 0.20.2 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Critical Fix For: 0.23.0 If the maximum attempts value is configured to Integer.MAX_VALUE, an overflow occurs inside TaskInProgress, whereby no task is attempted by the cluster and the map tasks stay in the pending state forever. For example, here's a job driver that causes this: {code} import java.io.IOException; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.mapred.FileInputFormat; import org.apache.hadoop.mapred.JobClient; import org.apache.hadoop.mapred.JobConf; import org.apache.hadoop.mapred.TextInputFormat; import org.apache.hadoop.mapred.lib.IdentityMapper; import org.apache.hadoop.mapred.lib.NullOutputFormat; @SuppressWarnings("deprecation") public class IntegerOverflow { /** * @param args * @throws IOException */ @SuppressWarnings("deprecation") public static void main(String[] args) throws IOException { JobConf conf = new JobConf(); Path inputPath = new Path("ignore"); FileSystem fs = FileSystem.get(conf); if (!fs.exists(inputPath)) { FSDataOutputStream out = fs.create(inputPath); out.writeChars("Test"); out.close(); } conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(NullOutputFormat.class); FileInputFormat.addInputPath(conf, inputPath); conf.setMapperClass(IdentityMapper.class); 
conf.setNumMapTasks(1); // Problem inducing line follows. conf.setMaxMapAttempts(Integer.MAX_VALUE); // No reducer in this test, although setMaxReduceAttempts leads to the same problem. conf.setNumReduceTasks(0); JobClient.runJob(conf); } } {code} The above code will not let any map task run. Additionally, a log message would be written to the JobTracker logs, clearly showing the overflow: {code} 2010-12-30 00:59:07,836 WARN org.apache.hadoop.mapred.TaskInProgress: Exceeded limit of -2147483648 (plus 0 killed) attempts for the tip 'task_201012300058_0001_m_00' {code} The issue lies inside the TaskInProgress class (/o/a/h/mapred/TaskInProgress.java), at line 1018 (trunk), part of the getTaskToRun(String taskTracker) method. {code} public Task getTaskToRun(String taskTracker) throws IOException { // Create the 'taskid'; do not count the 'killed' tasks against the job! TaskAttemptID taskid = null; /* THIS LINE v == */ if (nextTaskId < (MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks)) { /* THIS LINE ^ == */ // Make sure that the attempts are unique across restarts int attemptId = job.getNumRestarts() * NUM_ATTEMPTS_PER_RESTART + nextTaskId; taskid = new TaskAttemptID(id, attemptId); ++nextTaskId; } else { LOG.warn("Exceeded limit of " + (MAX_TASK_EXECS + maxTaskAttempts) + " (plus " + numKilledTasks + " killed)" + " attempts for the tip '" + getTIPId() + "'"); return null; } {code} Since all three variables being added are of type int, one of them being Integer.MAX_VALUE makes the condition fail with an overflow, thereby logging and returning null because the result is negative. One solution would be to make one of these variables a long, so the addition does not overflow? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
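The hard cap floated in the comment above could be applied defensively at configuration time. The sketch below is a hypothetical clamp, not Hadoop's actual JobConf.setMaxMapAttempts code, using the Integer.MAX_VALUE / 2 bound suggested in the comment:

```java
public class AttemptCap {
    // Suggested hard cap from the comment: half of Integer.MAX_VALUE,
    // which leaves headroom for the additions in TaskInProgress.
    static final int HARD_CAP = Integer.MAX_VALUE / 2;

    // Hypothetical defensive setter logic: clamp the requested
    // attempt count to the cap instead of storing it verbatim.
    static int clampAttempts(int requested) {
        return Math.min(requested, HARD_CAP);
    }

    public static void main(String[] args) {
        System.out.println(clampAttempts(4));                 // 4: normal values pass through
        System.out.println(clampAttempts(Integer.MAX_VALUE)); // 1073741823: capped
    }
}
```

With the cap in place, the int sum MAX_TASK_EXECS + maxTaskAttempts + numKilledTasks can no longer overflow for realistic killed-task counts.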
[jira] Assigned: (MAPREDUCE-1591) Add better javadocs for RawComparator interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned MAPREDUCE-1591: Assignee: Harsh J Chouraria Add better javadocs for RawComparator interface --- Key: MAPREDUCE-1591 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1591 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Chouraria Priority: Trivial Attachments: common.rawcomparator.jdoc.r1.diff The RawComparator interface is very important to understand for users implementing their own serialization classes. Right now the javadoc is woefully sparse. We should improve that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1591) Add better javadocs for RawComparator interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-1591: - Attachment: common.rawcomparator.jdoc.r1.diff Attaching a javadoc patch that hopefully adds more details to RawComparator. Please review (and correct me if I am wrong anywhere too!). Add better javadocs for RawComparator interface --- Key: MAPREDUCE-1591 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1591 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Trivial Attachments: common.rawcomparator.jdoc.r1.diff The RawComparator interface is very important to understand for users implementing their own serialization classes. Right now the javadoc is woefully sparse. We should improve that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-2170) Send out last-minute load averages in TaskTrackerStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2170: - Status: In Progress (was: Patch Available) Fixing the cause that led to a failing core test. [New patch upped -- r5] Send out last-minute load averages in TaskTrackerStatus --- Key: MAPREDUCE-2170 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2170 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Environment: GNU/Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.22.0 Attachments: mapreduce.loadaverage.r3.diff, mapreduce.loadaverage.r4.diff, mapreduce.loadaverage.r5.diff Original Estimate: 0.33h Remaining Estimate: 0.33h Load averages could be useful in scheduling. This patch looks to extend the existing Linux resource plugin (via /proc/loadavg file) to allow transmitting load averages of the last one minute via the TaskTrackerStatus. Patch is up for review, with test cases added, at: https://reviews.apache.org/r/20/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
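For context on the /proc/loadavg approach mentioned in the issue, a minimal standalone reader might look like the sketch below. This is an illustrative fragment, not the actual Linux resource plugin code from the patch:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class LoadAvgReader {
    // Parses the 1-minute load average from /proc/loadavg (Linux only).
    // Returns -1.0 if the file is unavailable (e.g. on non-Linux systems).
    static double oneMinuteLoad() {
        try (BufferedReader in = new BufferedReader(new FileReader("/proc/loadavg"))) {
            // Line format: "0.42 0.35 0.30 1/123 4567";
            // the first whitespace-separated field is the 1-minute average.
            return Double.parseDouble(in.readLine().split("\\s+")[0]);
        } catch (IOException e) {
            return -1.0;
        }
    }

    public static void main(String[] args) {
        System.out.println("1-min load average: " + oneMinuteLoad());
    }
}
```

The real plugin would carry this value through TaskTrackerStatus on each heartbeat so schedulers can consult it.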
[jira] Updated: (MAPREDUCE-2170) Send out last-minute load averages in TaskTrackerStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated MAPREDUCE-2170: - Status: Patch Available (was: In Progress) This should pass the tests. Had not added the TaskTracker.java diff previously, oops. Send out last-minute load averages in TaskTrackerStatus --- Key: MAPREDUCE-2170 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2170 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Environment: GNU/Linux Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Minor Fix For: 0.22.0 Attachments: mapreduce.loadaverage.r3.diff, mapreduce.loadaverage.r4.diff, mapreduce.loadaverage.r5.diff Original Estimate: 0.33h Remaining Estimate: 0.33h Load averages could be useful in scheduling. This patch looks to extend the existing Linux resource plugin (via /proc/loadavg file) to allow transmitting load averages of the last one minute via the TaskTrackerStatus. Patch is up for review, with test cases added, at: https://reviews.apache.org/r/20/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.