[jira] Updated: (MAPREDUCE-848) TestCapacityScheduler is failing

2009-08-12 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-848:
-

Status: Patch Available  (was: Open)

 TestCapacityScheduler is failing
 

 Key: MAPREDUCE-848
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-848
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Affects Versions: 0.21.0
Reporter: Devaraj Das
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPREDUCE-848-v1.0.patch


 Looks like the commit of HADOOP-805 broke the CapacityScheduler testcase. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2

2009-08-12 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742257#action_12742257
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-767:
---

Some comments:
1. Validators should not be removed. Move them into a method and call that
method from the streaming code.
2. Can you check that passing -jobconf x=y x1=y1 throws an exception? Can you
also verify whether it is easy to split the values from streaming and add them?
3. Pipes options should also be tested.

 to remove mapreduce dependency on commons-cli2
 --

 Key: MAPREDUCE-767
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-767
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
Reporter: Giridharan Kesavan
Assignee: Amar Kamat
 Attachments: MAPREDUCE-767-v1.1.patch


 mapreduce, streaming and the eclipse plugin depend on commons-cli2

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-842) Per-job local data on the TaskTracker node should have right access-control

2009-08-12 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742260#action_12742260
 ] 

Vinod K V commented on MAPREDUCE-842:
-

I just ran the test cases that touch the changes in the last patch. I also 
generated docs and verified the documentation changes.

The contrib test failures reported by Hudson are unrelated. Please see 
[MAPREDUCE-848] and [MAPREDUCE-699].

This patch is committable.

 Per-job local data on the TaskTracker node should have right access-control
 ---

 Key: MAPREDUCE-842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-842
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Attachments: HADOOP-4491-20090623-common.1.txt, 
 HADOOP-4491-20090623-mapred.1.txt, HADOOP-4491-20090703-common.1.txt, 
 HADOOP-4491-20090703-common.txt, HADOOP-4491-20090703.1.txt, 
 HADOOP-4491-20090703.txt, HADOOP-4491-20090707-common.txt, 
 HADOOP-4491-20090707.txt, HADOOP-4491-20090716-mapred.txt, 
 HADOOP-4491-20090803.1.txt, HADOOP-4491-20090803.txt, 
 HADOOP-4491-20090806.txt, HADOOP-4491-20090807.2.txt, 
 HADOOP-4491-20090810.1.txt, HADOOP-4491-20090810.3.txt, 
 HADOOP-4491-20090811.txt, HADOOP-4491-20090812.txt




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-849) Renaming of configuration property names in mapreduce

2009-08-12 Thread Amareshwari Sriramadasu (JIRA)
Renaming of configuration property names in mapreduce
-

 Key: MAPREDUCE-849
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-849
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0


In line with HDFS-531, property names in configuration files should be 
standardized in MAPREDUCE.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742266#action_12742266
 ] 

Devaraj Das commented on MAPREDUCE-430:
---

I propose that we catch Exception instead of Throwable in Child.java. Whatever 
is currently done inside the catch-Throwable block is retained; the block will 
simply be executed only when an Exception is caught. That is the only change 
the patch makes.
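
Purely as an illustration of that proposal (not the actual Child.java source; 
the class and method names below are placeholders), the change amounts to 
narrowing the caught type while keeping the handler body:

{code}
// Illustrative sketch only; ChildSketch, runTask and reportFailure are
// hypothetical stand-ins for the real org.apache.hadoop.mapred.Child code.
public class ChildSketch {
  public static void main(String[] args) {
    try {
      runTask();                  // stands in for the real task execution
    } catch (Exception e) {       // previously: catch (Throwable t)
      // The existing handling stays as it is; only the caught type narrows,
      // so Errors such as OutOfMemoryError are no longer swallowed here.
      reportFailure(e);
    }
  }

  private static void runTask() throws Exception {
    throw new Exception("simulated task failure");
  }

  private static void reportFailure(Exception e) {
    System.err.println("Task failed: " + e);
  }
}
{code}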

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.6-branch-0.20.patch, 
 MAPREDUCE-430-v1.6.patch


 Observed a task with an OutOfMemory error, stuck in cleanup.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-848) TestCapacityScheduler is failing

2009-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742283#action_12742283
 ] 

Hadoop QA commented on MAPREDUCE-848:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12416284/MAPREDUCE-848-v1.0.patch
  against trunk revision 803231.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/470/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/470/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/470/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/470/console

This message is automatically generated.

 TestCapacityScheduler is failing
 

 Key: MAPREDUCE-848
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-848
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Affects Versions: 0.21.0
Reporter: Devaraj Das
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPREDUCE-848-v1.0.patch


 Looks like the commit of HADOOP-805 broke the CapacityScheduler testcase. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-685) Sqoop will fail with OutOfMemory on large tables using mysql

2009-08-12 Thread Martin Dittus (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742345#action_12742345
 ] 

Martin Dittus commented on MAPREDUCE-685:
-

We just found that PostgreSQL shows the same behaviour. What do you think of 
making this a generic fix instead? It seems Postgres has the same mechanism to 
enable streaming of ResultSets:

http://jdbc.postgresql.org/documentation/83/query.html -- "Changing code to 
cursor mode is as simple as setting the fetch size of the Statement to the 
appropriate size. Setting the fetch size back to 0 will cause all rows to be 
cached (the default behaviour)."
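
For concreteness, here is a minimal sketch of that cursor-mode setup with the 
PostgreSQL JDBC driver; the fetch size and surrounding code are illustrative 
assumptions, not Sqoop code:

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PostgresCursorSketch {
  // PostgreSQL streams results only when autocommit is off and a positive
  // fetch size is set; fetch size 0 restores the default "cache all rows".
  static ResultSet streamingQuery(Connection conn, String sql) throws SQLException {
    conn.setAutoCommit(false);
    Statement stmt = conn.createStatement();
    stmt.setFetchSize(1000);   // illustrative batch size
    return stmt.executeQuery(sql);
  }
}
{code}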

 Sqoop will fail with OutOfMemory on large tables using mysql
 

 Key: MAPREDUCE-685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-685.3.patch, MAPREDUCE-685.patch, 
 MAPREDUCE-685.patch.2


 The default MySQL JDBC client behavior is to buffer the entire ResultSet in 
 the client before allowing the user to use the ResultSet object. On large 
 SELECTs, this can cause OutOfMemory exceptions, even when the client intends 
 to close the ResultSet after reading only a few rows. The MySQL ConnManager 
 should configure its connection to use row-at-a-time delivery of results to 
 the client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-814) Move completed Job history files to HDFS

2009-08-12 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-814:
-

Release Note: Provides the ability to move completed job history files to an 
HDFS location by configuring mapred.job.tracker.history.completed.location. 
If the directory does not already exist, it will be created by the 
jobtracker.  (was: Provides the ability to move completed job history files to 
an HDFS location by configuring mapred.job.tracker.history.completed.location.)
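
As an illustration only (the HDFS path below is made up), the property can be 
set through the standard Configuration API or in mapred-site.xml:

{code}
import org.apache.hadoop.conf.Configuration;

public class HistoryLocationSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Example value only; point this at a directory the jobtracker may create.
    conf.set("mapred.job.tracker.history.completed.location",
             "hdfs://namenode:8020/mapred/history/done");
  }
}
{code}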

 Move completed Job history files to HDFS
 

 Key: MAPREDUCE-814
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-814
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobtracker
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal
 Fix For: 0.21.0

 Attachments: 814_v1.patch, 814_v2.patch, 814_v3.patch, 814_v4.patch, 
 814_v5.patch, 814_ydist.patch


 Currently completed job history files remain on the jobtracker node. Having 
 the files available on HDFS will enable clients to access these files more 
 easily.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-850) PriorityScheduler should use TaskTrackerManager.killJob() instead of JobInProgress.kill()

2009-08-12 Thread Amar Kamat (JIRA)
PriorityScheduler should use TaskTrackerManager.killJob() instead of 
JobInProgress.kill()
-

 Key: MAPREDUCE-850
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-850
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amar Kamat




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-817) Add a cache for retired jobs with minimal job info and provide a way to access history file url

2009-08-12 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-817:
-

Attachment: 817_ydist_new.patch

New patch for Yahoo's distribution. It does NOT introduce client side API 
changes.

 Add a cache for retired jobs with minimal job info and provide a way to 
 access history file url
 ---

 Key: MAPREDUCE-817
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-817
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: client, jobtracker
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal
 Fix For: 0.21.0

 Attachments: 817_v1.patch, 817_v2.patch, 817_v3.patch, 
 817_ydist.patch, 817_ydist_new.patch


 MAPREDUCE-814 will provide a way to keep the job history files in HDFS. There 
 should be a way to get the URL for the completed job history file. Completed 
 jobs can then be purged from the jobtracker's memory more aggressively, since 
 clients can retrieve the information from the history file. The jobtracker can 
 maintain just the very basic info about completed jobs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-842) Per-job local data on the TaskTracker node should have right access-control

2009-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742366#action_12742366
 ] 

Hadoop QA commented on MAPREDUCE-842:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12416292/HADOOP-4491-20090812.txt
  against trunk revision 803231.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 50 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/471/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/471/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/471/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/471/console

This message is automatically generated.

 Per-job local data on the TaskTracker node should have right access-control
 ---

 Key: MAPREDUCE-842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-842
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Attachments: HADOOP-4491-20090623-common.1.txt, 
 HADOOP-4491-20090623-mapred.1.txt, HADOOP-4491-20090703-common.1.txt, 
 HADOOP-4491-20090703-common.txt, HADOOP-4491-20090703.1.txt, 
 HADOOP-4491-20090703.txt, HADOOP-4491-20090707-common.txt, 
 HADOOP-4491-20090707.txt, HADOOP-4491-20090716-mapred.txt, 
 HADOOP-4491-20090803.1.txt, HADOOP-4491-20090803.txt, 
 HADOOP-4491-20090806.txt, HADOOP-4491-20090807.2.txt, 
 HADOOP-4491-20090810.1.txt, HADOOP-4491-20090810.3.txt, 
 HADOOP-4491-20090811.txt, HADOOP-4491-20090812.txt




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-840) DBInputFormat leaves open transaction

2009-08-12 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-840:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Aaron!

 DBInputFormat leaves open transaction
 -

 Key: MAPREDUCE-840
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Minor
 Fix For: 0.21.0

 Attachments: MAPREDUCE-840.patch


 DBInputFormat.getSplits() does not call connection.commit() after the COUNT 
 query. This can leave an open transaction against the database, which 
 interferes with other connections to the same table.
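
A minimal sketch of the implied fix, assuming plain JDBC (the method name and 
structure are hypothetical, not the actual DBInputFormat code): commit once the 
COUNT result has been read so the implicit transaction is released.

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CountThenCommitSketch {
  static long countRows(Connection conn, String table) throws SQLException {
    Statement st = conn.createStatement();
    try {
      ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table);
      rs.next();
      long count = rs.getLong(1);
      conn.commit();   // close the transaction opened implicitly by the query
      return count;
    } finally {
      st.close();
    }
  }
}
{code}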

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-706) Support for FIFO pools in the fair scheduler

2009-08-12 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742414#action_12742414
 ] 

Matei Zaharia commented on MAPREDUCE-706:
-

Contrib test failures are again unrelated.

 Support for FIFO pools in the fair scheduler
 

 Key: MAPREDUCE-706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-706
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/fair-share
Reporter: Matei Zaharia
Assignee: Matei Zaharia
 Attachments: fsdesigndoc.pdf, fsdesigndoc.tex, mapreduce-706.patch, 
 mapreduce-706.v1.patch, mapreduce-706.v2.patch, mapreduce-706.v3.patch, 
 mapreduce-706.v4.patch


 The fair scheduler should support making the internal scheduling algorithm 
 for some pools be FIFO instead of fair sharing in order to work better for 
 batch workloads. FIFO pools will behave exactly like the current default 
 scheduler, sorting jobs by priority and then submission time. Pools will have 
 their scheduling algorithm set through the pools config file, and it will be 
 changeable at runtime.
 To support this feature, I'm also changing the internal logic of the fair 
 scheduler to no longer use deficits. Instead, for fair sharing, we will 
 assign tasks to the job farthest below its share as a ratio of its share. 
 This is easier to combine with other scheduling algorithms and leads to a 
 more stable sharing situation, avoiding unfairness issues brought up in 
 MAPREDUCE-543 and MAPREDUCE-544 that happen when some jobs have long tasks. 
 The new preemption (MAPREDUCE-551) will ensure that critical jobs can gain 
 their fair share within a bounded amount of time.
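
A rough sketch of the "farthest below its share as a ratio of its share" rule 
described above, using a stand-in JobInfo type rather than the real 
fair-scheduler classes:

{code}
import java.util.List;

public class FairShareSketch {
  static class JobInfo {
    int runningTasks;
    double fairShare;   // number of slots this job deserves
  }

  // Pick the job whose running-task count is the smallest fraction of its share.
  static JobInfo pickNeediestJob(List<JobInfo> jobs) {
    JobInfo neediest = null;
    double bestRatio = Double.MAX_VALUE;
    for (JobInfo job : jobs) {
      if (job.fairShare <= 0) {
        continue;                 // jobs with no share are never the neediest
      }
      double ratio = job.runningTasks / job.fairShare;
      if (ratio < bestRatio) {
        bestRatio = ratio;
        neediest = job;
      }
    }
    return neediest;
  }
}
{code}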

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-851) Static job property accessors don't accept Configuration but various JobContext sub-classes

2009-08-12 Thread Chris K Wensel (JIRA)
Static job property accessors don't accept Configuration but various JobContext 
sub-classes
--

 Key: MAPREDUCE-851
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-851
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 0.20.1
Reporter: Chris K Wensel


The current method of accepting only JobContext or one of its sub-classes adds 
much complexity to dynamic job configuration builders that manipulate the 
Configuration object in order to dynamically configure Hadoop jobs, and 
influence internal Hadoop sub-systems during runtime to provide higher level 
functions and features.
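
For illustration (using the new org.apache.hadoop.mapreduce API; the output 
path and scenario are made up), code that only holds a Configuration has to 
construct a Job wrapper merely to call such a static accessor:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StaticAccessorSketch {
  static void configureOutput(Configuration conf) throws Exception {
    Job job = new Job(conf);   // wrapper constructed only to make the call
    FileOutputFormat.setOutputPath(job, new Path("/tmp/example-output"));
  }
}
{code}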

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-851) Static job property accessors don't accept Configuration but various JobContext sub-classes

2009-08-12 Thread Chris K Wensel (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris K Wensel updated MAPREDUCE-851:
-

Issue Type: Bug  (was: Improvement)

 Static job property accessors don't accept Configuration but various 
 JobContext sub-classes
 --

 Key: MAPREDUCE-851
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-851
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 0.20.1
Reporter: Chris K Wensel

 The current method of accepting only JobContext or one of its sub-classes 
 adds much complexity to dynamic job configuration builders that manipulate 
 the Configuration object in order to dynamically configure Hadoop jobs, and 
 influence internal Hadoop sub-systems during runtime to provide higher level 
 functions and features.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-685) Sqoop will fail with OutOfMemory on large tables using mysql

2009-08-12 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742426#action_12742426
 ] 

Aaron Kimball commented on MAPREDUCE-685:
-

Because it's not actually the same fix ;) PostgreSQL wants you to call 
{{statement.setFetchSize(something_reasonable)}}, e.g., 40.

MySQL wants you to call {{statement.setFetchSize(INT_MIN)}}. The only cursor 
modes MySQL supports are fully buffered (fetch size = 0) and fully row-wise 
cursors (fetch size = INT_MIN).

That having been said, I have just finished a PostgreSQL patch and it is ready 
to post here this week :) I'm just waiting for some existing patches to get 
committed first so that it applies cleanly.
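
For comparison with the PostgreSQL snippet earlier in this thread, a sketch of 
the MySQL row-at-a-time setup described above (illustrative only, not the Sqoop 
ConnManager code):

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class MySqlStreamingSketch {
  // MySQL's driver streams rows one at a time only for a forward-only,
  // read-only statement whose fetch size is Integer.MIN_VALUE; any other
  // setting buffers the whole ResultSet on the client.
  static ResultSet streamingQuery(Connection conn, String sql) throws SQLException {
    Statement stmt = conn.createStatement(
        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE);
    return stmt.executeQuery(sql);
  }
}
{code}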

 Sqoop will fail with OutOfMemory on large tables using mysql
 

 Key: MAPREDUCE-685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-685.3.patch, MAPREDUCE-685.patch, 
 MAPREDUCE-685.patch.2


 The default MySQL JDBC client behavior is to buffer the entire ResultSet in 
 the client before allowing the user to use the ResultSet object. On large 
 SELECTs, this can cause OutOfMemory exceptions, even when the client intends 
 to close the ResultSet after reading only a few rows. The MySQL ConnManager 
 should configure its connection to use row-at-a-time delivery of results to 
 the client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-842) Per-job local data on the TaskTracker node should have right access-control

2009-08-12 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-842:
---

   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Vinod!

 Per-job local data on the TaskTracker node should have right access-control
 ---

 Key: MAPREDUCE-842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-842
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Fix For: 0.21.0

 Attachments: HADOOP-4491-20090623-common.1.txt, 
 HADOOP-4491-20090623-mapred.1.txt, HADOOP-4491-20090703-common.1.txt, 
 HADOOP-4491-20090703-common.txt, HADOOP-4491-20090703.1.txt, 
 HADOOP-4491-20090703.txt, HADOOP-4491-20090707-common.txt, 
 HADOOP-4491-20090707.txt, HADOOP-4491-20090716-mapred.txt, 
 HADOOP-4491-20090803.1.txt, HADOOP-4491-20090803.txt, 
 HADOOP-4491-20090806.txt, HADOOP-4491-20090807.2.txt, 
 HADOOP-4491-20090810.1.txt, HADOOP-4491-20090810.3.txt, 
 HADOOP-4491-20090811.txt, HADOOP-4491-20090812.txt




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-801) MAPREDUCE framework should issue warning with too many locations for a split

2009-08-12 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742443#action_12742443
 ] 

Hong Tang commented on MAPREDUCE-801:
-

Yet another solution, which I think is more general, has been proposed, and a 
JIRA, MAPREDUCE-841, has been created to track it.

 MAPREDUCE framework should issue warning with too many locations for a split
 

 Key: MAPREDUCE-801
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-801
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Hong Tang

 A customized input format may be buggy and report misleading locations through 
 its input splits; an example is PIG-878. When an input split returns too many 
 locations, it not only artificially inflates the percentage of data-local or 
 rack-local maps, but also forces the scheduler to use more memory and work 
 harder to conduct task assignment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-706) Support for FIFO pools in the fair scheduler

2009-08-12 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742445#action_12742445
 ] 

Tom White commented on MAPREDUCE-706:
-

+1

What testing did you carry out on this?

I agree the documentation is excellent. Can you add it to the source tree?



 Support for FIFO pools in the fair scheduler
 

 Key: MAPREDUCE-706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-706
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/fair-share
Reporter: Matei Zaharia
Assignee: Matei Zaharia
 Attachments: fsdesigndoc.pdf, fsdesigndoc.tex, mapreduce-706.patch, 
 mapreduce-706.v1.patch, mapreduce-706.v2.patch, mapreduce-706.v3.patch, 
 mapreduce-706.v4.patch


 The fair scheduler should support making the internal scheduling algorithm 
 for some pools be FIFO instead of fair sharing in order to work better for 
 batch workloads. FIFO pools will behave exactly like the current default 
 scheduler, sorting jobs by priority and then submission time. Pools will have 
 their scheduling algorithm set through the pools config file, and it will be 
 changeable at runtime.
 To support this feature, I'm also changing the internal logic of the fair 
 scheduler to no longer use deficits. Instead, for fair sharing, we will 
 assign tasks to the job farthest below its share as a ratio of its share. 
 This is easier to combine with other scheduling algorithms and leads to a 
 more stable sharing situation, avoiding unfairness issues brought up in 
 MAPREDUCE-543 and MAPREDUCE-544 that happen when some jobs have long tasks. 
 The new preemption (MAPREDUCE-551) will ensure that critical jobs can gain 
 their fair share within a bounded amount of time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-845) build.xml hard codes findbugs heap size, in some configurations 512M is insufficient to successfully build

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742483#action_12742483
 ] 

Hudson commented on MAPREDUCE-845:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 build.xml hard codes findbugs heap size, in some configurations 512M is 
 insufficient to successfully build
 --

 Key: MAPREDUCE-845
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-845
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.0
 Environment: building on RHEL5 with both javadoc and findbugs in the 
 same line
Reporter: Lee Tucker
Assignee: Lee Tucker
Priority: Minor
 Fix For: 0.21.0

 Attachments: MAPRED-845.patch

   Original Estimate: 0.03h
  Remaining Estimate: 0.03h

 When attempting the build with the hardcoded value of 512M for findbugs heap 
 size, the build fails with:
  [findbugs] Java Result: -1
  [xslt] Processing 
 /grid/0/gs/gridre/SpringMapRedLevel2/build/test/findbugs/hadoop-findbugs-report.xml
  to 
 /grid/0/gs/gridre/SpringMapRedLevel2/build/test/findbugs/hadoop-findbugs-report.html
  [xslt] Loading stylesheet 
 /homes/hadoopqa/tools/findbugs/latest/src/xsl/default.xsl
  [xslt] : Error! Premature end of file.
  [xslt] : Error! 
 com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end 
 of file.
  [xslt] Failed to process 
 /grid/0/gs/gridre/SpringMapRedLevel2/build/test/findbugs/hadoop-findbugs-report.xml
 BUILD FAILED

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-805) Deadlock in Jobtracker

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742485#action_12742485
 ] 

Hudson commented on MAPREDUCE-805:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 Deadlock in Jobtracker
 --

 Key: MAPREDUCE-805
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-805
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Michael Tamm
 Fix For: 0.20.1

 Attachments: MAPREDUCE-805-v1.1.patch, 
 MAPREDUCE-805-v1.11-branch-0.20.patch, MAPREDUCE-805-v1.11.patch, 
 MAPREDUCE-805-v1.12-branch-0.20.patch, MAPREDUCE-805-v1.12.patch, 
 MAPREDUCE-805-v1.2.patch, MAPREDUCE-805-v1.3.patch, MAPREDUCE-805-v1.6.patch, 
 MAPREDUCE-805-v1.7.patch


 We are running a hadoop cluster (version 0.20.0) and have detected the 
 following deadlock on our jobtracker:
 {code}
 IPC Server handler 51 on 9001:
   at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:943)
   - waiting to lock <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
   at org.apache.hadoop.mapred.JobTracker.getJobCounters(JobTracker.java:3102)
   - locked <0x7f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
   at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

 pool-1-thread-2:
   at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2017)
   - waiting to lock <0x7f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
   at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2483)
   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
   at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2152)
   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
   at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:2169)
   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
   at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2245)
   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
   at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:86)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {code}
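
The dump boils down to a classic lock-ordering inversion; a stripped-down 
illustration follows (stand-in lock objects, not the real JobTracker and 
JobInProgress instances that the actual code synchronizes on):

{code}
public class LockOrderSketch {
  private final Object jobTrackerLock = new Object();
  private final Object jobInProgressLock = new Object();

  // Path taken by the IPC handler: JobTracker lock first, then JobInProgress.
  void getJobCounters() {
    synchronized (jobTrackerLock) {
      synchronized (jobInProgressLock) {
        // read counters
      }
    }
  }

  // Path taken by the init/fail thread: JobInProgress lock first, then JobTracker.
  void failJob() {
    synchronized (jobInProgressLock) {
      synchronized (jobTrackerLock) {
        // finalize job
      }
    }
  }
}
{code}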

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742482#action_12742482
 ] 

Hudson commented on MAPREDUCE-372:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-372-1.txt, patch-372.txt




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-840) DBInputFormat leaves open transaction

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742489#action_12742489
 ] 

Hudson commented on MAPREDUCE-840:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])
MAPREDUCE-840. DBInputFormat leaves open transaction. Contributed by Aaron Kimball.


 DBInputFormat leaves open transaction
 -

 Key: MAPREDUCE-840
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Minor
 Fix For: 0.21.0

 Attachments: MAPREDUCE-840.patch


 DBInputFormat.getSplits() does not call connection.commit() after the COUNT 
 query. This can leave an open transaction against the database, which 
 interferes with other connections to the same table.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-838) Task succeeds even when committer.commitTask fails with IOException

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742487#action_12742487
 ] 

Hudson commented on MAPREDUCE-838:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 Task succeeds even when committer.commitTask fails with IOException
 ---

 Key: MAPREDUCE-838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-838
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.20.1
Reporter: Koji Noguchi
Assignee: Amareshwari Sriramadasu
Priority: Blocker
 Fix For: 0.20.1

 Attachments: patch-838-0.20.txt, patch-838-1-0.20.txt, 
 patch-838-1.txt, patch-838.txt


 In MAPREDUCE-837, a job succeeded with empty output even though all the tasks 
 were throwing IOException at committer.commitTask.
 {noformat}
 2009-08-07 17:51:47,458 INFO org.apache.hadoop.mapred.TaskRunner: Task attempt_200907301448_8771_r_00_0 is allowed to commit now
 2009-08-07 17:51:47,466 WARN org.apache.hadoop.mapred.TaskRunner: Failure committing: java.io.IOException: Can not get the relative path: \
 base = hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0 \
 child = hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
   at org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
   at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
   at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
   at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
   at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
   at org.apache.hadoop.mapred.Task.commit(Task.java:768)
   at org.apache.hadoop.mapred.Task.done(Task.java:692)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
   at org.apache.hadoop.mapred.Child.main(Child.java:170)
 2009-08-07 17:51:47,468 WARN org.apache.hadoop.mapred.TaskRunner: Failure asking whether task can commit: java.io.IOException: \
 Can not get the relative path: base = hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0 \
 child = hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
   at org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
   at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
   at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
   at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
   at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
   at org.apache.hadoop.mapred.Task.commit(Task.java:768)
   at org.apache.hadoop.mapred.Task.done(Task.java:692)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
   at org.apache.hadoop.mapred.Child.main(Child.java:170)
 2009-08-07 17:51:47,469 INFO org.apache.hadoop.mapred.TaskRunner: Task attempt_200907301448_8771_r_00_0 is allowed to commit now
 2009-08-07 17:51:47,472 INFO org.apache.hadoop.mapred.TaskRunner: Task 'attempt_200907301448_8771_r_00_0' done.
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-848) TestCapacityScheduler is failing

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742491#action_12742491
 ] 

Hudson commented on MAPREDUCE-848:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])
MAPREDUCE-848. Fixes a problem to do with TestCapacityScheduler failing. 
Contributed by Amar Kamat.


 TestCapacityScheduler is failing
 

 Key: MAPREDUCE-848
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-848
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched
Affects Versions: 0.21.0
Reporter: Devaraj Das
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPREDUCE-848-v1.0.patch


 Looks like the commit of HADOOP-805 broke the CapacityScheduler testcase. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-789) Oracle support for Sqoop

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742493#action_12742493
 ] 

Hudson commented on MAPREDUCE-789:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])
MAPREDUCE-789. Oracle support for Sqoop. Contributed by Aaron Kimball.


 Oracle support for Sqoop
 

 Key: MAPREDUCE-789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-789
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-789.patch


 A separate ConnManager is needed for Oracle to support its slightly different 
 syntax and configuration

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742492#action_12742492
 ] 

Hudson commented on MAPREDUCE-779:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 Add node health failures into JobTrackerStatistics
 --

 Key: MAPREDUCE-779
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Reporter: Sreekanth Ramakrishnan
Assignee: Sreekanth Ramakrishnan
 Fix For: 0.21.0

 Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch, 
 mapreduce-779-3.patch, mapreduce-779-4.patch


 Add the node health failure counts into {{JobTrackerStatistics}}.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-808) Buffer objects incorrectly serialized to typed bytes

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742488#action_12742488
 ] 

Hudson commented on MAPREDUCE-808:
--

Integrated in Hadoop-Mapreduce-trunk #46 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/46/])


 Buffer objects incorrectly serialized to typed bytes
 

 Key: MAPREDUCE-808
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-808
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.21.0
Reporter: Klaas Bosteels
Assignee: Klaas Bosteels
 Fix For: 0.21.0

 Attachments: MAPREDUCE-808.patch


 {{TypedBytesOutput.write()}} should do something like
 {code}
 Buffer buf = (Buffer) obj;
 writeBytes(buf.get(), 0, buf.getCount());
 {code}
 instead of
 {code}
 writeBytes(((Buffer) obj).get());
 {code}
 since the bytes returned by {{Buffer.get()}} are only valid between 0 and 
 getCount() - 1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-852) ExampleDriver is incorrectly set as a Main-Class in tools in build.xml

2009-08-12 Thread Tsz Wo (Nicholas), SZE (JIRA)
ExampleDriver is incorrectly set as a Main-Class in tools in build.xml
--

 Key: MAPREDUCE-852
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-852
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Tsz Wo (Nicholas), SZE


In build.xml,
{code}
  <target name="examples" depends="jar, compile-examples"
          description="Make the Hadoop examples jar.">
...

  <target name="tools-jar" depends="jar, compile-tools"
          description="Make the Hadoop tools jar.">
    <jar jarfile="${build.dir}/${tools.final.name}.jar"
         basedir="${build.tools}">
      <manifest>
        <attribute name="Main-Class"
                   value="org/apache/hadoop/examples/ExampleDriver"/>
      </manifest>
    </jar>
  </target>
{code}
- ExampleDriver should not be the Main-Class of the tools jar.
- Should we rename the target from tools-jar to tools, so that the name is 
consistent with the examples target?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-852) ExampleDriver is incorrectly set as a Main-Class in tools in build.xml

2009-08-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-852:
-

Attachment: m852_20090812.patch

m852_20090812.patch: renamed tools-jar to tools and removed ExampleDriver from 
tools.

 ExampleDriver is incorrectly set as a Main-Class in tools in build.xml
 --

 Key: MAPREDUCE-852
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-852
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Tsz Wo (Nicholas), SZE
 Attachments: m852_20090812.patch


 In build.xml,
 {code}
   <target name="examples" depends="jar, compile-examples"
           description="Make the Hadoop examples jar.">
 ...
   <target name="tools-jar" depends="jar, compile-tools"
           description="Make the Hadoop tools jar.">
     <jar jarfile="${build.dir}/${tools.final.name}.jar"
          basedir="${build.tools}">
       <manifest>
         <attribute name="Main-Class"
                    value="org/apache/hadoop/examples/ExampleDriver"/>
       </manifest>
     </jar>
   </target>
 {code}
 - ExampleDriver should not be the Main-Class of the tools jar.
 - Should we rename the target from tools-jar to tools, so that the name is 
 consistent with the examples target?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-825) JobClient completion poll interval of 5s causes slow tests in local mode

2009-08-12 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742639#action_12742639
 ] 

Todd Lipcon commented on MAPREDUCE-825:
---

Patch looks good to me. +1

 JobClient completion poll interval of 5s causes slow tests in local mode
 

 Key: MAPREDUCE-825
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-825
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Minor
 Attachments: completion-poll-interval.patch, MAPREDUCE-825.2.patch


 The JobClient.NetworkedJob.waitForCompletion() method polls for job 
 completion every 5 seconds. When running a set of short tests in 
 pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted 
 time. When bandwidth is not scarce, setting the poll interval to 100 ms 
 results in a 4x speedup in some tests.  This interval should be parametrized 
 to allow users to control the interval for testing purposes.
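
A sketch of what the parametrized poll might look like; the property name used 
here is purely an assumption for illustration, not necessarily what the patch 
introduces:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.RunningJob;

public class CompletionPollSketch {
  static void waitForCompletion(RunningJob job, Configuration conf)
      throws Exception {
    // Hypothetical key; the default matches the current hard-coded 5 seconds.
    long pollMillis = conf.getLong("jobclient.completion.poll.interval", 5000L);
    while (!job.isComplete()) {
      Thread.sleep(pollMillis);
    }
  }
}
{code}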

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-478) separate jvm param for mapper and reducer

2009-08-12 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742681#action_12742681
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-478:
--

Also, on second thought, in my opinion 
[HADOOP-6105|http://issues.apache.org/jira/browse/HADOOP-6105] actually helps 
with the issue mentioned here: it provides an automatic facility to split an 
old key into two new keys.

 separate jvm param for mapper and reducer
 -

 Key: MAPREDUCE-478
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-478
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Koji Noguchi
Assignee: Arun C Murthy
Priority: Minor
 Fix For: 0.21.0

 Attachments: HADOOP-5684_0_20090420.patch, 
 MAPREDUCE-478_0_20090804.patch, MAPREDUCE-478_0_20090804_yhadoop20.patch, 
 MAPREDUCE-478_1_20090806.patch, MAPREDUCE-478_1_20090806_yhadoop20.patch


 The memory footprints of the mapper and reducer can differ. 
 It would be nice if we could pass different JVM params (mapred.child.java.opts) 
 for mappers and reducers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-08-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742712#action_12742712
 ] 

Owen O'Malley commented on MAPREDUCE-157:
-

I'm confused about what the goal of using Avro here would be.

Let's review the goals:
  1. Get an easily parseable text format.
  2. Not require excessive amounts of time for logging.
     2a. Not require excessive object allocations.

It seems like, to use Avro, we'd need to create the Avro objects and then write 
them out. I'd rather just use a JsonWriter to write the events out to the 
stream; reading, of course, is the reverse. Using Avro would be like writing 
XML files by generating the necessary DOM objects. You can do it (and in fact 
Configuration is written that way, *sigh*), but it costs a lot of time.

Not having seen the Avro text format, I can't evaluate how much overhead it 
adds. None of Avro's features seem compelling in this case, and they could 
easily lead to unfortunate choices.

Furthermore, I don't know if there are any guarantees about the Avro text 
format's stability. We need stability in this format.
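
To make the JsonWriter idea concrete, here is a rough sketch using Jackson's 
streaming generator; the choice of Jackson and the event field names are 
assumptions for illustration, not part of the comment above:

{code}
import java.io.Writer;
import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonGenerator;

public class HistoryEventSketch {
  // Write one history event as a single JSON object per line, with no
  // intermediate event objects allocated beyond the generator itself.
  static void writeTaskFinished(Writer out, String taskId, long finishTime)
      throws Exception {
    JsonGenerator gen = new JsonFactory().createJsonGenerator(out);
    gen.writeStartObject();
    gen.writeStringField("event", "TASK_FINISHED");
    gen.writeStringField("taskid", taskId);
    gen.writeNumberField("finishTime", finishTime);
    gen.writeEndObject();
    gen.writeRaw('\n');
    gen.flush();
  }
}
{code}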

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan

 Currently, parsing the job history logs with external tools is very difficult 
 because of the format. The most critical problem is that newlines aren't 
 escaped in the strings. That makes using tools like grep, sed, and awk very 
 tricky.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-08-12 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742714#action_12742714
 ] 

Philip Zeyliger commented on MAPREDUCE-157:
---

Avro would force you into a schema, and I think having a schema is the only 
way to get stability in the format.  Yes, there's probably overhead, but if 
we're using Avro for other things (i.e., all RPCs), we may as well fix those 
overheads when we get to them.  (It may also be a net win to store the data in 
binary avro format, and write an avrocat to deserialize into text before 
pushing to tools like awk, but I do understand the desire for a text format.)

All that said, you have specific needs in mind here, and I'm mostly waxing 
poetical, so I'll certainly defer.

-- Philip

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan

 Currently, parsing the job history logs with external tools is very difficult 
 because of the format. The most critical problem is that newlines aren't 
 escaped in the strings. That makes using tools like grep, sed, and awk very 
 tricky.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-854) JobInProgress.initTasks() should not throw KillInterruptedException

2009-08-12 Thread Amar Kamat (JIRA)
 JobInProgress.initTasks() should not throw KillInterruptedException


 Key: MAPREDUCE-854
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-854
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat
Assignee: Amar Kamat


JobInProgress.initTasks() throws KillInterruptedException if it is killed 
during init. This is bad programming practice.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-855) Testcases faking TaskTrackerManager might result in an NPE

2009-08-12 Thread Amar Kamat (JIRA)
Testcases faking TaskTrackerManager might result in an NPE 
--

 Key: MAPREDUCE-855
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-855
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amar Kamat
Assignee: Amar Kamat


JobInProgress uses JobTracker.getClock(), assuming that the JobTracker is 
initialized before the JobInProgress is created. This might not be true, as a 
testcase might fake the TaskTrackerManager. In such cases the JobInProgress 
might hit an NPE.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-853) Support a hierarchy of queues in the Map/Reduce framework

2009-08-12 Thread Hemanth Yamijala (JIRA)
Support a hierarchy of queues in the Map/Reduce framework
-

 Key: MAPREDUCE-853
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-853
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobtracker
Reporter: Hemanth Yamijala
 Fix For: 0.21.0


In MAPREDUCE-824, we proposed introducing a hierarchy of queues in the capacity 
scheduler. Currently, the M/R framework provides the notion of job queues and 
handles some functionality related to queues in a scheduler-agnostic manner. 
This functionality includes:
- Managing the list of ACLs for queues
- Managing the run state of queues - running or stopped
- Displaying scheduling information about queues in the jobtracker web UI and 
job client CLI
- Displaying list of jobs in a queue in the jobtracker web UI and job client CLI
- Providing APIs for list queues and queue information in JobClient.

Since it would be beneficial to extend this functionality to hierarchical 
queues, this JIRA proposes introducing the concept into the Map/Reduce 
framework as well. We could treat this as an umbrella JIRA and file additional 
tasks for each of the changes involved, sticking to the high-level approach 
described here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-828) Provide a mechanism to pause the jobtracker

2009-08-12 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742726#action_12742726
 ] 

eric baldeschwieler commented on MAPREDUCE-828:
---

{quote}
Jobs submitted to the JT will be queued up. However, if the job client fails 
to write the job files to the DFS (the step before job submission), those jobs 
will be lost.
{quote}

Presumably the client can detect this and fail?

What does the client do if it tries to submit a job to a paused JT?

Why would the JT not process heartbeats normally (without scheduling new 
tasks)? The new state seems hackish.



 Provide a mechanism to pause the jobtracker
 ---

 Key: MAPREDUCE-828
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-828
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobtracker
Reporter: Hemanth Yamijala

 We've seen scenarios when we have needed to stop the namenode for a 
 maintenance activity. In such scenarios, if the jobtracker (JT) continues to 
 run, jobs would fail due to initialization or task failures (due to DFS). We 
 could restart the JT enabling job recovery, during such scenarios. But 
 restart has proved to be a very intrusive activity, particularly if the JT is 
 not at fault itself and does not require a restart. The ask is for an 
 admin-controlled feature to pause the JT, which would take it to a state 
 somewhat analogous to the safe mode of DFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.