[jira] Updated: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called

2009-12-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1152:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks, Sharad!

 JobTrackerInstrumentation.killed{Map/Reduce} is never called
 

 Key: MAPREDUCE-1152
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Sharad Agarwal
 Fix For: 0.22.0

 Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch


 The JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of 
 MAPREDUCE-1103 are not captured.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-372:
--

Status: Patch Available  (was: Open)

 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, 
 patch-372-1.txt, patch-372-2.txt, patch-372.txt







[jira] Updated: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-372:
--

Attachment: patch-372-2.txt

Patch with review comments incorporated.

 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, 
 patch-372-1.txt, patch-372-2.txt, patch-372.txt







[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words

2009-12-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1174:
-

Status: Open  (was: Patch Available)

Unfortunately, the patch has gone stale. Could you regenerate it?

 Sqoop improperly handles table/column names which are reserved sql words
 

 Key: MAPREDUCE-1174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1174.patch


 In some databases it is legal to name tables and columns with terms that 
 overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such 
 cases, the database allows you to escape the table and column names. We 
 should always escape table and column names when possible.
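Escaping can be sketched in plain Java. This is a generic illustration, not Sqoop's actual code, and the class and method names below are hypothetical: the standard ANSI SQL approach is to wrap the identifier in double quotes and double any embedded quotes.

```java
public class IdentifierEscaper {
    // Wrap an identifier in double quotes (the ANSI SQL delimiter) and
    // double any embedded quotes, so a reserved word such as CREATE
    // becomes a legal, unambiguous table or column name.
    public static String escapeIdentifier(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeIdentifier("CREATE")); // prints "CREATE" with quotes
        System.out.println(escapeIdentifier("table"));
    }
}
```

Note that some databases use a different delimiter (e.g. backticks in MySQL), so a real implementation would delegate the choice of delimiter to the connection manager for the database in use.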




[jira] Resolved: (MAPREDUCE-1234) getJobID() returns null on org.apache.hadoop.mapreduce.Job after job was submitted

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu resolved MAPREDUCE-1234.


Resolution: Duplicate

Duplicate of MAPREDUCE-118

 getJobID() returns null on org.apache.hadoop.mapreduce.Job after job was 
 submitted
 --

 Key: MAPREDUCE-1234
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1234
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 0.20.1
 Environment: Run on Win XP, but will probably occur on any system
Reporter: Thomas Kathmann
Priority: Minor
   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 After an instance of org.apache.hadoop.mapreduce.Job is submitted via 
 submit() the method getJobID() returns null.
 The code of the submit() method should include something like:
 setJobID(info.getJobID());
 after
 info = jobClient.submitJobInternal(conf);




[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-118:
--

 Priority: Blocker  (was: Major)
Affects Version/s: 0.20.1
Fix Version/s: 0.20.2

 Job.getJobID() will always return null
 --

 Key: MAPREDUCE-118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.2


 JobContext provides a read-only view of the job's info; hence all the 
 read-only fields in JobContext are set in the constructor. Job extends 
 JobContext. When a Job is created, the job id is not known, and hence there 
 is no way to set the JobID once the Job is created. The JobID is obtained 
 only when the JobClient queries the JobTracker for a job-id, which happens 
 later, i.e., upon job submission.




[jira] Commented: (MAPREDUCE-1257) Ability to grab the number of spills

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785822#action_12785822
 ] 

Hadoop QA commented on MAPREDUCE-1257:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426868/mapreduce-1257.txt
  against trunk revision 887061.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/console

This message is automatically generated.

 Ability to grab the number of spills
 

 Key: MAPREDUCE-1257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1257
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 0.22.0
Reporter: Sriranjan Manjunath
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: mapreduce-1257.txt


 The counters should have information about the number of spills in addition 
 to the number of spill records.




[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-118:
--

Attachment: patch-118-0.20.txt

Patch for branch 0.20

 Job.getJobID() will always return null
 --

 Key: MAPREDUCE-118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.2

 Attachments: patch-118-0.20.txt


 JobContext provides a read-only view of the job's info; hence all the 
 read-only fields in JobContext are set in the constructor. Job extends 
 JobContext. When a Job is created, the job id is not known, and hence there 
 is no way to set the JobID once the Job is created. The JobID is obtained 
 only when the JobClient queries the JobTracker for a job-id, which happens 
 later, i.e., upon job submission.




[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-118:
--

Attachment: patch-118.txt
patch-118-0.21.txt

Patch for branch 0.21 and trunk, renaming getID to getJobID so that it 
overrides the method in JobContext.
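The override/shadowing distinction behind this fix can be shown with a minimal self-contained model. This is a simplification: the real classes live in org.apache.hadoop.mapreduce, and the literal job id below is made up.

```java
// Minimal model of the bug and the fix; only mirrors the shape of the
// real org.apache.hadoop.mapreduce classes.
class JobContext {
    private final String jobId;            // read-only, set in constructor
    JobContext(String jobId) { this.jobId = jobId; }
    public String getJobID() { return jobId; }
}

class Job extends JobContext {
    private String submittedId;            // learned only at submission
    Job() { super(null); }                 // id unknown when Job is created
    void submit() { submittedId = "job_200912040001_0001"; }

    // The fix: same name and signature as the parent, so this overrides
    // JobContext.getJobID(). Before the fix this method was named getID(),
    // which does NOT override, so callers of getJobID() always got the
    // null stored by the constructor.
    @Override
    public String getJobID() { return submittedId; }
}

public class GetJobIdDemo {
    static String idAfterSubmit() {
        Job job = new Job();
        job.submit();
        return job.getJobID();
    }

    public static void main(String[] args) {
        System.out.println(new Job().getJobID()); // null before submission
        System.out.println(idAfterSubmit());      // the id after submission
    }
}
```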

 Job.getJobID() will always return null
 --

 Key: MAPREDUCE-118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.2

 Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt


 JobContext provides a read-only view of the job's info; hence all the 
 read-only fields in JobContext are set in the constructor. Job extends 
 JobContext. When a Job is created, the job id is not known, and hence there 
 is no way to set the JobID once the Job is created. The JobID is obtained 
 only when the JobClient queries the JobTracker for a job-id, which happens 
 later, i.e., upon job submission.




[jira] Assigned: (MAPREDUCE-1084) Implementing aspects development and fault injection framework for MapReduce

2009-12-04 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan reassigned MAPREDUCE-1084:
-

Assignee: Sreekanth Ramakrishnan

 Implementing aspects development and fault injection framework for MapReduce
 

 Key: MAPREDUCE-1084
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build, test
Reporter: Konstantin Boudnik
Assignee: Sreekanth Ramakrishnan

 Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of 
 injection framework for MapReduce.
 After HADOOP-6204 is in place this particular modification should be very 
 trivial and would take importing (via svn:external) of src/test/build and 
 some tweaking of the build.xml file




[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null

2009-12-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-118:
--

Status: Patch Available  (was: Open)

 Job.getJobID() will always return null
 --

 Key: MAPREDUCE-118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.2

 Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt


 JobContext provides a read-only view of the job's info; hence all the 
 read-only fields in JobContext are set in the constructor. Job extends 
 JobContext. When a Job is created, the job id is not known, and hence there 
 is no way to set the JobID once the Job is created. The JobID is obtained 
 only when the JobClient queries the JobTracker for a job-id, which happens 
 later, i.e., upon job submission.




[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injection framework for MapReduce

2009-12-04 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-1084:
--

Attachment: mapreduce-1084-1-withoutsvnexternals.patch
mapreduce-1084-1.patch

Attaching the patch implementing fault injection in the mapreduce project.

There are two patches: one with svn externals and one without. When the 
svn-externals patch is applied over a workspace, it does not create the 
appropriate folder structure with links, even though the property and folder 
are added into version control.


 Implementing aspects development and fault injection framework for MapReduce
 

 Key: MAPREDUCE-1084
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build, test
Reporter: Konstantin Boudnik
Assignee: Sreekanth Ramakrishnan
 Attachments: mapreduce-1084-1-withoutsvnexternals.patch, 
 mapreduce-1084-1.patch


 Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of 
 injection framework for MapReduce.
 After HADOOP-6204 is in place this particular modification should be very 
 trivial and would take importing (via svn:external) of src/test/build and 
 some tweaking of the build.xml file




[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.

2009-12-04 Thread ZhuGuanyin (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785838#action_12785838
 ] 

ZhuGuanyin commented on MAPREDUCE-1254:
---

Because the local inexpensive disks are not reliable: we once found a non-zero 
file had become zero length, yet the OS kernel log showed no warning; only 
some minutes later did the kernel report the disk failures. During that time, 
read operations returned success without throwing any IOException.

In the current implementation, an IOException is thrown if job.xml is missing, 
but there is no way to detect that the configuration file has been corrupted 
or truncated.
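The idea behind the request can be sketched with java.util.zip.CRC32. This is not Hadoop's ChecksumFileSystem code, just a self-contained illustration of verifying a stored checksum when the file is read back.

```java
import java.util.zip.CRC32;

public class CrcCheckDemo {
    // Compute a CRC32 over the file's bytes, as would be stored alongside
    // job.xml at write time.
    static long crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Verify the stored checksum before trusting the configuration bytes.
    // A silently truncated or zeroed file yields a mismatch even though
    // the read itself reported no IOException.
    static boolean verify(byte[] data, long storedCrc) {
        return crcOf(data) == storedCrc;
    }

    public static void main(String[] args) {
        byte[] jobXml = "<configuration>...</configuration>".getBytes();
        long stored = crcOf(jobXml);
        System.out.println(verify(jobXml, stored));       // intact file
        System.out.println(verify(new byte[0], stored));  // truncated file
    }
}
```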

 job.xml should add crc check in tasktracker and sub jvm.
 

 Key: MAPREDUCE-1254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: task, tasktracker
Affects Versions: 0.22.0
Reporter: ZhuGuanyin

 Currently job.xml in the tasktracker and sub-JVM is written to local disk 
 through ChecksumFileSystem and already has CRC checksum information, but the 
 job.xml file is loaded without a CRC check. This can cause a MapReduce job 
 to finish successfully but with wrong data because of a disk error. Example: 
 the tasktracker and sub-task JVM load the default configuration if they do 
 not successfully load job.xml, which may replace the mapper with 
 IdentityMapper.




[jira] Updated: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1114:
-

Status: Open  (was: Patch Available)

The patch is stale.

The long build times are a problem and ivy's a big part of that, but I agree 
with your assessment: this is a hack. I don't think the 15 second payoff 
justifies the maintenance cost of a custom caching layer for ivy.

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Updated: (MAPREDUCE-1161) NotificationTestCase should not lock current thread

2009-12-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1161:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I committed this. Thanks, Owen!

 NotificationTestCase should not lock current thread
 ---

 Key: MAPREDUCE-1161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.21.0

 Attachments: mr-1161.patch


 There are 3 instances where NotificationTestCase locks 
 Thread.currentThread() and calls sleep on it. There is also 
 a method stdPrintln that doesn't do anything.
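Why locking the current thread is ineffective can be illustrated with a minimal, self-contained example (not the test's actual code): each thread locks a different object, its own Thread instance, so no thread ever waits for another.

```java
public class CurrentThreadLockDemo {
    // Returns true while holding the monitor of the calling thread's own
    // Thread object -- the pattern used in NotificationTestCase.
    static boolean lockOwnThread() {
        synchronized (Thread.currentThread()) {
            return Thread.holdsLock(Thread.currentThread());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Each thread locks a DIFFERENT object (its own Thread instance),
        // so the two "critical sections" never exclude each other and the
        // synchronization provides no mutual exclusion.
        Thread a = new Thread(() -> System.out.println("a: " + lockOwnThread()));
        Thread b = new Thread(() -> System.out.println("b: " + lockOwnThread()));
        a.start(); b.start();
        a.join(); b.join();
        // Note: Thread.sleep() never releases a held monitor, so
        // "synchronized (Thread.currentThread()) { sleep(...); }" merely
        // delays the sleeping thread while pointlessly holding its own lock.
    }
}
```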




[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785860#action_12785860
 ] 

Hadoop QA commented on MAPREDUCE-1241:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426838/mapreduce-1241.txt
  against trunk revision 887061.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 160 release audit warnings 
(more than the trunk's current 159 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/console

This message is automatically generated.

 JobTracker should not crash when mapred-queues.xml does not exist
 -

 Key: MAPREDUCE-1241
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1241.txt


 Currently, if you bring up the JobTracker on an old configuration directory, 
 it gets a NullPointerException looking for the mapred-queues.xml file. It 
 should just assume a default queue and continue.
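The suggested fallback behavior can be sketched as follows. The names here are hypothetical and this is not the JobTracker's actual code; it only illustrates treating a missing queue configuration as a single default queue rather than dereferencing a null result.

```java
import java.io.File;
import java.util.Collections;
import java.util.List;

public class QueueConfigLoader {
    // Return the configured queue names, falling back to a single
    // "default" queue when mapred-queues.xml is absent, instead of
    // failing with a NullPointerException later during startup.
    static List<String> loadQueues(File queuesXml) {
        if (!queuesXml.exists()) {
            return Collections.singletonList("default");
        }
        return parseQueues(queuesXml);
    }

    // Parsing of a real mapred-queues.xml is elided in this sketch.
    static List<String> parseQueues(File f) {
        return Collections.singletonList("default");
    }

    public static void main(String[] args) {
        // An old configuration directory with no queues file still yields
        // a usable queue list.
        System.out.println(loadQueues(new File("missing/mapred-queues.xml")));
    }
}
```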




[jira] Commented: (MAPREDUCE-1075) getQueue(String queue) in JobTracker would return NPE for invalid queue name

2009-12-04 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785874#action_12785874
 ] 

Hemanth Yamijala commented on MAPREDUCE-1075:
-

In an offline discussion with Vinod, we concluded that there is no provision 
to marshal exceptions in Hadoop's RPC right now. Hence, we are deciding in 
favor of returning null in the queue APIs.

With this context I looked at the new patch. One minor nit: I would suggest 
we test the API JobClient.getQueueInfo instead of Cluster.getQueue, as it 
covers more of the changed code path. Can you please make this change and run 
the patch through Hudson so I can commit it once it passes?

 getQueue(String queue) in JobTracker would return NPE for invalid queue name
 

 Key: MAPREDUCE-1075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: V.V.Chaitanya Krishna
Assignee: V.V.Chaitanya Krishna
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1075-1.patch, MAPREDUCE-1075-2.patch, 
 MAPREDUCE-1075-3.patch, MAPREDUCE-1075-4.patch, MAPREDUCE-1075-5.patch, 
 MAPREDUCE-1075-6.patch







[jira] Commented: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785878#action_12785878
 ] 

Hudson commented on MAPREDUCE-1152:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #144 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/144/]).
Distinguish between failed and killed tasks in
JobTrackerInstrumentation. Contributed by Sharad Agarwal


 JobTrackerInstrumentation.killed{Map/Reduce} is never called
 

 Key: MAPREDUCE-1152
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Sharad Agarwal
 Fix For: 0.22.0

 Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch


 The JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of 
 MAPREDUCE-1103 are not captured.




[jira] Commented: (MAPREDUCE-1161) NotificationTestCase should not lock current thread

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785877#action_12785877
 ] 

Hudson commented on MAPREDUCE-1161:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #144 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/144/]).
Remove ineffective synchronization in NotificationTestCase.
Contributed by Owen O'Malley


 NotificationTestCase should not lock current thread
 ---

 Key: MAPREDUCE-1161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.21.0

 Attachments: mr-1161.patch


 There are 3 instances where NotificationTestCase locks 
 Thread.currentThread() and calls sleep on it. There is also 
 a method stdPrintln that doesn't do anything.




[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-12-04 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785897#action_12785897
 ] 

Sharad Agarwal commented on MAPREDUCE-372:
--

Looked at the ChainBlockingQueue part of the code. Some comments:
1. Can we avoid the casting in Chain#stopAllThreads? One way could be to 
override interrupt() in MapRunner and ReduceRunner. Also, interruptAllThreads 
would be a better name IMO.
2. I think that instead of interrupting the runners and then calling 
interrupt on both readers and writers, it would be simpler to directly 
interrupt all the blocking queues.
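The first suggestion can be sketched with a self-contained model. The names MapRunner and ChainBlockingQueue come from the comment, but their bodies here are illustrative only, not the patch's code.

```java
public class ChainInterruptDemo {
    // Stand-in for the chain's blocking queue, with an explicit interrupt
    // hook as the comment proposes.
    static class ChainBlockingQueue {
        volatile boolean interrupted;
        void interrupt() { interrupted = true; }
    }

    // A runner thread that knows its own queues. Overriding interrupt()
    // means a stop-all-threads method can call plain Thread.interrupt()
    // on every runner, with no instanceof checks or casts.
    static class MapRunner extends Thread {
        private final ChainBlockingQueue in, out;
        MapRunner(ChainBlockingQueue in, ChainBlockingQueue out) {
            this.in = in;
            this.out = out;
        }
        @Override
        public void interrupt() {
            in.interrupt();
            out.interrupt();
            super.interrupt();
        }
    }

    static boolean queuesInterrupted() {
        ChainBlockingQueue in = new ChainBlockingQueue();
        ChainBlockingQueue out = new ChainBlockingQueue();
        Thread runner = new MapRunner(in, out); // held as a plain Thread
        runner.interrupt();                     // also unblocks both queues
        return in.interrupted && out.interrupted;
    }

    public static void main(String[] args) {
        System.out.println(queuesInterrupted()); // true
    }
}
```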

 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, 
 patch-372-1.txt, patch-372-2.txt, patch-372.txt







[jira] Commented: (MAPREDUCE-1082) Command line UI for queues' information is broken with hierarchical queues.

2009-12-04 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785899#action_12785899
 ] 

Hemanth Yamijala commented on MAPREDUCE-1082:
-

Looking close. Some final comments:
- We are assuming the job statuses cannot be null in QueueInfo. I think we 
should check this in setJobStatuses; if it is null, we can set an empty array.
- The test case should call APIs like setRootQueues. getQueue does not pass 
through the code path change you made in JobTracker.getQueueInfoArray.
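The null-handling suggestion amounts to a defensive setter. A sketch with simplified field types (the real QueueInfo stores job status objects, reduced to strings here for a self-contained example):

```java
public class QueueInfoSketch {
    private String[] jobStatuses = new String[0];

    // Per the review comment: never store null, so readers and the
    // serialization code can rely on a non-null array.
    public void setJobStatuses(String[] statuses) {
        this.jobStatuses = (statuses == null) ? new String[0] : statuses;
    }

    public String[] getJobStatuses() { return jobStatuses; }

    static int lengthAfterNullSet() {
        QueueInfoSketch q = new QueueInfoSketch();
        q.setJobStatuses(null);
        return q.getJobStatuses().length;
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterNullSet()); // 0 rather than an NPE later
    }
}
```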

 Command line UI for queues' information is broken with hierarchical queues.
 ---

 Key: MAPREDUCE-1082
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1082
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, jobtracker
Affects Versions: 0.21.0
Reporter: Vinod K V
Assignee: V.V.Chaitanya Krishna
Priority: Blocker
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1082-1.txt, MAPREDUCE-1082-2.patch, 
 MAPREDUCE-1082-3.patch


 When the command ./bin/mapred --config ~/tmp/conf/ queue -list is run, it 
 just hangs. I can see the following in the JT logs:
 {code}
 2009-10-08 13:19:26,762 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
 1 on 5 caught: java.lang.NullPointerException
 at org.apache.hadoop.mapreduce.QueueInfo.write(QueueInfo.java:217)
 at org.apache.hadoop.mapreduce.QueueInfo.write(QueueInfo.java:223)
 at 
 org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:159)
 at 
 org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:126)
 at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:70)
 at org.apache.hadoop.ipc.Server.setupResponse(Server.java:1074)
 at org.apache.hadoop.ipc.Server.access$2400(Server.java:77)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:983)
 {code}
 Same is the case with ./bin/mapred --config ~/tmp/conf/ queue -info 
 any-container-queue




[jira] Commented: (MAPREDUCE-181) Secure job submission

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785905#action_12785905
 ] 

Hadoop QA commented on MAPREDUCE-181:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426876/181-4.patch
  against trunk revision 887096.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 78 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/console

This message is automatically generated.

 Secure job submission 
 --

 Key: MAPREDUCE-181
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-181
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amar Kamat
Assignee: Devaraj Das
 Fix For: 0.22.0

 Attachments: 181-1.patch, 181-2.patch, 181-3.patch, 181-3.patch, 
 181-4.patch, hadoop-3578-branch-20-example-2.patch, 
 hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, 
 HADOOP-3578-v2.7.patch, MAPRED-181-v3.32.patch, MAPRED-181-v3.8.patch


 Currently the jobclient accesses the {{mapred.system.dir}} to add job 
 details. Hence the {{mapred.system.dir}} has the permissions of 
 {{rwx-wx-wx}}. This could be a security loophole where the job files might 
 get overwritten/tampered after the job submission. 




[jira] Commented: (MAPREDUCE-1185) URL to JT webconsole for running job and job history should be the same

2009-12-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785911#action_12785911
 ] 

Arun C Murthy commented on MAPREDUCE-1185:
--

+1

 URL to JT webconsole for running job and job history should be the same
 ---

 Key: MAPREDUCE-1185
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1185
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal
 Attachments: 1185_v1.patch, 1185_v2.patch, 1185_v3.patch, 
 1185_v4.patch, 1185_v5.patch, 1185_v6.patch, 1185_v7.patch, 
 patch-1185-1-ydist.txt, patch-1185-2-ydist.txt, patch-1185-3-ydist.txt, 
 patch-1185-ydist.txt


 The tracking URL for running jobs and for retired jobs is different. This 
 creates a problem for clients which cache the running job's URL, because it 
 becomes invalid as soon as the job is retired.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1264) Error Recovery failed, task will continue but run forever as new data only comes in very very slowly

2009-12-04 Thread Thibaut (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thibaut updated MAPREDUCE-1264:
---

  Description: 
Hi,

Sometimes some of my jobs will not finish and will run forever (it normally 
happens in the reducers, on a random basis). I have to manually fail the task 
so that it gets restarted and finishes.

The error log on the node is full of entries like:
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
The error entries all refer to the same data block.

Unfortunately, the reduce function still seems to be called in the reducer with 
valid data (although very slowly), so the task will never be killed and 
restarted and will take forever to run!

If I kill the task, the job will finish without any problems. 

I experienced the same problem under version 0.20.0 as well.


Thanks,
Thibaut

  was:
Hi,

Sometimes, some of my jobs (It normally always happens in the reducers and on 
random basis) will not finish and will run forever. I have to manually fail the 
task so the task will be started and be finished.

The error log on the node is full of entries like:
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 
failed  because recovery from primary datanode 192.168.0.3:50011 failed 6 
times.  Pipeline was 192.168.0.3:50011. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)
The error entries all refer to the same data block.

Unfortunally, the reduce function still seems to be called in the reducer with 
valid data (allthough very very slowly), so the task will never been killed and 
restarted and will take forever to run!

I experienced the same problem under version 0.20.0 as well.


Thanks,
Thibaut

Fix Version/s: 0.20.2

 Error Recovery failed, task will continue but run forever as new data only 
 comes in very very slowly
 

 Key: MAPREDUCE-1264
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1264
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Thibaut

[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases

2009-12-04 Thread Omer Trajman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omer Trajman updated MAPREDUCE-1230:


Fix Version/s: 0.21.0

 Vertica streaming adapter doesn't handle nulls in all cases
 ---

 Key: MAPREDUCE-1230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+
Reporter: Omer Trajman
Assignee: Omer Trajman
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1230.patch


 Test user reported that the Vertica adapter throws an NPE when retrieving 
 null values for certain types (binary and numeric both reported). There is 
 no special-case handling when serializing nulls.
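
Since the report points at a missing null special case during serialization, a minimal illustration of the shape of such a fix might look like the following. This is a hypothetical sketch, not the actual Vertica adapter code; the class name, delimiter, and null marker are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NullSafeRowSerializer {
    // Assumed wire format: pipe-delimited fields with \N as the null marker.
    static final String DELIMITER = "|";
    static final String NULL_MARKER = "\\N";

    // Emit a marker for null values instead of calling toString() on a null
    // reference, which is the kind of NPE reported above.
    static String serializeRow(List<Object> values) {
        return values.stream()
                     .map(v -> v == null ? NULL_MARKER : v.toString())
                     .collect(Collectors.joining(DELIMITER));
    }

    public static void main(String[] args) {
        // A row with null numeric and binary fields mixed in.
        System.out.println(serializeRow(Arrays.asList("abc", null, 42, null)));
    }
}
```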

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases

2009-12-04 Thread Omer Trajman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omer Trajman updated MAPREDUCE-1230:


Status: Patch Available  (was: Open)

 Vertica streaming adapter doesn't handle nulls in all cases
 ---

 Key: MAPREDUCE-1230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+
Reporter: Omer Trajman
Assignee: Omer Trajman
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1230.patch


 Test user reported that the Vertica adapter throws an NPE when retrieving 
 null values for certain types (binary and numeric both reported). There is 
 no special-case handling when serializing nulls.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases

2009-12-04 Thread Omer Trajman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omer Trajman updated MAPREDUCE-1230:


Status: Open  (was: Patch Available)

 Vertica streaming adapter doesn't handle nulls in all cases
 ---

 Key: MAPREDUCE-1230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+
Reporter: Omer Trajman
Assignee: Omer Trajman
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1230.patch


 Test user reported that the Vertica adapter throws an NPE when retrieving 
 null values for certain types (binary and numeric both reported). There is 
 no special-case handling when serializing nulls.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-896) Users can set non-writable permissions on temporary files for TT and can abuse disk usage.

2009-12-04 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785948#action_12785948
 ] 

Ravi Gummadi commented on MAPREDUCE-896:


The new TaskController command is ENABLE_TASK_FOR_CLEANUP.

There is a change in JVMManager where the workdir for the last task was being 
deleted inline, but now we delete it asynchronously. This should be fine.

The change in setupWorkDir fixes the issue of trying to delete workDir, which 
is the current working dir. Only the contents of workDir are deleted, leaving 
workDir empty. A test case is added to validate this cleanup of workDir.

Removing check_group, as this wouldn't work if the user changes the group of 
workDir.

createFileAndSetPermissions sets a=rx for subDir and the file in subDir so 
that no one can delete them without first doing a chmod.

I am fine with the other comments.
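
The setupWorkDir change described above amounts to "delete the contents, not the directory itself". A minimal sketch of that behavior (illustrative names only, not the actual TaskTracker/TaskController code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class WorkDirCleanup {
    // Delete everything inside workDir, but keep workDir itself, since it
    // is the task JVM's current working directory.
    static void cleanContents(File workDir) {
        File[] entries = workDir.listFiles();
        if (entries == null) return; // not a directory or I/O error
        for (File entry : entries) {
            deleteRecursively(entry);
        }
    }

    // Depth-first delete: children first, then the entry itself.
    static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        f.delete();
    }

    public static void main(String[] args) throws IOException {
        File workDir = Files.createTempDirectory("workdir").toFile();
        File sub = new File(workDir, "sub");
        sub.mkdir();
        Files.write(new File(sub, "attempt.tmp").toPath(), new byte[]{1});
        cleanContents(workDir);
        // workDir survives, and it is now empty.
        System.out.println(workDir.isDirectory() + " "
                + (workDir.listFiles().length == 0));
    }
}
```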



 Users can set non-writable permissions on temporary files for TT and can 
 abuse disk usage.
 --

 Key: MAPREDUCE-896
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-896
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.21.0
Reporter: Vinod K V
Assignee: Ravi Gummadi
 Fix For: 0.21.0

 Attachments: MR-896.patch, MR-896.v1.patch


 As of now, irrespective of the TaskController in use, the TT itself does a 
 full delete on local files created by itself or by job tasks. This step, 
 depending upon the TT's umask and the permissions set on files by the user 
 (e.g. in job-work/task-work or child.tmp directories), may or may not 
 complete successfully. This leaves an opportunity for disk space usage to be 
 abused, either accidentally or intentionally, by the TT/users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785954#action_12785954
 ] 

Hadoop QA commented on MAPREDUCE-372:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426877/patch-372-2.txt
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/console

This message is automatically generated.

 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, 
 patch-372-1.txt, patch-372-2.txt, patch-372.txt




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-118) Job.getJobID() will always return null

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785974#action_12785974
 ] 

Hadoop QA commented on MAPREDUCE-118:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426883/patch-118.txt
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/console

This message is automatically generated.

 Job.getJobID() will always return null
 --

 Key: MAPREDUCE-118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Amar Kamat
Priority: Blocker
 Fix For: 0.20.2

 Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt


 JobContext is used as a read-only view of a job's info; hence all the 
 read-only fields in JobContext are set in the constructor. Job extends 
 JobContext. When a Job is created, the job id is not known, so there is no 
 way to set the JobID once the Job is created. The JobID is obtained only 
 when the JobClient queries the JobTracker for a job id, which happens 
 later, i.e. upon job submission.
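
A stripped-down sketch of the constructor-only design described above (simplified stand-in classes, not the real Hadoop ones) shows why the id can never be filled in later:

```java
public class JobIdDemo {
    // Stand-in for the read-only context: every read-only field is fixed
    // at construction time.
    static class JobContext {
        private final String jobId;
        JobContext(String jobId) { this.jobId = jobId; }
        String getJobID() { return jobId; }
    }

    // Stand-in for Job: the id is unknown when the Job is created, and
    // there is no setter, so getJobID() can only ever return null.
    static class Job extends JobContext {
        Job() { super(null); }
    }

    public static void main(String[] args) {
        Job job = new Job();
        // The id assigned by the JobTracker at submission time has nowhere
        // to go in this design.
        System.out.println(job.getJobID()); // null
    }
}
```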

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1249) mapreduce.reduce.shuffle.read.timeout's default value should be 3 minutes, in mapred-default.xml

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786043#action_12786043
 ] 

Hudson commented on MAPREDUCE-1249:
---

Integrated in Hadoop-Mapreduce-trunk #164 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/])
. Update config default value for socket read timeout to
match code default. Contributed by Amareshwari Sriramadasu


 mapreduce.reduce.shuffle.read.timeout's default value should be 3 minutes, in 
 mapred-default.xml
 

 Key: MAPREDUCE-1249
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1249
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Priority: Blocker
 Fix For: 0.21.0

 Attachments: patch-1249-1.txt, patch-1249.txt


 mapreduce.reduce.shuffle.read.timeout has a value of 30,000 (30 seconds) in 
 mapred-default.xml, whereas the default value in the Fetcher code is 3 
 minutes. It should be 3 minutes by default, as it was before MAPREDUCE-353.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1161) NotificationTestCase should not lock current thread

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786042#action_12786042
 ] 

Hudson commented on MAPREDUCE-1161:
---

Integrated in Hadoop-Mapreduce-trunk #164 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/])
. Remove ineffective synchronization in NotificationTestCase.
Contributed by Owen O'Malley


 NotificationTestCase should not lock current thread
 ---

 Key: MAPREDUCE-1161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.21.0

 Attachments: mr-1161.patch


 There are 3 instances in NotificationTestCase where Thread.currentThread() 
 is being locked and sleep is called on it. There is also a method stdPrintln 
 that doesn't do anything.
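
The ineffective pattern at issue can be illustrated as follows (a sketch, not the actual test code): synchronizing on Thread.currentThread() adds no coordination, because no other thread ever contends for the current thread's monitor, so the block reduces to a plain sleep.

```java
public class SleepDemo {
    // Anti-pattern: the current thread's monitor is never locked by anyone
    // else here, so the synchronized block provides no synchronization.
    static void lockedSleep(long millis) throws InterruptedException {
        synchronized (Thread.currentThread()) {
            Thread.sleep(millis);
        }
    }

    // Equivalent, clearer form.
    static void plainSleep(long millis) throws InterruptedException {
        Thread.sleep(millis);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        lockedSleep(20);
        plainSleep(20);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Both forms simply sleep; together they take at least ~40 ms.
        System.out.println(elapsedMs >= 35);
    }
}
```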

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786045#action_12786045
 ] 

Hudson commented on MAPREDUCE-1152:
---

Integrated in Hadoop-Mapreduce-trunk #164 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/])
. Distinguish between failed and killed tasks in
JobTrackerInstrumentation. Contributed by Sharad Agarwal


 JobTrackerInstrumentation.killed{Map/Reduce} is never called
 

 Key: MAPREDUCE-1152
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Sharad Agarwal
 Fix For: 0.22.0

 Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch


 The JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of 
 MAPREDUCE-1103 are not captured.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1260) Update Eclipse configuration to match changes to Ivy configuration

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786041#action_12786041
 ] 

Hudson commented on MAPREDUCE-1260:
---

Integrated in Hadoop-Mapreduce-trunk #164 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/])


 Update Eclipse configuration to match changes to Ivy configuration
 --

 Key: MAPREDUCE-1260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Edwin Chan
 Fix For: 0.22.0

 Attachments: mapReduceClasspath.patch


 The .eclipse_templates/.classpath file doesn't match the Ivy configuration, 
 so I've updated it to match.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1119) When tasks fail to report status, show tasks's stack dump before killing

2009-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786044#action_12786044
 ] 

Hudson commented on MAPREDUCE-1119:
---

Integrated in Hadoop-Mapreduce-trunk #164 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/])
. When tasks fail to report status, show tasks's stack dump before killing. 
Contributed by Aaron Kimball.


 When tasks fail to report status, show tasks's stack dump before killing
 

 Key: MAPREDUCE-1119
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1119
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron Kimball
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1119.2.patch, MAPREDUCE-1119.3.patch, 
 MAPREDUCE-1119.4.patch, MAPREDUCE-1119.5.patch, MAPREDUCE-1119.6.patch, 
 MAPREDUCE-1119.patch


 When the TT kills tasks that haven't reported status, it should somehow 
 gather a stack dump for the task. This could be done either by sending a 
 SIGQUIT (so the dump ends up in stdout) or perhaps by using something like 
 JDI to gather the stack directly from Java. This may be somewhat tricky, 
 since the child may be running as another user (so the SIGQUIT would have 
 to go through LinuxTaskController). This feature would make debugging these 
 kinds of failures much easier, especially if we could somehow get it into 
 the TaskDiagnostic message.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1265:
--

Description: 
When a task attempt receives an error, TaskInProgress logs the task attempt id 
and the diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine, for example a blacklisted node, we have to 
use the task attempt id to find this information. This is not very convenient.

It would be nice if we could also log the tasktracker which causes this error. 
This way we can just grep the hostname to quickly find all the relevant error 
messages.

  was:
When task attempt receive an error, TaskInProgress will log the task attempt id 
and diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine or a job. We have to use the task attempt id to 
find these information.

It will be much more convenient if  we can just log them together.
This way we can just grep the jobId or hostname to quickly find all the 
relevant error message.

Summary: Include tasktracker name in the task attempt error log  (was: 
Include jobId and hostname in the task attempt error log)

 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress logs the task attempt 
 id and the diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine, for example a blacklisted node, we have to 
 use the task attempt id to find this information. This is not very 
 convenient. 
 It would be nice if we could also log the tasktracker which causes this 
 error. This way we can just grep the hostname to quickly find all the 
 relevant error messages.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786066#action_12786066
 ] 

Scott Chen commented on MAPREDUCE-1265:
---

I just realized that the job id is part of the task attempt id, so we can 
easily obtain it. We only need to log the tasktracker name here.

Here is the log after the change:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1 *on tracker_m01.aaa.com*: Error: 
java.lang.OutOfMemoryError: Java heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0 *on tracker_m02.aaa.com*: Task 
attempt_2009__m_000478_0 failed to report status for 601 seconds. 
Killing!
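
The change sketched by those log lines is just a richer message prefix. A hypothetical helper showing the format (not the actual TaskInProgress code; the method name is illustrative):

```java
public class AttemptErrorLog {
    // Build the log message with the tasktracker name included, so the
    // JobTracker log can be grepped by hostname.
    static String format(String attemptId, String trackerName, String diagnostic) {
        return String.format("Error from %s on %s: %s",
                             attemptId, trackerName, diagnostic);
    }

    public static void main(String[] args) {
        System.out.println(format("attempt_2009__r_09_1",
                                  "tracker_m01.aaa.com",
                                  "Error: java.lang.OutOfMemoryError: Java heap space"));
    }
}
```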

 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress logs the task attempt 
 id and the diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine, for example a blacklisted node, we have to 
 use the task attempt id to find this information. This is not very 
 convenient. 
 It would be nice if we could also log the tasktracker which causes this 
 error. This way we can just grep the hostname to quickly find all the 
 relevant error messages.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1265:
--

Fix Version/s: 0.22.0
Affects Version/s: 0.22.0
   Status: Patch Available  (was: Open)

 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress logs the task attempt 
 id and the diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine, for example a blacklisted node, we have to 
 use the task attempt id to find this information. This is not very 
 convenient. 
 It would be nice if we could also log the tasktracker which causes this 
 error. This way we can just grep the hostname to quickly find all the 
 relevant error messages.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1265:
--

Description: 
When a task attempt receives an error, TaskInProgress logs the task attempt id 
and the diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine, for example a blacklisted node, we have to 
use the task attempt id to find the TT. This is not very convenient.

It would be nice if we could also log the tasktracker which causes this error. 
This way we can just grep the hostname to quickly find all the relevant error 
messages.

  was:
When task attempt receive an error, TaskInProgress will log the task attempt id 
and diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine for example, a blacklisted node.
We have to use the task attempt id to find these information. This is not very 
convenient. 

It will be nice if  we can also log the tasktracker which cauces this error.
This way we can just grep the hostname to quickly find all the relevant error 
message.


 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress logs the task attempt 
 id and the diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine, for example a blacklisted node, we have to 
 use the task attempt id to find the TT. This is not very convenient. 
 It would be nice if we could also log the tasktracker which causes this 
 error. This way we can just grep the hostname to quickly find all the 
 relevant error messages.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1265:
--

Description: 
When a task attempt receives an error, TaskInProgress will log the task attempt 
id and diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine (for example, a node that has been blacklisted in 
the past few days), we have to use the task attempt id to find the TT. This is 
not very convenient. 

It would be nice if we could also log the tasktracker that causes this error.
That way we can just grep for the hostname to quickly find all the relevant error 
messages.

  was:
When task attempt receive an error, TaskInProgress will log the task attempt id 
and diagnosis string in the JobTracker log.
Ex:
2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java 
heap space
2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
failed to report status for 601 seconds. Killing!

When we want to debug a machine for example, a blacklisted node.
We have to use the task attempt id to find the TT. This is not very convenient. 

It will be nice if  we can also log the tasktracker which cauces this error.
This way we can just grep the hostname to quickly find all the relevant error 
message.


 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress will log the task 
 attempt id and diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine (for example, a node that has been blacklisted 
 in the past few days), we have to use the task attempt id to find the TT. This 
 is not very convenient. 
 It would be nice if we could also log the tasktracker that causes this error.
 That way we can just grep for the hostname to quickly find all the relevant 
 error messages.




[jira] Commented: (MAPREDUCE-967) TaskTracker does not need to fully unjar job jars

2009-12-04 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786077#action_12786077
 ] 

Tom White commented on MAPREDUCE-967:
-

+1 This looks good to me.

bq. One question for reviewer: the constant for the new configuration key is in 
JobContext, whereas the default is in JobConf. I was following some other 
examples from the code, but it seems a little bit messy here. Where are the 
right places to add new configuration parameters that work in both APIs?

The key should certainly go in JobContext, but where the default is located is 
less clear. Defaults tend to be defined in the class in which they are used, 
which is JobConf in this case. However, JobConf is deprecated and will disappear, 
although it may still be used by the implementation (i.e. not be part of the 
public API), in which case what you have done is fine.
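The convention being discussed can be sketched minimally. The key name, default pattern, and class names below are invented for illustration (they are not the actual MAPREDUCE-967 constants): the key string sits with the public API constants, while the default is defined next to the code that reads it:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the "key in JobContext, default in the consumer" convention.
// UNPACK_PATTERN_KEY and its default value are hypothetical examples.
public class ConfPlacementSketch {
    // Public API side (JobContext in MapReduce): the configuration key name.
    public static final String UNPACK_PATTERN_KEY =
            "mapreduce.job.jar.unpack.pattern";
    // Consumer side (JobConf here): the default, defined where it is used.
    static final String UNPACK_PATTERN_DEFAULT = "(?:classes/|lib/).*";

    static String unpackPattern(Map<String, String> conf) {
        return conf.getOrDefault(UNPACK_PATTERN_KEY, UNPACK_PATTERN_DEFAULT);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(unpackPattern(conf)); // falls back to the default
        conf.put(UNPACK_PATTERN_KEY, "lib/.*");
        System.out.println(unpackPattern(conf)); // explicit setting wins
    }
}
```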



 TaskTracker does not need to fully unjar job jars
 -

 Key: MAPREDUCE-967
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 0.21.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt, 
 mapreduce-967.txt, mapreduce-967.txt


 In practice we have seen some users submitting job jars that consist of 
 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning 
 up after them has a significant cost (both in wall clock and in unnecessary 
 heavy disk utilization). This cost can be easily avoided.




[jira] Commented: (MAPREDUCE-181) Secure job submission

2009-12-04 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786085#action_12786085
 ] 

Devaraj Das commented on MAPREDUCE-181:
---

On the failing tests: the failure of TestGridmixSubmission is a known issue. The 
other two tests don't fail on my local machine.

 Secure job submission 
 --

 Key: MAPREDUCE-181
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-181
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amar Kamat
Assignee: Devaraj Das
 Fix For: 0.22.0

 Attachments: 181-1.patch, 181-2.patch, 181-3.patch, 181-3.patch, 
 181-4.patch, hadoop-3578-branch-20-example-2.patch, 
 hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, 
 HADOOP-3578-v2.7.patch, MAPRED-181-v3.32.patch, MAPRED-181-v3.8.patch


 Currently the jobclient accesses the {{mapred.system.dir}} to add job 
 details. Hence the {{mapred.system.dir}} has the permissions of 
 {{rwx-wx-wx}}. This could be a security loophole where the job files might 
 get overwritten/tampered after the job submission. 




[jira] Commented: (MAPREDUCE-177) Hadoop performance degrades significantly as more and more jobs complete

2009-12-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786104#action_12786104
 ] 

Allen Wittenauer commented on MAPREDUCE-177:


What is the latest status of this patch?  It doesn't appear to be committed or, 
heck, even resolved as to how the fix is going to be applied.

 Hadoop performance degrades significantly as more and more jobs complete
 

 Key: MAPREDUCE-177
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-177
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Runping Qi
Assignee: Ioannis Koltsidas
Priority: Critical
 Attachments: HADOOP-4766-v1.patch, HADOOP-4766-v2.10.patch, 
 HADOOP-4766-v2.4.patch, HADOOP-4766-v2.6.patch, HADOOP-4766-v2.7-0.18.patch, 
 HADOOP-4766-v2.7-0.19.patch, HADOOP-4766-v2.7.patch, 
 HADOOP-4766-v2.8-0.18.patch, HADOOP-4766-v2.8-0.19.patch, 
 HADOOP-4766-v2.8.patch, HADOOP-4766-v3.4-0.19.patch, map_scheduling_rate.txt


 When I ran the gridmix 2 benchmark load on a fresh cluster of 500 nodes with 
 hadoop trunk, 
 the gridmix load, consisting of 202 map/reduce jobs of various sizes, 
 completed in 32 minutes. 
 Then I ran the same set of jobs on the same cluster; they completed in 43 
 minutes.
 When I ran them a third time, it took (almost) forever --- the job tracker 
 became non-responsive.
 The job tracker's heap size was set to 2GB. 
 The cluster is configured to keep up to 500 jobs in memory.
 The job tracker kept one CPU busy all the time. It looks like this was due to GC.
 I believe releases 0.18 and 0.19 have similar behavior.




[jira] Updated: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-1241:
---

Attachment: mapreduce-1241.txt

Adds license to mapred-queues-default.xml. Since we're now treating them as 
separate files, I also got rid of all of the documentation-y comments from 
-default.

 JobTracker should not crash when mapred-queues.xml does not exist
 -

 Key: MAPREDUCE-1241
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1241.txt, mapreduce-1241.txt


 Currently, if you bring up the JobTracker on an old configuration directory, 
 it gets a NullPointerException looking for the mapred-queues.xml file. It 
 should just assume a default queue and continue.




[jira] Commented: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786163#action_12786163
 ] 

Hadoop QA commented on MAPREDUCE-1230:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425750/MAPREDUCE-1230.patch
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/console

This message is automatically generated.

 Vertica streaming adapter doesn't handle nulls in all cases
 ---

 Key: MAPREDUCE-1230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+
Reporter: Omer Trajman
Assignee: Omer Trajman
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1230.patch


 A test user reported that the Vertica adapter throws an NPE when retrieving 
 null values for certain types (binary and numeric both reported).  There is no 
 special-case handling when serializing nulls.




[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words

2009-12-04 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1174:
-

Attachment: MAPREDUCE-1174.2.patch

Freshly cut patch.

 Sqoop improperly handles table/column names which are reserved sql words
 

 Key: MAPREDUCE-1174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch


 In some databases it is legal to name tables and columns with terms that 
 overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such 
 cases, the database allows you to escape the table and column names. We 
 should always escape table and column names when possible.




[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words

2009-12-04 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1174:
-

Assignee: zhiyong zhang  (was: Aaron Kimball)
  Status: Patch Available  (was: Open)

 Sqoop improperly handles table/column names which are reserved sql words
 

 Key: MAPREDUCE-1174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: zhiyong zhang
 Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch


 In some databases it is legal to name tables and columns with terms that 
 overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such 
 cases, the database allows you to escape the table and column names. We 
 should always escape table and column names when possible.




[jira] Reopened: (MAPREDUCE-1244) eclipse-plugin fails with missing dependencies

2009-12-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reopened MAPREDUCE-1244:
--


We need to apply this to 0.21 also.

 eclipse-plugin fails with missing dependencies
 --

 Key: MAPREDUCE-1244
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1244
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
 Fix For: 0.21.0, 0.22.0

 Attachments: mapred-1244.patch







[jira] Resolved: (MAPREDUCE-1244) eclipse-plugin fails with missing dependencies

2009-12-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved MAPREDUCE-1244.
--

   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]

We need to apply this fix to 0.21 also.

 eclipse-plugin fails with missing dependencies
 --

 Key: MAPREDUCE-1244
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1244
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
 Fix For: 0.21.0, 0.22.0

 Attachments: mapred-1244.patch







[jira] Commented: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786201#action_12786201
 ] 

Hadoop QA commented on MAPREDUCE-1265:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12426933/MAPREDUCE-1265-v2.patch
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/console

This message is automatically generated.

 Include tasktracker name in the task attempt error log
 --

 Key: MAPREDUCE-1265
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Trivial
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch


 When a task attempt receives an error, TaskInProgress will log the task 
 attempt id and diagnosis string in the JobTracker log.
 Ex:
 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: 
 Java heap space
 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
 from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 
 failed to report status for 601 seconds. Killing!
 When we want to debug a machine (for example, a node that has been blacklisted 
 in the past few days), we have to use the task attempt id to find the TT. This 
 is not very convenient. 
 It would be nice if we could also log the tasktracker that causes this error.
 That way we can just grep for the hostname to quickly find all the relevant 
 error messages.




[jira] Created: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-04 Thread Todd Lipcon (JIRA)
Allow heartbeat interval smaller than 3 seconds for tiny clusters
-

 Key: MAPREDUCE-1266
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, task, tasktracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Minor


For small clusters, the heartbeat interval has a large effect on job latency. 
This is especially true on pseudo-distributed or other tiny (<5 nodes) 
clusters. It's not a big deal for production, but new users would have a 
happier first experience if Hadoop seemed snappier.

I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 
0.5 seconds (but have it governed by an undocumented config parameter in case 
people don't like this change). The cluster size-based ramp up of interval will 
maintain the current scalable behavior for large clusters with no negative 
effect.




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786211#action_12786211
 ] 

Todd Lipcon commented on MAPREDUCE-1114:


bq. I don't think the 15 second payoff justifies the maintenance cost of a 
custom caching layer for ivy.

Comparing the 15 second payoff to the full build time isn't particularly 
important to me. For me, the ability to quickly iterate on code while 
recompiling and rerunning unit tests is the big payoff - so I look at this as a 
60% speedup in my development cycle rather than a few % speedup in the full 
build.

I may be in the minority, though, as I don't use Eclipse or any other 
fancy IDE that does incremental compilation.

Anyone else care to chime in?

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786213#action_12786213
 ] 

Allen Wittenauer commented on MAPREDUCE-1266:
-

I'm probably being forgetful, but... we have:

a) heartbeat interval
b) minimum heartbeat interval

such that

a >= b, always.

If someone doesn't like b, does it matter?  Wouldn't they just tune a?  I guess 
I'm asking: why make b configurable at all?

 Allow heartbeat interval smaller than 3 seconds for tiny clusters
 -

 Key: MAPREDUCE-1266
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, task, tasktracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Minor

 For small clusters, the heartbeat interval has a large effect on job latency. 
 This is especially true on pseudo-distributed or other tiny (<5 nodes) 
 clusters. It's not a big deal for production, but new users would have a 
 happier first experience if Hadoop seemed snappier.
 I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 
 0.5 seconds (but have it governed by an undocumented config parameter in case 
 people don't like this change). The cluster size-based ramp up of interval 
 will maintain the current scalable behavior for large clusters with no 
 negative effect.




[jira] Commented: (MAPREDUCE-1257) Ability to grab the number of spills

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786216#action_12786216
 ] 

Todd Lipcon commented on MAPREDUCE-1257:


Chris: I don't feel strongly about this. I like it for the exact reason you 
mentioned - it makes it easier to tune io.sort.record.percent (or at least to 
see at a glance whether such tuning could help). My plan was to backport it into 
our distribution for 20, where a backport of MAPREDUCE-64 is pretty unlikely 
since that change is much riskier.

If no one else wants this, happy to resolve as wontfix. Would be interested to 
hear from the original reporter, though, before doing so.

 Ability to grab the number of spills
 

 Key: MAPREDUCE-1257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1257
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 0.22.0
Reporter: Sriranjan Manjunath
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: mapreduce-1257.txt


 The counters should have information about the number of spills in addition 
 to the number of spill records.




[jira] Updated: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5

2009-12-04 Thread Omer Trajman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omer Trajman updated MAPREDUCE-1097:


Status: Open  (was: Patch Available)

wrong target

 Changes/fixes to support Vertica 3.5
 

 Key: MAPREDUCE-1097
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.5
Reporter: Omer Trajman
Assignee: Omer Trajman
Priority: Minor
 Attachments: MAPREDUCE-1097.patch


 Vertica 3.5 includes three changes that the formatters should handle:
 1) deploy_design function that handles much of the logic in the optimize 
 method.  This improvement uses deploy_design if the server version supports 
 it instead of orchestrating in the formatter function.
 2) truncate table instead of recreating the table
 3) numeric, decimal, money, number types (all the same path)




[jira] Updated: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5

2009-12-04 Thread Omer Trajman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omer Trajman updated MAPREDUCE-1097:


Fix Version/s: 0.21.0
   Status: Patch Available  (was: Open)

 Changes/fixes to support Vertica 3.5
 

 Key: MAPREDUCE-1097
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.5
Reporter: Omer Trajman
Assignee: Omer Trajman
Priority: Minor
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1097.patch


 Vertica 3.5 includes three changes that the formatters should handle:
 1) deploy_design function that handles much of the logic in the optimize 
 method.  This improvement uses deploy_design if the server version supports 
 it instead of orchestrating in the formatter function.
 2) truncate table instead of recreating the table
 3) numeric, decimal, money, number types (all the same path)




[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786219#action_12786219
 ] 

Todd Lipcon commented on MAPREDUCE-1266:


Well, actually, in trunk there's mapreduce.jobtracker.heartbeats.in.second, 
which tunes the individual trackers so that that many heartbeats arrive every 
second. The default is 100, which would be a 10ms interval for a 
pseudo-distributed cluster, which is silly. So there's a minimum as well, 
hardcoded. Here's the relevant code:
{code}
int heartbeatInterval = Math.max(
    (int)(1000 * HEARTBEATS_SCALING_FACTOR *
          Math.ceil((double)clusterSize / NUM_HEARTBEATS_IN_SECOND)),
    HEARTBEAT_INTERVAL_MIN);
{code}

HEARTBEAT_INTERVAL_MIN is hardcoded to 3 seconds in MRConstants.java.
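The formula can be tried standalone. In this hedged sketch, HEARTBEATS_SCALING_FACTOR is assumed to be 1.0 purely for illustration (the real constant lives in the JobTracker code); only the 3-second floor and 100 heartbeats/second come from the discussion above:

```java
// Standalone sketch of the quoted heartbeat-interval formula.
// HEARTBEATS_SCALING_FACTOR = 1.0 is an assumed value for illustration.
public class HeartbeatIntervalSketch {
    static final int HEARTBEAT_INTERVAL_MIN = 3000;      // 3 s floor (MRConstants)
    static final int NUM_HEARTBEATS_IN_SECOND = 100;     // trunk default
    static final double HEARTBEATS_SCALING_FACTOR = 1.0; // assumed value

    // Heartbeat interval in milliseconds for a cluster of clusterSize trackers.
    static int heartbeatInterval(int clusterSize) {
        return Math.max(
            (int) (1000 * HEARTBEATS_SCALING_FACTOR *
                   Math.ceil((double) clusterSize / NUM_HEARTBEATS_IN_SECOND)),
            HEARTBEAT_INTERVAL_MIN);
    }

    public static void main(String[] args) {
        // 1 tracker (pseudo-distributed): the ceil term gives 1000 ms,
        // but the result is clamped up to the 3000 ms floor.
        System.out.println(heartbeatInterval(1));
        // 500 trackers: ceil(500/100) = 5 s, above the floor.
        System.out.println(heartbeatInterval(500));
    }
}
```

With a 0.5 s floor instead of 3 s, the one-tracker case would heartbeat every 1000 ms under these assumptions, which is what would make tiny clusters feel snappier while leaving large clusters governed by the ramp-up term.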

Maybe I'm misunderstanding your question - are you in support of lowering the 
minimum and just asking why make it undocumented-configurable instead of 
hardcoded? I was offering the undocumented configuration option just in case 
someone had an argument against this change. If everyone's for it, happy to 
just change the constant.

 Allow heartbeat interval smaller than 3 seconds for tiny clusters
 -

 Key: MAPREDUCE-1266
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, task, tasktracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Minor

 For small clusters, the heartbeat interval has a large effect on job latency. 
 This is especially true on pseudo-distributed or other tiny (<5 nodes) 
 clusters. It's not a big deal for production, but new users would have a 
 happier first experience if Hadoop seemed snappier.
 I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 
 0.5 seconds (but have it governed by an undocumented config parameter in case 
 people don't like this change). The cluster size-based ramp up of interval 
 will maintain the current scalable behavior for large clusters with no 
 negative effect.




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786225#action_12786225
 ] 

Doug Cutting commented on MAPREDUCE-1114:
-

 I look at this as a 60% speedup in my development cycle rather than a few % 
 speedup in the full build.

I agree with this logic.  My most common development cycle is to run a single 
unit test.  For Avro this takes just a few seconds, and I'm willing to wait 
without finding a new task to work on.  With Hadoop it takes long enough that 
I switch to doing something else, lose my context, etc.  Improving this will 
significantly improve many developers' productivity.

I wonder if we can simply check whether build/ivy/lib/Hadoop-Hdfs/{common,test} 
exist and, if they do, assume they're up-to-date, and only run Ivy 
otherwise.  Might that be simpler?
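That check can be sketched as follows. The directory layout is taken from the comment above; the real build would more likely express this as an Ant condition than as Java, so this is illustration only:

```java
import java.io.File;

// Sketch of the proposed shortcut: if the resolved-library directories
// already exist, assume they are up-to-date and skip ivy:resolve.
public class IvySkipSketch {
    static boolean needsResolve(File buildDir) {
        File libs = new File(buildDir, "ivy/lib/Hadoop-Hdfs");
        return !(new File(libs, "common").isDirectory()
                && new File(libs, "test").isDirectory());
    }

    public static void main(String[] args) {
        // On a clean tree the directories are absent, so resolve must run.
        System.out.println(needsResolve(new File("build")));
    }
}
```

The trade-off is the one implied above: a stale cache is never detected, so a "clean-cache" target (or deleting build/) would be needed whenever dependencies actually change.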


 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Updated: (MAPREDUCE-1267) Fix typo in mapred-default.xml

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-1267:
---

Attachment: mapreduce-1267.txt

Should be committed to both 0.21 and trunk

 Fix typo in mapred-default.xml
 --

 Key: MAPREDUCE-1267
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1267.txt


 There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of 
 mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial 
 patch to fix.




[jira] Created: (MAPREDUCE-1267) Fix typo in mapred-default.xml

2009-12-04 Thread Todd Lipcon (JIRA)
Fix typo in mapred-default.xml
--

 Key: MAPREDUCE-1267
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.21.0, 0.22.0
 Attachments: mapreduce-1267.txt

There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of 
mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial patch 
to fix.




[jira] Updated: (MAPREDUCE-1267) Fix typo in mapred-default.xml

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-1267:
---

Status: Patch Available  (was: Open)

 Fix typo in mapred-default.xml
 --

 Key: MAPREDUCE-1267
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1267.txt


 There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of 
 mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial 
 patch to fix.




[jira] Commented: (MAPREDUCE-1155) Streaming tests swallow exceptions

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786232#action_12786232
 ] 

Todd Lipcon commented on MAPREDUCE-1155:


Chris: mind if we do that in a separate JIRA? I opened MAPREDUCE-1268. We may 
as well fix the broken tests now when there's a patch that applies and passes, 
and worry about style separately.

 Streaming tests swallow exceptions
 --

 Key: MAPREDUCE-1155
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1155
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.20.1, 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1155.txt


 Many of the streaming tests (including TestMultipleArchiveFiles) catch 
 exceptions and print their stack trace rather than failing the job. This 
 means that tests do not fail even when the job fails.




[jira] Created: (MAPREDUCE-1268) Update streaming tests to JUnit 4 style

2009-12-04 Thread Todd Lipcon (JIRA)
Update streaming tests to JUnit 4 style
---

 Key: MAPREDUCE-1268
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1268
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming, test
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Suggested by Chris in MAPREDUCE-1155




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786234#action_12786234
 ] 

Todd Lipcon commented on MAPREDUCE-1114:


Doug: the slowness is actually in the resolve task which generates the various 
classpath properties in ant. Without caching those properties to disk, there's 
no way to get around running ivy that I can think of. This patch essentially 
persists them to disk between runs, since the majority of the time they don't 
change.

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Assigned: (MAPREDUCE-576) writing to status reporter before consuming standard input causes task failure.

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned MAPREDUCE-576:
-

Assignee: Todd Lipcon

 writing to status reporter before consuming standard input causes task 
 failure.
 ---

 Key: MAPREDUCE-576
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-576
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.20.1
 Environment: amazon ec2 instance created with the given scripts 
 (fedora, small)
Reporter: Karl Anderson
Assignee: Todd Lipcon

 A Hadoop Streaming task which writes a status reporter line before consuming 
 input causes the task to fail.  Writing after consuming input does not fail.
 I caused this failure using a Python reducer and writing a 
 reporter:status:foo\n line to stderr.  Didn't try writing anything else.
 The reducer script which fails:
   #!/usr/bin/env python
   import sys
   if __name__ == "__main__":
       sys.stderr.write('reporter:status:foo\n')
       sys.stderr.flush()
       for line in sys.stdin:
           print line
 The reducer script which succeeds:
   #!/usr/bin/env python
   import sys
   if __name__ == "__main__":
       for line in sys.stdin:
           sys.stderr.write('reporter:status:foo\n')
           sys.stderr.flush()
           print line
 The hadoop invocation which I used:
 hadoop jar 
 /usr/local/hadoop-0.18.1/contrib/streaming/hadoop-0.18.1-streaming.jar 
 -mapper cat -reducer ./reducer_foo.py -input vectors -output clusters_1 
 -jobconf mapred.map.tasks=512 -jobconf mapred.reduce.tasks=512 -file 
 ./reducer_foo.py
 This is on a 64 node hadoop-ec2 cluster.
 One of the errors listed on the failures page (they all appear to be the 
 same):
 java.io.IOException: subprocess exited successfully
 R/W/S=1/0/0 in:0=1/41 [rec/s] out:0=0/41 [rec/s]
 minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
 HOST=null
 USER=root
 HADOOP_USER=null
 last Hadoop input: |null|
 last tool output: |null|
 Date: Mon Oct 20 19:13:38 EDT 2008
 MROutput/MRErrThread failed:java.lang.NullPointerException
   at 
 org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.setStatus(PipeMapRed.java:497)
   at 
 org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:429)
   at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:103)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
   at 
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 The stderr log for a failed task:
 Exception in thread Timer thread for monitoring mapred 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.xdr_string(GangliaContext.java:195)
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.emitMetric(GangliaContext.java:138)
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.emitRecord(GangliaContext.java:123)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.emitRecords(AbstractMetricsContext.java:304)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:290)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249)
   at java.util.TimerThread.mainLoop(Timer.java:512)
   at java.util.TimerThread.run(Timer.java:462)




[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786249#action_12786249
 ] 

Hadoop QA commented on MAPREDUCE-1241:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426955/mapreduce-1241.txt
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/console

This message is automatically generated.

 JobTracker should not crash when mapred-queues.xml does not exist
 -

 Key: MAPREDUCE-1241
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1241.txt, mapreduce-1241.txt


 Currently, if you bring up the JobTracker on an old configuration directory, 
 it gets a NullPointerException looking for the mapred-queues.xml file. It 
 should just assume a default queue and continue.
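The suggested behavior amounts to a missing-file fallback. A hedged Python sketch (the file name is real; the helper and its line-based parsing are stand-ins for illustration, not Hadoop's XML loader):

```python
import os

def load_queue_names(path):
    # Fall back to a single default queue instead of crashing with an
    # NPE when the config file is absent (e.g. an old conf directory).
    if not os.path.exists(path):
        return ["default"]
    with open(path) as f:                      # stand-in for the XML parser
        return [line.strip() for line in f if line.strip()]

print(load_queue_names("/old-conf/mapred-queues.xml"))  # ['default']
```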




[jira] Resolved: (MAPREDUCE-576) writing to status reporter before consuming standard input causes task failure.

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved MAPREDUCE-576.
---

Resolution: Duplicate

 writing to status reporter before consuming standard input causes task 
 failure.
 ---

 Key: MAPREDUCE-576
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-576
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.20.1
 Environment: amazon ec2 instance created with the given scripts 
 (fedora, small)
Reporter: Karl Anderson
Assignee: Todd Lipcon

 A Hadoop Streaming task which writes a status reporter line before consuming 
 input causes the task to fail.  Writing after consuming input does not fail.
 I caused this failure using a Python reducer and writing a 
 reporter:status:foo\n line to stderr.  Didn't try writing anything else.
 The reducer script which fails:
   #!/usr/bin/env python
   import sys
   if __name__ == "__main__":
       sys.stderr.write('reporter:status:foo\n')
       sys.stderr.flush()
       for line in sys.stdin:
           print line
 The reducer script which succeeds:
   #!/usr/bin/env python
   import sys
   if __name__ == "__main__":
       for line in sys.stdin:
           sys.stderr.write('reporter:status:foo\n')
           sys.stderr.flush()
           print line
 The hadoop invocation which I used:
 hadoop jar 
 /usr/local/hadoop-0.18.1/contrib/streaming/hadoop-0.18.1-streaming.jar 
 -mapper cat -reducer ./reducer_foo.py -input vectors -output clusters_1 
 -jobconf mapred.map.tasks=512 -jobconf mapred.reduce.tasks=512 -file 
 ./reducer_foo.py
 This is on a 64 node hadoop-ec2 cluster.
 One of the errors listed on the failures page (they all appear to be the 
 same):
 java.io.IOException: subprocess exited successfully
 R/W/S=1/0/0 in:0=1/41 [rec/s] out:0=0/41 [rec/s]
 minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
 HOST=null
 USER=root
 HADOOP_USER=null
 last Hadoop input: |null|
 last tool output: |null|
 Date: Mon Oct 20 19:13:38 EDT 2008
 MROutput/MRErrThread failed:java.lang.NullPointerException
   at 
 org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.setStatus(PipeMapRed.java:497)
   at 
 org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:429)
   at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:103)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
   at 
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 The stderr log for a failed task:
 Exception in thread Timer thread for monitoring mapred 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.xdr_string(GangliaContext.java:195)
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.emitMetric(GangliaContext.java:138)
   at 
 org.apache.hadoop.metrics.ganglia.GangliaContext.emitRecord(GangliaContext.java:123)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.emitRecords(AbstractMetricsContext.java:304)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:290)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50)
   at 
 org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249)
   at java.util.TimerThread.mainLoop(Timer.java:512)
   at java.util.TimerThread.run(Timer.java:462)




[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786256#action_12786256
 ] 

Todd Lipcon commented on MAPREDUCE-1241:


bq. -1 core tests. The patch failed core unit tests.

Failed org.apache.hadoop.mapred.TestMiniMRWithDFS.testWithDFSWithDefaultPort, 
which is different from the failure in the last build, and entirely unrelated.

 JobTracker should not crash when mapred-queues.xml does not exist
 -

 Key: MAPREDUCE-1241
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1241.txt, mapreduce-1241.txt


 Currently, if you bring up the JobTracker on an old configuration directory, 
 it gets a NullPointerException looking for the mapred-queues.xml file. It 
 should just assume a default queue and continue.




[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.

2009-12-04 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786259#action_12786259
 ] 

Zheng Shao commented on MAPREDUCE-1254:
---

Got it. It seems a good idea to read and check the checksum.
Will you upload a patch including a simple test case?
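The proposed check is easy to sketch. This uses Python's zlib.crc32 purely for illustration (Hadoop's ChecksumFileSystem records CRCs in its own .crc sidecar format): recompute the checksum of the bytes actually read and compare it against the recorded value, so a disk error surfaces as an IOException instead of a silently empty or corrupt job.xml.

```python
import zlib

def read_with_crc(data, expected_crc):
    # Reject the payload when its CRC disagrees with the recorded one.
    actual = zlib.crc32(data) & 0xFFFFFFFF
    if actual != expected_crc:
        raise IOError("job.xml checksum mismatch: corrupt or truncated file")
    return data

payload = b"<configuration>...</configuration>"
recorded = zlib.crc32(payload) & 0xFFFFFFFF
assert read_with_crc(payload, recorded) == payload   # intact file passes
try:
    read_with_crc(b"", recorded)                     # disk error: empty file
except IOError as e:
    print(e)
```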


 job.xml should add crc check in tasktracker and sub jvm.
 

 Key: MAPREDUCE-1254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: task, tasktracker
Affects Versions: 0.22.0
Reporter: ZhuGuanyin

 Currently job.xml in the tasktracker and sub-jvm is written to local disk 
 through ChecksumFileSystem, so crc checksum information already exists, but 
 the job.xml file is loaded without a crc check. A disk error could therefore 
 let a mapred job finish successfully but produce wrong data.  Example: the 
 tasktracker and sub-task jvm fall back to the default configuration if they 
 fail to load job.xml, which may silently replace the mapper with 
 IdentityMapper. 




[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786261#action_12786261
 ] 

Todd Lipcon commented on MAPREDUCE-1254:


Curious why the XML reading doesn't fail for an empty file. Emptiness is not 
valid XML, right?
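Todd's intuition can be checked with any conforming XML parser (Python's xml.etree here as a stand-in; whether Hadoop's Configuration loader behaves the same is the open question): an empty document has no root element and is not well-formed.

```python
import xml.etree.ElementTree as ET

def parses(text):
    # A well-formed document needs exactly one root element;
    # the empty string has none, so parsing must fail.
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(parses("<configuration/>"))  # True
print(parses(""))                  # False: "no element found"
```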

 job.xml should add crc check in tasktracker and sub jvm.
 

 Key: MAPREDUCE-1254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: task, tasktracker
Affects Versions: 0.22.0
Reporter: ZhuGuanyin

 Currently job.xml in the tasktracker and sub-jvm is written to local disk 
 through ChecksumFileSystem, so crc checksum information already exists, but 
 the job.xml file is loaded without a crc check. A disk error could therefore 
 let a mapred job finish successfully but produce wrong data.  Example: the 
 tasktracker and sub-task jvm fall back to the default configuration if they 
 fail to load job.xml, which may silently replace the mapper with 
 IdentityMapper. 




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786265#action_12786265
 ] 

Chris Douglas commented on MAPREDUCE-1114:
--

bq. Comparing the 15 second payoff to the full build time isn't particularly 
important to me. For me, the ability to quickly iterate on code while 
recompiling and rerunning unit tests is the big payoff

As a vi user, I got that. I haven't argued that the long build times are 
unimportant, but that a hack introducing a custom caching layer for classpaths 
is not, in my mind, a justifiable tradeoff in complexity. Maintaining black 
magic in the build is tedious and avoidable.

bq. the slowness is actually in the resolve task which generates the various 
classpath properties in ant

Aren't the classpaths named? Would there be a way to short-circuit the 
resolution if it created/checked for a file mapped to that path?

bq. My most common development cycle is to run a single unit test. For Avro 
this takes just a few seconds, and I'm willing to wait without finding a new 
task to work on.

As a workaround: depending on how often I'm running it, adding a {{main}} to 
the unit test is sometimes worthwhile.

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Assigned: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned MAPREDUCE-1209:
--

Assignee: Todd Lipcon

 Move common specific part of the test TestReflectionUtils out of mapred into 
 common
 ---

 Key: MAPREDUCE-1209
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1209
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: test
Reporter: Vinod K V
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0


 As commented by Tom here 
 (https://issues.apache.org/jira/browse/HADOOP-6230?focusedCommentId=12751058page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12751058),
  TestReflectionUtils has a single test testSetConf() to test backward 
 compatibility of ReflectionUtils for JobConfigurable objects. 
 TestReflectionUtils can be split into two tests - one on common and one in 
 mapred - this single test may reside in mapred till the mapred package is 
 removed.




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786268#action_12786268
 ] 

Todd Lipcon commented on MAPREDUCE-1114:


bq. Aren't the classpaths named? Would there be a way to short-circuit the 
resolution if it created/checked for a file mapped to that path?

That is exactly what this patch does...

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Updated: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common

2009-12-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-1209:
---

Fix Version/s: 0.22.0
Affects Version/s: 0.22.0
   0.21.0
   Status: Patch Available  (was: Open)

 Move common specific part of the test TestReflectionUtils out of mapred into 
 common
 ---

 Key: MAPREDUCE-1209
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1209
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: test
Affects Versions: 0.21.0, 0.22.0
Reporter: Vinod K V
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.21.0, 0.22.0

 Attachments: mapreduce-1209.txt


 As commented by Tom here 
 (https://issues.apache.org/jira/browse/HADOOP-6230?focusedCommentId=12751058page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12751058),
  TestReflectionUtils has a single test testSetConf() to test backward 
 compatibility of ReflectionUtils for JobConfigurable objects. 
 TestReflectionUtils can be split into two tests - one on common and one in 
 mapred - this single test may reside in mapred till the mapred package is 
 removed.




[jira] Commented: (MAPREDUCE-1050) Introduce a mock object testing framework

2009-12-04 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786273#action_12786273
 ] 

Konstantin Boudnik commented on MAPREDUCE-1050:
---

Well, still fails on my BSD machine. The message is 
{noformat}
TestLostTaskTracker
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.791 sec
- Standard Output ---
2009-12-04 16:43:13,581 INFO  mapred.JobTracker (JobTracker.java:init(1334)) 
- Starting jobtracker with owner as cos and supergroup as supergroup
2009-12-04 16:43:13,587 INFO  mapred.JobTracker 
(JobTracker.java:initializeTaskMemoryRelatedConfig(4086)) - Scheduler 
configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, 
limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2009-12-04 16:43:13,590 INFO  util.HostsFileReader 
(HostsFileReader.java:refresh(81)) - Refreshing hosts (include/exclude) list
2009-12-04 16:43:13,607 INFO  mapred.QueueConfigurationParser 
(QueueConfigurationParser.java:parseResource(170)) - Bad conf file: top-level 
element not queues
-  ---

Testcase: testLostTaskTrackerCalledAfterExpiryTime took 0.763 sec
Caused an ERROR
No queues defined 
java.lang.RuntimeException: No queues defined 
at 
org.apache.hadoop.mapred.QueueConfigurationParser.parseResource(QueueConfigurationParser.java:171)
at 
org.apache.hadoop.mapred.QueueConfigurationParser.loadResource(QueueConfigurationParser.java:163)
at 
org.apache.hadoop.mapred.QueueConfigurationParser.init(QueueConfigurationParser.java:92)
at 
org.apache.hadoop.mapred.QueueManager.getQueueConfigurationParser(QueueManager.java:126)
at org.apache.hadoop.mapred.QueueManager.init(QueueManager.java:146)
at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1376)
at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1325)
at 
org.apache.hadoop.mapred.TestLostTaskTracker.setUp(TestLostTaskTracker.java:58)
{noformat}

The other problem: TestLostTaskTracker is a JUnit v3 test (it extends TestCase, 
etc.). Please convert it to JUnit v4 (like the other two tests).

 Introduce a mock object testing framework
 -

 Key: MAPREDUCE-1050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1050
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-1050.patch, MAPREDUCE-1050.patch, 
 MAPREDUCE-1050.patch, MAPREDUCE-1050.patch, MAPREDUCE-1050.patch


 Using mock objects in unit tests can improve code quality (see e.g. 
 http://www.mockobjects.com/). Hadoop would benefit from having a mock object 
 framework for developers to write unit tests with. Doing so will allow a 
 wider range of failure conditions to be tested and the tests will run faster.




[jira] Updated: (MAPREDUCE-744) Support in DistributedCache to share cache files with other users after HADOOP-4493

2009-12-04 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-744:
--

Attachment: 744-early.patch

Attaching a preliminary patch for review. The cache files are checked at the 
client side for public/private access, and that information (booleans - true 
for public, false for private) is passed in the configuration. The 
TaskTrackers look at the configuration for each file during localization and, 
if the file is public, localize it to a common space; otherwise the file is 
localized to the user's private directory.
A testcase is not there yet.

 Support in DistributedCache to share cache files with other users after 
 HADOOP-4493
 ---

 Key: MAPREDUCE-744
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-744
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Vinod K V
 Attachments: 744-early.patch


 HADOOP-4493 aims to completely privatize the files distributed to TT via 
 DistributedCache. This jira issue focuses on sharing some/all of these files 
 with all other users.




[jira] Commented: (MAPREDUCE-744) Support in DistributedCache to share cache files with other users after HADOOP-4493

2009-12-04 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786279#action_12786279
 ] 

Devaraj Das commented on MAPREDUCE-744:
---

I should add that public means world-readable. The entire hierarchy of the 
cache file path is checked for that (starting from the leaf filename to /).
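A sketch of that rule in Python (illustrative only, not the actual Hadoop code): the file itself must carry the world-read bit, and every ancestor directory up to / must be world-readable and world-executable so that other users can traverse to it.

```python
import os
import stat
import tempfile

def is_publicly_readable(path):
    # The leaf file must carry the world-read bit...
    if not os.stat(path).st_mode & stat.S_IROTH:
        return False
    # ...and every ancestor directory up to / must be world-traversable.
    d = os.path.dirname(os.path.abspath(path))
    while True:
        mode = os.stat(d).st_mode
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            return False
        parent = os.path.dirname(d)
        if parent == d:            # reached the filesystem root
            return True
        d = parent

# Demo: mkdtemp creates a 0700 directory, so even a 0644 file inside
# it is private -- exactly the "check the whole hierarchy" point.
private_dir = tempfile.mkdtemp()
cache_file = os.path.join(private_dir, "part-00000")
open(cache_file, "w").close()
os.chmod(cache_file, 0o644)
print(is_publicly_readable(cache_file))  # False: parent dir is 0700
```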

 Support in DistributedCache to share cache files with other users after 
 HADOOP-4493
 ---

 Key: MAPREDUCE-744
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-744
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Vinod K V
 Attachments: 744-early.patch


 HADOOP-4493 aims to completely privatize the files distributed to TT via 
 DistributedCache. This jira issues focuses on sharing some/all of these files 
 with all other users.




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786305#action_12786305
 ] 

Chris Douglas commented on MAPREDUCE-1114:
--

Then I'm missing something. What is being cached?

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.




[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786309#action_12786309
 ] 

Todd Lipcon commented on MAPREDUCE-1114:


When the classpath is resolved, it's written out to a text file named for that 
variable. Then when it needs to be resolved again, if that file exists, it's 
loaded rather than re-resolving.
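A minimal sketch of that scheme (class, file name, and the stand-in resolver are illustrative; the actual patch does this with ant macros, not Java):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of the caching idea described above: persist the resolved classpath
// to a text file, and on later runs load that file instead of re-resolving.
public class ClasspathCache {
    static final Path CACHE = Paths.get(System.getProperty("java.io.tmpdir"),
                                        "resolved-classpath.txt");

    // Stand-in for the expensive ivy:resolve + XML-parsing step.
    static String resolveClasspath() {
        return "lib/a.jar:lib/b.jar";
    }

    static String getClasspath() throws IOException {
        if (Files.exists(CACHE)) {
            return Files.readString(CACHE).trim(); // cache hit: skip resolving
        }
        String cp = resolveClasspath();            // cache miss: resolve once...
        Files.writeString(CACHE, cp);              // ...and persist for next run
        return cp;
    }

    public static void main(String[] args) throws IOException {
        Files.deleteIfExists(CACHE);
        String first = getClasspath();   // resolves and writes the cache file
        String second = getClasspath();  // served from the cache file
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

The trade-off Chris raises below still applies: a stale cache file must be invalidated whenever the ivy configuration changes, which is extra build machinery to maintain.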

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1262) Eclipse Plugin does not build for Hadoop 0.20.1

2009-12-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1262:
-

Status: Open  (was: Patch Available)

The patch causes the build to 
[fail|http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/console],
 specifically:
{noformat}
 [exec] compile:
 [exec]  [echo] contrib: eclipse-plugin 
 [exec] [javac] Compiling 45 source files to 
/grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h6.grid.sp2.yahoo.net/trunk\
/build/contrib/eclipse-plugin/classes
 [exec] [javac] 
/grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h6.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin\

/src/java/org/apache/hadoop/eclipse/launch/HadoopApplicationLaunchShortcut.java:35:
 cannot find symbol
 [exec] [javac] symbol  : class JavaApplicationLaunchShortcut
 [exec] [javac] location: package 
org.eclipse.jdt.debug.ui.launchConfigurations
 [exec] [javac] import 
org.eclipse.jdt.debug.ui.launchConfigurations.JavaApplicationLaunchShortcut;
{noformat}

 Eclipse Plugin does not build for Hadoop 0.20.1
 ---

 Key: MAPREDUCE-1262
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1262
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0
 Environment: SLES 10, Mac OS/X 10.5.8
Reporter: Stephen Watt
 Fix For: 0.20.2, 0.21.0, 0.22.0, 0.20.1

 Attachments: hadoop-0.20.1-eclipse-plugin.jar, HADOOP-6360.patch


 When trying to run the build script for the Eclipse Plugin in 
 src/contrib/eclipse-plugin there are several errors a user receives. The 
 first error is that the eclipse.home is not set. This is easily remedied by 
 adding a value for eclipse.home in the build.properties file in the 
 eclipse-plugin directory.
 The script then states it cannot compile 
 org.apache.hadoop.eclipse.launch.HadoopApplicationLaunchShortcut because it 
 cannot resolve JavaApplicationLaunchShortcut on line 35:
   import 
 org.eclipse.jdt.internal.debug.ui.launcher.JavaApplicationLaunchShortcut;
 and then fails.
 I believe this is because there is no jar in the eclipse.home/plugins that 
 has this class in that package. I did however find it in 
 org.eclipse.jdt.debug.ui.launchConfigurations.JavaApplicationLaunchShortcut 
 which was inside in org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar in the 
 plugins dir of Eclipse 3.5
 Changing the import in the class in the source to the latter allows the build 
 to complete successfully. The M/R Perspective opens and works on my SLES 10 
 Linux environment but not on my Macbook Pro. Both are running Eclipse 3.5.
 To users wanting to do the same, I built this inside Eclipse. To do that I 
 added org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar and 
 hadoop-0.20.1-core.jar to the ant runtime configuration classpath. I also had 
 to set the version value=0.20.1 in the build.properties. You will also need 
 to copy hadoop-0.20.1-core.jar to hadoop.home/build and commons-cli-1.2.jar 
 to hadoop.home/build/ivy/lib/Hadoop/common.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786322#action_12786322
 ] 

Chris Douglas commented on MAPREDUCE-1114:
--

I thought the bulk of the problem was re-resolving these properties during the 
same run. Is that mistaken? The current proposal also works across runs, which 
could be helpful, but again: maintaining the build is already a pain. Adding a 
cache to a bad idea is a well established software engineering practice, but 
I'd favor either fixing our use of ivy or replacing it if middling performance 
requires this.

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786324#action_12786324
 ] 

Hadoop QA commented on MAPREDUCE-1174:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12426970/MAPREDUCE-1174.2.patch
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/console

This message is automatically generated.

 Sqoop improperly handles table/column names which are reserved sql words
 

 Key: MAPREDUCE-1174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: zhiyong zhang
 Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch


 In some databases it is legal to name tables and columns with terms that 
 overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such 
 cases, the database allows you to escape the table and column names. We 
 should always escape table and column names when possible.
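One way to do this, sketched below, is ANSI-SQL double-quote escaping (class and method names are illustrative, not Sqoop's actual API; real databases differ, e.g. MySQL uses backticks):

```java
// Illustrative sketch of escaping identifiers that may collide with SQL
// reserved words: wrap the identifier in ANSI double quotes, doubling any
// embedded quote characters.
public class IdentifierEscaper {
    static String escape(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // Reserved words become legal identifiers once quoted.
        System.out.println("SELECT " + escape("CREATE")
            + " FROM " + escape("table"));
        // prints: SELECT "CREATE" FROM "table"
    }
}
```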

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching

2009-12-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786342#action_12786342
 ] 

Todd Lipcon commented on MAPREDUCE-1114:


Ivy already caches the resolves done in the same run, in theory, but there are 
a lot of different resolves, I think? The gain here *is* from caching between 
runs as you surmised.

 Speed up ivy resolution in builds with clever caching
 -

 Key: MAPREDUCE-1114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: mapreduce-1114.txt, mapreduce-1114.txt, 
 mapreduce-1114.txt


 An awful lot of time is spent in the ivy:resolve parts of the build, even 
 when all of the dependencies have been fetched and cached. Profiling showed 
 this was in XML parsing. I have a sort-of-ugly hack which speeds up 
 incremental compiles (and more importantly ant test) significantly using 
 some ant macros to cache the resolved classpaths.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5

2009-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786344#action_12786344
 ] 

Hadoop QA commented on MAPREDUCE-1097:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425767/MAPREDUCE-1097.patch
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/console

This message is automatically generated.

 Changes/fixes to support Vertica 3.5
 

 Key: MAPREDUCE-1097
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
 Environment: Hadoop 0.21.0 pre-release and Vertica 3.5
Reporter: Omer Trajman
Assignee: Omer Trajman
Priority: Minor
 Fix For: 0.21.0

 Attachments: MAPREDUCE-1097.patch


 Vertica 3.5 includes three changes that the formatters should handle:
 1) deploy_design function that handles much of the logic in the optimize 
 method.  This improvement uses deploy_design if the server version supports 
 it instead of orchestrating in the formatter function.
 2) truncate table instead of recreating the table
 3) numeric, decimal, money, number types (all the same path)
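The version gate in point 1 amounts to something like the following sketch (class, method, and statement text are hypothetical, not Vertica's or the patch's actual API):

```java
// Illustrative sketch of choosing deploy_design only when the server is new
// enough to support it, falling back to formatter-side orchestration otherwise.
public class VerticaOptimize {
    static boolean supportsDeployDesign(int major, int minor) {
        // deploy_design is described as new in Vertica 3.5
        return major > 3 || (major == 3 && minor >= 5);
    }

    static String optimizeMode(int major, int minor) {
        return supportsDeployDesign(major, minor)
            ? "server-side deploy_design"   // let the server orchestrate
            : "formatter orchestration";    // pre-3.5 fallback
    }

    public static void main(String[] args) {
        System.out.println(optimizeMode(3, 5)); // prints "server-side deploy_design"
        System.out.println(optimizeMode(3, 0)); // prints "formatter orchestration"
    }
}
```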

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.