[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747242#action_12747242
 ] 

Arun C Murthy commented on MAPREDUCE-430:
-

I've been doing some thinking about the 'right' approach for handling 
exceptions and errors in the map/reduce tasks, and bounced some of these 
ideas off Chris too:

# Every code path in the tasks should propagate the exception/error upwards 
after doing any necessary clean-up in its own components and sub-components
# We should distinguish between user errors (OOM, IOException etc.) and 
systemic errors (FSError, ChecksumError etc.) and define just two methods on 
the TaskUmbilicalProtocol: userError and systemError. In future these should be 
used to _blacklist_ nodes only on 'systemError', not on 'userError'.
# Child.java:main should be the only place we call the methods on 
TaskUmbilicalProtocol to inform the parent TaskTracker about errors. It should 
unwrap the caught exception and 
# All threads (shuffle copier threads, merger threads, sort/spill threads etc.) 
should catch Throwable and save the exception for the 'main' thread to examine. 
The 'main' thread should examine these at all appropriate places and abort 
correctly.
# We should _never_ *rethrow* exceptions from the 'main' threads - rather we 
should 'wrap' them in appropriate exceptions and throw them with the right 
*initCause*.  This is so that we don't lose the original stack traces.
# We should strive to use the same 'exception' types for the 'wrapper 
exceptions' whenever the exception is part of the signature e.g. IOException 
for map/reduce in the old api and IOException and InterruptedException for 
map/reduce in the new api (it is highly unfortunate that the RPC layer wraps 
InterruptedException in an IOException today! :( ). This is very important 
since the application writer might be relying on the 'right' exception for his 
specific error-handling needs. Thus we should wrap 
IOException/InterruptedException in an IOException and other Exceptions/Errors 
in a RuntimeException.
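
The points about background threads saving their failures and the 'main' thread wrapping them with the original cause could be sketched roughly as follows. This is only an illustration in plain Java: the class and method names (WorkerFailureSketch, startWorker, checkForFailure) are hypothetical and are not part of the Hadoop code base or of any attached patch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
final class WorkerFailureSketch {

  /** First failure seen by any background thread, kept for the 'main' thread. */
  private final AtomicReference<Throwable> failure = new AtomicReference<>();

  /** Starts a background thread that records any Throwable instead of dying silently. */
  Thread startWorker(Runnable work) {
    Thread worker = new Thread(() -> {
      try {
        work.run();
      } catch (Throwable t) {
        failure.compareAndSet(null, t); // keep only the first failure
      }
    });
    worker.start();
    return worker;
  }

  /** Called by the 'main' thread at its check points; wraps, never rethrows as-is. */
  void checkForFailure() throws IOException {
    Throwable t = failure.get();
    if (t == null) {
      return;
    }
    if (t instanceof IOException || t instanceof InterruptedException) {
      IOException wrapped = new IOException("Worker thread failed");
      wrapped.initCause(t); // the original stack trace stays reachable via getCause()
      throw wrapped;
    }
    // Other Exceptions and Errors (e.g. OutOfMemoryError) go into a RuntimeException.
    throw new RuntimeException("Worker thread failed", t);
  }
}
```

In this sketch the caller decides where the check points are; a real task would presumably call something like checkForFailure() between shuffle, merge and sort/spill phases.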

Thoughts?

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with an OutOfMemory error, stuck in cleanup.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)
TestTaskFail fail sometimes
---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1


TestTaskFail fails sometimes with the following error:
junit.framework.AssertionFailedError
  at org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
  at org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Updated: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-911:
--

Attachment: faillog.txt

Attaching test failure log

 TestTaskFail fail sometimes
 ---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: faillog.txt


 TestTaskFail fails sometimes with the following error:
 junit.framework.AssertionFailedError
   at org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
   at org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Commented: (MAPREDUCE-875) Make DBRecordReader execute queries lazily

2009-08-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747263#action_12747263
 ] 

Enis Soztutar commented on MAPREDUCE-875:
-

+1

 Make DBRecordReader execute queries lazily
 --

 Key: MAPREDUCE-875
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-875
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-875.2.patch, MAPREDUCE-875.patch


 DBInputFormat's DBRecordReader executes the user's SQL query in the 
 constructor. If the query is long-running, this can cause task timeout. The 
 user is unable to spawn a background thread (e.g., in a MapRunnable) to 
 inform Hadoop of on-going progress. 
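
The idea of lazy execution described above can be sketched in plain Java as follows. This is not the attached patch: LazyRecordReader and its Supplier argument are hypothetical stand-ins, with the Supplier playing the role of the JDBC query that DBRecordReader currently runs in its constructor.

```java
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical sketch of deferring an expensive query until the first read. */
final class LazyRecordReader<T> {
  private final Supplier<Iterator<T>> query; // stands in for running the SQL query
  private Iterator<T> results;               // null until the query has run

  LazyRecordReader(Supplier<Iterator<T>> query) {
    this.query = query; // the constructor only stores the query; nothing runs here
  }

  /** Returns the next record, or null when the results are exhausted. */
  T next() {
    if (results == null) {
      results = query.get(); // the first call pays the query cost
    }
    return results.hasNext() ? results.next() : null;
  }
}
```

Because the constructor returns immediately, the caller gets a chance to start a progress-reporting thread before the first next() call blocks on the database.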




[jira] Commented: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747260#action_12747260
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-911:
---

TestTaskFail fails the first 2 attempts and passes in the third attempt. 
The logs show that before the 2nd attempt could get a chance to run, the 
third attempt is launched speculatively, which leads to job completion.
So, speculative execution should be disabled for the test.
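
A minimal sketch of what disabling speculation amounts to, using java.util.Properties as a stand-in for the job configuration. The two property names are the ones listed in the 0.20-era mapred-default.xml; the class and method names are hypothetical, and the attached patch may well do this differently (for example through JobConf.setSpeculativeExecution(false)).

```java
import java.util.Properties;

/** Hypothetical sketch; Properties stands in for the job configuration. */
final class NoSpeculationSketch {
  static Properties disableSpeculation(Properties conf) {
    // Turn off speculative attempts for both map and reduce tasks.
    conf.setProperty("mapred.map.tasks.speculative.execution", "false");
    conf.setProperty("mapred.reduce.tasks.speculative.execution", "false");
    return conf;
  }
}
```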

 TestTaskFail fail sometimes
 ---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: faillog.txt


 TestTaskFail fails sometimes with the following error:
 junit.framework.AssertionFailedError
   at org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
   at org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Commented: (MAPREDUCE-318) Refactor reduce shuffle code

2009-08-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747268#action_12747268
 ] 

Arun C Murthy commented on MAPREDUCE-318:
-

bq. Document rationale of why all fetch threads will never stall i.e. will 
allow one thread to go past always

The patch looks fine. Could you comment on the deadlocked shuffle you saw 
once? I see you've reverted to the 'original' patch with respect to the 
fetcher threads and stalling.

 Refactor reduce shuffle code
 

 Key: MAPREDUCE-318
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-318
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HADOOP-5233_api.patch, HADOOP-5233_part0.patch, 
 mapred-318-14Aug.patch, mapred-318-20Aug.patch, mapred-318-24Aug.patch, 
 mapred-318-common.patch


 The reduce shuffle code has become very complex and entangled. I think we 
 should move it out of ReduceTask and into a separate package 
 (org.apache.hadoop.mapred.task.reduce). Details to follow.




[jira] Updated: (MAPREDUCE-873) Simplify Job Recovery

2009-08-25 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-873:
-

Attachment: 873_v2.patch

Patch for review. It does the following:
- Recovery no longer depends on job history. The logic to replay history events is 
removed.
- Jobs are recovered based on the job files present in the mapred system dir.
- The job info file containing the job tracker restart count is retained, as it is 
required to avoid task attempt id clashes for recovered jobs.
- When the job tracker comes up, the job history files from the last run are moved to 
mapred.job.tracker.history.completed.location with the suffix "." + 
jtIdentifier + ".old" appended. This is done to avoid overwriting the history files for 
recovered jobs.
- TestJobTrackerSafeMode, TestJobTrackerRestart and 
TestJobTrackerRestartWithLostTracker are removed. 
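
As a small illustration of the renaming rule for old history files, assuming the suffix is a dot, the job tracker identifier, and ".old" appended to the existing file name. The helper name and the sample file name below are hypothetical and not taken from the patch.

```java
/** Hypothetical sketch of the naming rule for history files from the last run. */
final class OldHistoryNameSketch {
  static String oldHistoryFileName(String historyFileName, String jtIdentifier) {
    // Append "." + jtIdentifier + ".old" so a recovered job's new history
    // file does not overwrite the one from the previous job tracker run.
    return historyFileName + "." + jtIdentifier + ".old";
  }
}
```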

 Simplify Job Recovery
 -

 Key: MAPREDUCE-873
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-873
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Devaraj Das
Assignee: Sharad Agarwal
 Fix For: 0.21.0

 Attachments: 873_v1.patch, 873_v2.patch


 On a couple of occasions we have seen the JobTracker not being able to handle 
 job recovery well, and leading to cluster downtime after a restart. The 
 current design for handling job recovery is complex and prone to corner cases 
 not being handled well enough. In retrospect, it seems like the transaction 
 log based approach as was proposed on HADOOP-3245 
 (http://tinyurl.com/luh9hb), would have been a better/simpler model. However, 
 that is a big project, and it seems for the medium term, just handling job 
 re-submissions after a restart is a good tradeoff. That is, the JobTracker 
 after getting restarted, will resubmit all jobs that were running in its past 
 life. They will all start from the beginning (downside is completed tasks 
 will reexecute). In the long term, the transaction log model or some variant 
 of that should be pursued.
 Thoughts/comments welcome.




[jira] Updated: (MAPREDUCE-873) Simplify Job Recovery

2009-08-25 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-873:
-

Hadoop Flags: [Incompatible change]
  Status: Patch Available  (was: Open)

 Simplify Job Recovery
 -

 Key: MAPREDUCE-873
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-873
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Devaraj Das
Assignee: Sharad Agarwal
 Fix For: 0.21.0

 Attachments: 873_v1.patch, 873_v2.patch


 On a couple of occasions we have seen the JobTracker not being able to handle 
 job recovery well, and leading to cluster downtime after a restart. The 
 current design for handling job recovery is complex and prone to corner cases 
 not being handled well enough. In retrospect, it seems like the transaction 
 log based approach as was proposed on HADOOP-3245 
 (http://tinyurl.com/luh9hb), would have been a better/simpler model. However, 
 that is a big project, and it seems for the medium term, just handling job 
 re-submissions after a restart is a good tradeoff. That is, the JobTracker 
 after getting restarted, will resubmit all jobs that were running in its past 
 life. They will all start from the beginning (downside is completed tasks 
 will reexecute). In the long term, the transaction log model or some variant 
 of that should be pursued.
 Thoughts/comments welcome.




[jira] Commented: (MAPREDUCE-885) More efficient SQL queries for DBInputFormat

2009-08-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747277#action_12747277
 ] 

Enis Soztutar commented on MAPREDUCE-885:
-

Data driven splits are really neat. Just a few suggestions: 
- We can add a getSplitter(int sqlDataType) method to DDDBIF and move the sql type 
-> DBSplitter instance mapping there, so that classes extending it can easily 
override this logic, for skewed data, etc. 
- Introduce DDDBRR extending DBRR in DDDBIF and move getDataBasedSelectQuery() 
as an overridden implementation of getSelectQuery(). 
- Do we need mapred.lib.db.DDDBIF, since it is introduced as deprecated? I know 
that lots of legacy code is using the old API, but adding an already deprecated 
class seems odd. 
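
The first suggestion could look roughly like the sketch below. Only the getSplitter(int sqlDataType) signature comes from the comment; the Splitter interface, the class name and the splitter labels are hypothetical placeholders for the DBSplitter classes in the patch. The type codes are the standard java.sql.Types constants.

```java
import java.sql.Types;

/** Hypothetical stand-in for the DBSplitter type discussed above. */
interface Splitter {
  String name();
}

/** Hypothetical sketch of an overridable SQL type -> splitter mapping. */
class SplitterFactorySketch {
  /** Maps a java.sql.Types constant to a splitter; subclasses may override. */
  protected Splitter getSplitter(int sqlDataType) {
    switch (sqlDataType) {
      case Types.INTEGER:
      case Types.BIGINT:
        return () -> "integer";
      case Types.CHAR:
      case Types.VARCHAR:
        return () -> "text";
      case Types.DATE:
      case Types.TIMESTAMP:
        return () -> "date";
      default:
        return () -> "single-split"; // fallback: do not split on unknown types
    }
  }
}
```

A subclass that knows its integer keys are skewed can then override getSplitter for Types.INTEGER only and delegate everything else to super.getSplitter.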


 More efficient SQL queries for DBInputFormat
 

 Key: MAPREDUCE-885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-885
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-885.patch


 DBInputFormat generates InputSplits by counting the available rows in a 
 table, and selecting subsections of the table via the LIMIT and OFFSET 
 SQL keywords. These are only meaningful in an ordered context, so the query 
 also includes an ORDER BY clause on an index column. The resulting queries 
 are often inefficient and require full table scans. Actually using multiple 
 mappers with these queries can lead to O(n^2) behavior in the database, where 
 n is the number of splits. Attempting to use parallelism with these queries 
 is counter-productive.
 A better mechanism is to organize splits based on data values themselves, 
 which can be performed in the WHERE clause, allowing for index range scans of 
 tables, and can better exploit parallelism in the database.
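
To make the contrast with LIMIT/OFFSET concrete, here is a minimal sketch of value-based splitting for an integer column. It assumes the column's minimum and maximum are already known (for example from a separate MIN/MAX query); the class and method names are hypothetical and this is not the code in the attached patch. Table and column names are concatenated without escaping, which is acceptable only in a sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one WHERE-clause range query per split. */
final class RangeSplitSketch {
  static List<String> splitQueries(String table, String column,
                                   long min, long max, int numSplits) {
    if (numSplits < 1) {
      throw new IllegalArgumentException("numSplits must be >= 1");
    }
    List<String> queries = new ArrayList<>();
    long span = max - min + 1;
    long step = Math.max(1, (span + numSplits - 1) / numSplits); // ceiling division
    for (long lo = min; lo <= max; lo += step) {
      long hi = Math.min(lo + step, max + 1); // exclusive upper bound
      queries.add("SELECT * FROM " + table
          + " WHERE " + column + " >= " + lo + " AND " + column + " < " + hi);
    }
    return queries;
  }
}
```

Each query touches a disjoint key range, so the database can answer it with an index range scan instead of ordering and skipping rows.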




[jira] Updated: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-911:
--

Attachment: patch-911.txt
patch-911-0.20.txt

Patch for branch 0.20 and trunk

 TestTaskFail fail sometimes
 ---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: faillog.txt, patch-911-0.20.txt, patch-911.txt


 TestTaskFail fails sometimes with the following error:
 junit.framework.AssertionFailedError
   at org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
   at org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: jobtracker_configurationdump.txt

Uploading the file that contains the configuration dump produced when the 
-dumpConfiguration option is given.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: MAPREDUCE-768-5.patch

Re-uploading the patch for Hudson to pick it up.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Commented: (MAPREDUCE-875) Make DBRecordReader execute queries lazily

2009-08-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747314#action_12747314
 ] 

Hadoop QA commented on MAPREDUCE-875:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12416939/MAPREDUCE-875.2.patch
  against trunk revision 807165.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/513/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/513/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/513/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/513/console

This message is automatically generated.

 Make DBRecordReader execute queries lazily
 --

 Key: MAPREDUCE-875
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-875
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-875.2.patch, MAPREDUCE-875.patch


 DBInputFormat's DBRecordReader executes the user's SQL query in the 
 constructor. If the query is long-running, this can cause task timeout. The 
 user is unable to spawn a background thread (e.g., in a MapRunnable) to 
 inform Hadoop of on-going progress. 




[jira] Assigned: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna reassigned MAPREDUCE-768:
---

Assignee: V.V.Chaitanya Krishna

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-768:
---

Status: Patch Available  (was: Open)

+1 for code changes. Trying Hudson.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747325#action_12747325
 ] 

V.V.Chaitanya Krishna commented on MAPREDUCE-768:
-

All the tests passed locally, i.e., for contrib and core. 

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-08-25 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Attachment: MAPREDUCE-856-20090825.3.txt

Updated patch, syncing with the latest patches at MAPREDUCE-476 and MAPREDUCE-871, 
which are blockers for this issue, the latter also because of clashing changes. 

Included all of the above review comments except one.

The one related to cleanup of stale user directories is not going to be done as 
part of this issue. As the implementation went on, it became more and more 
complicated; the latest complexity is that cleanup of distributed cache files 
only removes the files and leaves a lot of empty path components behind. It is 
still doable but stretches the scope of the current issue a bit. Stale dirs 
cleanup can be done as part of a follow-up issue.
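
For reference, the empty-path-component problem can be illustrated on a local file system with java.nio.file. This is only a sketch of the idea for a possible follow-up: the class and method names are hypothetical, it does not use Hadoop's FileSystem API, and it ignores races with concurrent localization that a real TaskTracker would have to handle.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: delete a cached file and prune now-empty parent dirs. */
final class EmptyDirPrunerSketch {
  /** Deletes file, then removes empty parents up to (but not including) root. */
  static void deleteAndPrune(Path file, Path root) throws IOException {
    Files.deleteIfExists(file);
    Path dir = file.getParent();
    while (dir != null && dir.startsWith(root) && !dir.equals(root) && isEmpty(dir)) {
      Files.delete(dir);
      dir = dir.getParent(); // walk one path component up
    }
  }

  private static boolean isEmpty(Path dir) throws IOException {
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      return !entries.iterator().hasNext();
    }
  }
}
```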

 Localized files from DistributedCache should have right access-control
 --

 Key: MAPREDUCE-856
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
 MAPREDUCE-856-20090825.3.txt







[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-08-25 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747330#action_12747330
 ] 

Vinod K V commented on MAPREDUCE-856:
-

Running tests locally as the blockers are not yet checked in.

 Localized files from DistributedCache should have right access-control
 --

 Key: MAPREDUCE-856
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
 MAPREDUCE-856-20090825.3.txt







[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747334#action_12747334
 ] 

Devaraj Das commented on MAPREDUCE-430:
---

I'd be happier with the previous approach for 0.20. I believe it doesn't break 
the existing behavior (I will look at the last patch in more detail today). 
This seems like a good first step towards a probably better fix for the issue 
as is proposed in the last comment. The last proposal seems like a much bigger 
change and I propose that we thrash that design/implementation out as part of 
0.21. Thoughts?

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with an OutOfMemory error, stuck in cleanup.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: MAPREDUCE-768-6.patch

The mapred-default.xml contains two properties related to queues, which are 
also present in QueueManager's xml file. Uploading patch with these properties 
removed from mapred-default.xml to prevent duplication.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-6.patch, 
 MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-768:
-

Status: Open  (was: Patch Available)

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, 
 MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, 
 MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-6.patch, 
 MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: commands_manual.pdf

Uploading the documentation in PDF format.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: MAPREDUCE-768-7.patch

Uploading the patch with the changes needed for documentation.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768-7.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-894) DBInputformat not working with SQLServer

2009-08-25 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated MAPREDUCE-894:


Affects Version/s: 0.21.0
   Issue Type: New Feature  (was: Bug)

 DBInputformat not working with SQLServer
 

 Key: MAPREDUCE-894
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-894
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 0.21.0
Reporter: Budianto Lie
 Attachments: MAPREDUCE-894.patch


 org.apache.hadoop.mapreduce.lib.db.DBInputFormat does not work with
 Microsoft SQLServer, because SQLServer doesn't support LIMIT and OFFSET.
 Fix:
 Based on MAPREDUCE-716, I have already implemented a fix by creating a new class
 org.apache.hadoop.mapreduce.lib.db.MsSqlDBRecordReader
 and modifying the class org.apache.hadoop.mapreduce.lib.db.DBInputFormat.
 Note: this fix works only with SQLServer 2005 or higher.
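
The description does not say how the new record reader pages through rows. One common way to emulate LIMIT and OFFSET on SQL Server 2005 or higher is the ROW_NUMBER() window function, which was introduced in that release. The sketch below is only an illustration under that assumption; the class name, method name and table name are hypothetical and are not taken from the attached patch.

```java
class RowNumberPaging {
    /**
     * Builds a query that selects rows (offset, offset + length] of a table,
     * ordered by the given column, without using LIMIT or OFFSET.
     * Identifiers are concatenated as-is, so callers must pass trusted names.
     */
    static String pagedQuery(String table, String orderBy, long offset, long length) {
        return "SELECT * FROM ("
            + "SELECT ROW_NUMBER() OVER (ORDER BY " + orderBy + ") AS rn, t.* FROM " + table + " t"
            + ") AS numbered WHERE rn > " + offset + " AND rn <= " + (offset + length);
    }

    public static void main(String[] args) {
        // Rows 21..30 of a hypothetical "employees" table ordered by id.
        System.out.println(pagedQuery("employees", "id", 20, 10));
    }
}
```

The inner query numbers the rows once and the outer query filters on that number, which plays the role that LIMIT and OFFSET play on databases that support them.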




[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747391#action_12747391
 ] 

Devaraj Das commented on MAPREDUCE-430:
---

The code {noformat}
 LOG.fatal(STRING, throwable);
 try {
   umbilical.fatalError(getTaskID(), throwable);
 } catch (IOException ioe) {
   LOG.fatal("Failed to contact the tasktracker", throwable);
   System.exit(-1);
 }
{noformat} can be factored out into a method. Other than that, the patch looks okay.
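
A minimal sketch of what such a helper might look like. It is not the patch's code: the `Umbilical` interface is a hypothetical stand-in for TaskUmbilicalProtocol, java.util.logging replaces the task's own logger, and the helper returns a flag instead of calling System.exit(-1) so that it can be exercised in isolation.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class FatalErrorReporter {
    /** Hypothetical stand-in for the umbilical call used in the snippet above. */
    interface Umbilical {
        void fatalError(String taskId, Throwable throwable) throws IOException;
    }

    private static final Logger LOG = Logger.getLogger("FatalErrorReporter");

    /**
     * Logs the error and reports it to the parent.
     * Returns true if the parent was notified, false if it could not be reached.
     */
    static boolean reportFatalError(Umbilical umbilical, String taskId,
                                    String logMessage, Throwable throwable) {
        LOG.log(Level.SEVERE, logMessage, throwable);
        try {
            umbilical.fatalError(taskId, throwable);
            return true;
        } catch (IOException ioe) {
            LOG.log(Level.SEVERE, "Failed to contact the tasktracker", ioe);
            return false;
        }
    }
}
```

Each catch block would then shrink to a single call, with the caller exiting with -1 when the helper returns false.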

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with OutOfMemory error, stuck in cleanup.




[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747394#action_12747394
 ] 

V.V.Chaitanya Krishna commented on MAPREDUCE-768:
-

tests and test-patch ran successfully.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768-7.patch, MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler

2009-08-25 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-824:
---

Status: Patch Available  (was: Open)

Trying hudson.

 Support a hierarchy of queues in the capacity scheduler
 ---

 Key: MAPREDUCE-824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/capacity-sched
Reporter: Hemanth Yamijala
 Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, 
 HADOOP-824-3.patch, HADOOP-824-4.patch, HADOOP-824-5.patch, 
 MAPREDUCE-824-6.patch


 Currently in Capacity Scheduler, cluster capacity is divided among the queues 
 based on the queue capacity. These queues typically represent an organization 
 and the capacity of the queue represents the capacity the organization is 
 entitled to. Most organizations are large and need to divide their capacity 
 among sub-organizations they have. Or they may want to divide the capacity 
 based on a category or type of jobs they run. This JIRA covers the 
 requirements and other details to provide the above feature.




[jira] Updated: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler

2009-08-25 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-824:
---

Attachment: MAPREDUCE-824-6.patch

Attached patch incorporates all comments I raised, except the following:
- JobQueue.removeWaitingJob is used by JobInitializationPoller. Hence, retained 
the access level as package private.
- I also did not fold methods in JobQueue and JobQueuesManager since most 
seemed to be used at more than one place.

In addition to the comments, Rahul and I also found and fixed a bug in 
getMaxCapacity() and changed the default value of the same to -1 in the 
capacity-scheduler.xml.template.

Rahul, can you please take a look at the changes and make sure you are fine 
with them?

 Support a hierarchy of queues in the capacity scheduler
 ---

 Key: MAPREDUCE-824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/capacity-sched
Reporter: Hemanth Yamijala
 Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, 
 HADOOP-824-3.patch, HADOOP-824-4.patch, HADOOP-824-5.patch, 
 MAPREDUCE-824-6.patch


 Currently in Capacity Scheduler, cluster capacity is divided among the queues 
 based on the queue capacity. These queues typically represent an organization 
 and the capacity of the queue represents the capacity the organization is 
 entitled to. Most organizations are large and need to divide their capacity 
 among sub-organizations they have. Or they may want to divide the capacity 
 based on a category or type of jobs they run. This JIRA covers the 
 requirements and other details to provide the above feature.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread V.V.Chaitanya Krishna (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

V.V.Chaitanya Krishna updated MAPREDUCE-768:


Attachment: MAPREDUCE-768-ydist.patch

Uploading patch for the internal Yahoo! distribution.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768-7.patch, MAPREDUCE-768-ydist.patch, 
 MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747419#action_12747419
 ] 

Hemanth Yamijala commented on MAPREDUCE-768:


Manual tests have been run to verify the command line works as expected. 
Regression testing has been done with respect to starting the JT without the 
command line option and verifying it starts up as usual. Jobs submitted run 
fine.

On the basis of these tests, I will commit the patch to trunk.

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768-7.patch, MAPREDUCE-768-ydist.patch, 
 MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.

2009-08-25 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-768:
---

   Resolution: Fixed
Fix Version/s: 0.21.0
 Release Note: 
Provides the ability to dump the jobtracker configuration in JSON format to 
standard output and exit.
To dump, use hadoop jobtracker -dumpConfiguration
The format of the dump is 
{"properties":[{"key":<key>,"value":<value>,"isFinal":<true/false>,"resource":<resource>}]}
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I committed this to trunk. Thanks, Chaitanya !
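
For readers who want to see the shape of the dump described in the release note, here is a small sketch that emits the same layout: a JSON object with a single properties array whose entries carry key, value, isFinal and resource. It is an illustration only and not the committed code; the class names, the sample property and the assumption that key, value and resource are JSON strings are mine.

```java
import java.util.ArrayList;
import java.util.List;

class ConfigDumpSketch {
    /** One configuration entry as it appears in the dump. */
    static final class Property {
        final String key;
        final String value;
        final boolean isFinal;
        final String resource;

        Property(String key, String value, boolean isFinal, String resource) {
            this.key = key;
            this.value = value;
            this.isFinal = isFinal;
            this.resource = resource;
        }
    }

    /** Quotes a string for JSON; only backslashes and double quotes are escaped here. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders the properties in the layout given in the release note. */
    static String dump(List<Property> props) {
        StringBuilder sb = new StringBuilder("{\"properties\":[");
        for (int i = 0; i < props.size(); i++) {
            Property p = props.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"key\":").append(quote(p.key))
              .append(",\"value\":").append(quote(p.value))
              .append(",\"isFinal\":").append(p.isFinal)
              .append(",\"resource\":").append(quote(p.resource))
              .append('}');
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        List<Property> props = new ArrayList<>();
        props.add(new Property("mapred.job.tracker", "localhost:9001", false, "mapred-site.xml"));
        System.out.println(dump(props));
    }
}
```

A real implementation would use a JSON library and escape control characters as well; the hand-rolled quoting here is only enough to show the structure.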

 Configuration information should generate dump in a standard format.
 

 Key: MAPREDUCE-768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: rahul k singh
Assignee: V.V.Chaitanya Krishna
 Fix For: 0.21.0

 Attachments: commands_manual.pdf, jobtracker_configurationdump.txt, 
 MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, 
 MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, 
 MAPREDUCE-768-6.patch, MAPREDUCE-768-7.patch, MAPREDUCE-768-ydist.patch, 
 MAPREDUCE-768.patch


  We need to generate the configuration dump in a standard format.




[jira] Commented: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747454#action_12747454
 ] 

Hadoop QA commented on MAPREDUCE-911:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12417581/patch-911.txt
  against trunk revision 807543.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 202 release audit warnings 
(more than the trunk's current 0 warnings).

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/515/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/515/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/515/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/515/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/515/console

This message is automatically generated.

 TestTaskFail fail sometimes
 ---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: faillog.txt, patch-911-0.20.txt, patch-911.txt


 TestTaskFail fails sometimes with the following error:
 junit.framework.AssertionFailedError
   at 
 org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
   at 
 org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Updated: (MAPREDUCE-906) Updated Sqoop documentation

2009-08-25 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-906:


Status: Patch Available  (was: Open)

 Updated Sqoop documentation
 ---

 Key: MAPREDUCE-906
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-906
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.patch


 Here's the latest documentation for Sqoop, in both user-guide and manpage 
 form. Built with asciidoc.




[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747506#action_12747506
 ] 

Aaron Kimball commented on MAPREDUCE-430:
-

Arun,

Can you elaborate on the difference between user and system errors? I can 
imagine IOException getting thrown from within Hadoop internals just as much as 
from within a user's map() method. The distinction you mentioned above seems 
somewhat arbitrary to me at first glance.

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with OutOfMemory error, stuck in cleanup.




[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747544#action_12747544
 ] 

Arun C Murthy commented on MAPREDUCE-430:
-

I guess we should continue to debate the semantics of exception-handling in a 
separate jira - and get this in as an emergency fix. 

Wrt the patch, I'd be happier with 2 changes:

# Don't rethrow exceptions:
# 'Unwrap' to get the 'cause' in Child.java 
(http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Throwable.html#getCause%28%29).
 
Thoughts?

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with OutOfMemory error, stuck in cleanup.




[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747546#action_12747546
 ] 

Arun C Murthy commented on MAPREDUCE-430:
-

To clarify: both catch clauses (Exception and Throwable) should do the unwrap 
via getCause() and use that if it's not null to report to TT, log etc.
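
The unwrap step described here can be sketched in a few lines. The class and method names are hypothetical; only Throwable.getCause() is a real API.

```java
class CauseUnwrapper {
    /**
     * Returns the wrapped cause if there is one, otherwise the caught throwable itself.
     * Only one level is unwrapped, as in the comment above.
     */
    static Throwable unwrap(Throwable caught) {
        Throwable cause = caught.getCause();
        return (cause != null) ? cause : caught;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("wrapper", new OutOfMemoryError("Java heap space"));
        // The cause, not the wrapper, is what would be logged and reported to the TT.
        System.out.println(unwrap(wrapped));
    }
}
```

Both the Exception and the Throwable catch clauses could call the same helper before logging and reporting.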

 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with OutOfMemory error, stuck in cleanup.




[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-08-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747550#action_12747550
 ] 

Arun C Murthy commented on MAPREDUCE-430:
-

{noformat}
+  /** Report that the task encountered a fatal error.*/
+  void fatalError(TaskAttemptID taskId, Throwable throwable) throws IOException;
+  
{noformat}

is wrong - it should be:

{noformat}
+  /** Report that the task encountered a fatal error.*/
+  void fatalError(TaskAttemptID taskId, String message) throws IOException;
+  
{noformat}
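
With a String parameter, the caller has to flatten the Throwable into text before making the call. The comment does not say how; the sketch below shows one way, using only the JDK. The class and method names are hypothetical. Hadoop's own org.apache.hadoop.util.StringUtils.stringifyException serves the same purpose.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class FatalErrorMessage {
    /** Renders a throwable, including its stack trace and causes, as a single string. */
    static String stringify(Throwable throwable) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        throwable.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        Throwable error = new RuntimeException("task failed", new OutOfMemoryError("Java heap space"));
        // The resulting text is what would be passed as the String message.
        System.out.println(stringify(error));
    }
}
```

Because printStackTrace also prints the "Caused by:" chain, the original stack trace survives the conversion to a String.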


 Task stuck in cleanup with OutOfMemoryErrors
 

 Key: MAPREDUCE-430
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-430-v1.11.patch, 
 MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, 
 MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, 
 MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch


 Observed a task with OutOfMemory error, stuck in cleanup.




[jira] Commented: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2

2009-08-25 Thread gary murry (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747576#action_12747576
 ] 

gary murry commented on MAPREDUCE-767:
--

Hi guys,  Can we get a comment on why no new unit tests were needed?  Thanks.

 to remove mapreduce dependency on commons-cli2
 --

 Key: MAPREDUCE-767
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-767
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
Affects Versions: 0.20.1
Reporter: Giridharan Kesavan
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-767-v1.1.patch, MAPREDUCE-767-v1.2.patch, 
 MAPREDUCE-767-v1.3-branch-0.20.patch, MAPREDUCE-767-v1.3.patch


 mapreduce, streaming and the eclipse plugin depend on commons-cli2 




[jira] Reopened: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2

2009-08-25 Thread Lee Tucker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lee Tucker reopened MAPREDUCE-767:
--


It looks like the commons-cli-2.0-SNAPSHOT.jar wasn't deleted.  Please fix the 
commit.

 to remove mapreduce dependency on commons-cli2
 --

 Key: MAPREDUCE-767
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-767
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
Affects Versions: 0.20.1
Reporter: Giridharan Kesavan
Assignee: Amar Kamat
 Fix For: 0.20.1

 Attachments: MAPREDUCE-767-v1.1.patch, MAPREDUCE-767-v1.2.patch, 
 MAPREDUCE-767-v1.3-branch-0.20.patch, MAPREDUCE-767-v1.3.patch


 mapreduce, streaming and the eclipse plugin depend on commons-cli2 




[jira] Updated: (MAPREDUCE-892) command line tool to list all tasktrackers and their status

2009-08-25 Thread Dmytro Molkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Molkov updated MAPREDUCE-892:


Attachment: 0001-First-iteration-of-adding-the-task-tracker-reporting.patch

This patch adds a new command line argument to the JobClient: 
-list-trackers-info
It is an improvement on -list-active-trackers/-list-blacklisted-trackers that 
actually prints out information about a number of tasks currently running on 
the tracker, task progress, maximum task capacity, etc.

Also added a jsp page that returns all the same info in JSON format.

 command line tool to list all tasktrackers and their status
 ---

 Key: MAPREDUCE-892
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Dmytro Molkov
 Attachments: 
 0001-First-iteration-of-adding-the-task-tracker-reporting.patch


 The hadoop mradmin -report could list all the tasktrackers that the 
 JobTracker knows about. It will also list a brief status summary for each 
 TaskTracker. (This is similar to the hadoop dfsadmin -report command that 
 lists all Datanodes)




[jira] Updated: (MAPREDUCE-885) More efficient SQL queries for DBInputFormat

2009-08-25 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-885:


Status: Open  (was: Patch Available)

 More efficient SQL queries for DBInputFormat
 

 Key: MAPREDUCE-885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-885
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-885.2.patch, MAPREDUCE-885.patch


 DBInputFormat generates InputSplits by counting the available rows in a 
 table, and selecting subsections of the table via the LIMIT and OFFSET 
 SQL keywords. These are only meaningful in an ordered context, so the query 
 also includes an ORDER BY clause on an index column. The resulting queries 
 are often inefficient and require full table scans. Actually using multiple 
 mappers with these queries can lead to O(n^2) behavior in the database, where 
 n is the number of splits. Attempting to use parallelism with these queries 
 is counter-productive.
 A better mechanism is to organize splits based on data values themselves, 
 which can be performed in the WHERE clause, allowing for index range scans of 
 tables, and can better exploit parallelism in the database.
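
To make the contrast concrete, the sketch below generates both kinds of split queries. It is an illustration under stated assumptions, not the attached patch: the class and method names are hypothetical, and the split column is assumed to be an integer with a known minimum and maximum.

```java
import java.util.ArrayList;
import java.util.List;

class SplitQuerySketch {
    /** LIMIT/OFFSET splits: every query orders the table and skips the rows before its offset. */
    static List<String> offsetSplits(String table, String orderCol, long rowCount, int numSplits) {
        List<String> queries = new ArrayList<>();
        long chunk = (rowCount + numSplits - 1) / numSplits; // rows per split, rounded up
        for (long offset = 0; offset < rowCount; offset += chunk) {
            queries.add("SELECT * FROM " + table + " ORDER BY " + orderCol
                + " LIMIT " + chunk + " OFFSET " + offset);
        }
        return queries;
    }

    /** Value-range splits: every query is a WHERE clause that an index range scan can serve. */
    static List<String> rangeSplits(String table, String col, long min, long max, int numSplits) {
        List<String> queries = new ArrayList<>();
        long span = max - min + 1;
        long chunk = (span + numSplits - 1) / numSplits; // values per split, rounded up
        for (long lo = min; lo <= max; lo += chunk) {
            long hi = Math.min(lo + chunk, max + 1); // exclusive upper bound
            queries.add("SELECT * FROM " + table + " WHERE " + col + " >= " + lo
                + " AND " + col + " < " + hi);
        }
        return queries;
    }

    public static void main(String[] args) {
        offsetSplits("orders", "id", 10, 3).forEach(System.out::println);
        rangeSplits("orders", "id", 1, 10, 3).forEach(System.out::println);
    }
}
```

In the first form each mapper's query still has to walk past all earlier rows, which is where the quadratic behavior comes from; in the second form each query touches only its own range of values.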




[jira] Updated: (MAPREDUCE-885) More efficient SQL queries for DBInputFormat

2009-08-25 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-885:


Attachment: MAPREDUCE-885.2.patch

Attaching new patch that incorporates Enis's comments.

Enis -- I've made the change to add DDDBRR extending DBRR and added a 
MySQLDDDBRR on top of that. I think it's cleaner this way after all.

 More efficient SQL queries for DBInputFormat
 

 Key: MAPREDUCE-885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-885
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-885.2.patch, MAPREDUCE-885.patch


 DBInputFormat generates InputSplits by counting the available rows in a 
 table, and selecting subsections of the table via the LIMIT and OFFSET 
 SQL keywords. These are only meaningful in an ordered context, so the query 
 also includes an ORDER BY clause on an index column. The resulting queries 
 are often inefficient and require full table scans. Actually using multiple 
 mappers with these queries can lead to O(n^2) behavior in the database, where 
 n is the number of splits. Attempting to use parallelism with these queries 
 is counter-productive.
 A better mechanism is to organize splits based on data values themselves, 
 which can be performed in the WHERE clause, allowing for index range scans of 
 tables, and can better exploit parallelism in the database.




[jira] Commented: (MAPREDUCE-911) TestTaskFail fail sometimes

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747748#action_12747748
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-911:
---

-1 core tests. Due to MAPREDUCE-880
-1 release audit. Looks spurious. Patch does not add any new files.

 TestTaskFail fail sometimes
 ---

 Key: MAPREDUCE-911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: faillog.txt, patch-911-0.20.txt, patch-911.txt


 TestTaskFail fails sometimes with the following error:
 junit.framework.AssertionFailedError
   at 
 org.apache.hadoop.mapred.TestTaskFail.validateJob(TestTaskFail.java:136)
   at 
 org.apache.hadoop.mapred.TestTaskFail.testWithDFS(TestTaskFail.java:181)




[jira] Created: (MAPREDUCE-912) apache license header missing for some java files

2009-08-25 Thread Amareshwari Sriramadasu (JIRA)
apache license header missing for some java files
-

 Key: MAPREDUCE-912
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-912
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
 Fix For: 0.21.0


The following files do not have the Apache license header:
src/java/org/apache/hadoop/mapred/lib/db/DBWritable.java
src/java/org/apache/hadoop/mapreduce/Counters.java
src/test/mapred/org/apache/hadoop/mapred/lib/db/TestConstructQuery.java
src/examples/org/apache/hadoop/examples/WordCount.java




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-08-25 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747751#action_12747751
 ] 

Sharad Agarwal commented on MAPREDUCE-370:
--

+1

 Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
 ---

 Key: MAPREDUCE-370
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt







[jira] Updated: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler

2009-08-25 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-824:


Status: Open  (was: Patch Available)

 Support a hierarchy of queues in the capacity scheduler
 ---

 Key: MAPREDUCE-824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/capacity-sched
Reporter: Hemanth Yamijala
 Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, 
 HADOOP-824-3.patch, HADOOP-824-4.patch, HADOOP-824-5.patch, 
 MAPREDUCE-824-6.patch


 Currently in Capacity Scheduler, cluster capacity is divided among the queues 
 based on the queue capacity. These queues typically represent an organization 
 and the capacity of the queue represents the capacity the organization is 
 entitled to. Most organizations are large and need to divide their capacity 
 among sub-organizations they have. Or they may want to divide the capacity 
 based on a category or type of jobs they run. This JIRA covers the 
 requirements and other details to provide the above feature.




[jira] Commented: (MAPREDUCE-893) Provide an ability to refresh queue configuration without restart.

2009-08-25 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747790#action_12747790
 ] 

Hemanth Yamijala commented on MAPREDUCE-893:


As explained above, we have this feature for the properties that the framework 
handles w.r.t. queues - ACLs and state.

We can draw the scope of this JIRA from what exists already.

- The existing framework for refresh of ACLs and state relies on a command 
*hadoop mradmin -refreshQueues*. 
- This command causes QueueManager to reload the configuration from the 
mapred-queues.xml file.
- If there's a syntactic or semantic error in reload, the refresh command fails 
with an exception that is sent back to the hadoop mradmin command.
- Importantly, the existing configuration is untouched and the system is left 
in a consistent state
- The UGI of the administrator who raised the refresh is logged to the JT log 
for audit purposes.

I believe all of these generic requirements will apply for the current JIRA as 
well.

The extension of scope is in the following manner:

- First the framework processes a reload of the configuration for properties it 
manages.
- If it passes, the framework will call the scheduler to refresh queue 
properties that are managed by the scheduler. However, it will not commit its 
changes until this call succeeds.
- In MAPREDUCE-861, we suggested that schedulers which could define their 
properties as key-value pairs can do so in the format we suggest 
[here|https://issues.apache.org/jira/browse/MAPREDUCE-861?focusedCommentId=12744910&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12744910].
 (Look at the *properties* tag under queue). The framework can pass the list of 
properties per queue to the scheduler, maybe as a Map of queue-name and 
properties.
- If the scheduler cannot process this information, it will throw an error (via 
an exception or a return value) and discard the changes itself. The framework 
will likewise discard its own changes and return an error to the client with an 
appropriate message.
- It is possible that some of the properties may not be refresh-able. For example, 
we are not going to handle new queues getting added or deleted. I think we 
should give a return value back to the refreshQueues to indicate this.

Does this work?
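
The refresh sequence proposed above (framework reload first, then the scheduler, and a commit only if both succeed) can be sketched as follows. This is a rough illustration of the proposal, not existing Hadoop code: the class, the `SchedulerHook` interface, the per-queue map layout and the state check are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class QueueRefreshSketch {
    /** Hypothetical scheduler callback; it throws if it cannot apply the properties. */
    interface SchedulerHook {
        void refreshQueues(Map<String, Map<String, String>> propertiesByQueue);
    }

    // Configuration currently in effect: queue name -> (property name -> value).
    private Map<String, Map<String, String>> committed = new HashMap<>();

    String getProperty(String queue, String name) {
        Map<String, String> props = committed.get(queue);
        return (props == null) ? null : props.get(name);
    }

    /**
     * Applies a reloaded configuration. On any failure an exception propagates to the
     * caller and the committed configuration is left untouched.
     */
    void refresh(Map<String, Map<String, String>> reloaded, SchedulerHook scheduler) {
        // Step 1: the framework processes the properties it manages into a staged copy.
        Map<String, Map<String, String>> staged = new HashMap<>();
        for (Map.Entry<String, Map<String, String>> entry : reloaded.entrySet()) {
            String state = entry.getValue().get("state");
            // Illustrative validation only: accept a missing state, "running" or "stopped".
            if (state != null && !state.equals("running") && !state.equals("stopped")) {
                throw new IllegalArgumentException("Invalid state for queue " + entry.getKey());
            }
            staged.put(entry.getKey(), new HashMap<>(entry.getValue()));
        }
        // Step 2: the scheduler processes its own properties; an exception aborts the refresh.
        scheduler.refreshQueues(staged);
        // Step 3: commit only after both steps have succeeded.
        committed = staged;
    }
}
```

Because nothing is assigned to the committed map until the scheduler call returns, a failure on either side leaves the running configuration as it was, which is the consistency property asked for in the list above.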

 Provide an ability to refresh queue configuration without restart.
 --

 Key: MAPREDUCE-893
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-893
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Reporter: Hemanth Yamijala

 While administering a cluster using multiple queues, administrators feel a 
 need to refresh queue properties on the fly without needing to restart the 
 JobTracker. This is partially supported for some properties such as queue 
 ACLs (HADOOP-5396) and state (HADOOP-5913). The idea is to extend the 
 facility to refresh other queue properties as well, including scheduler 
 properties.
