[jira] Updated: (MAPREDUCE-1542) Deprecate mapred.permissions.supergroup in favor of hadoop.cluster.administrators

2010-03-12 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1542:


Attachment: mapreduce-1542-y20s.patch

The patch Ravi attached earlier also included changes for HDFS. After discussing 
with him, Vinod and Devaraj, we feel that the HDFS changes are not required 
at the moment and are outside the scope of this JIRA. I am attaching a new 
patch that removes those changes. I will be reviewing this. The patch is still 
not meant for commit here.

 Deprecate mapred.permissions.supergroup in favor of 
 hadoop.cluster.administrators
 -

 Key: MAPREDUCE-1542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Vinod K V
Assignee: Ravi Gummadi
 Fix For: 0.22.0

 Attachments: 1542.20S.patch, 1542.patch, 1542.v1.patch, 
 mapreduce-1542-y20s.patch


 HADOOP-6568 added the configuration {{hadoop.cluster.administrators}} through 
 which admins can configure who the superusers/supergroups for the cluster 
 are. MAPREDUCE itself already has {{mapred.permissions.supergroup}} (which is 
 just a single group). As agreed upon at HADOOP-6568, this should be 
 deprecated in favor of {{hadoop.cluster.administrators}}.
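
 As an illustration of what deprecating one key in favor of another could look 
 like, here is a minimal, self-contained sketch. It does not use Hadoop's 
 Configuration class; the class DeprecatedKeySketch and its methods are 
 hypothetical, and only the two property names come from this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: route a deprecated configuration key to its replacement. */
class DeprecatedKeySketch {
    static final String OLD_KEY = "mapred.permissions.supergroup";
    static final String NEW_KEY = "hadoop.cluster.administrators";

    // Maps each deprecated key to the key that replaces it.
    private final Map<String, String> deprecations = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    DeprecatedKeySketch() {
        deprecations.put(OLD_KEY, NEW_KEY);
    }

    /** Stores the value under the replacement key when the given key is deprecated. */
    void set(String key, String value) {
        values.put(deprecations.getOrDefault(key, key), value);
    }

    /** Reads through the deprecation table, so old and new names see the same value. */
    String get(String key) {
        return values.get(deprecations.getOrDefault(key, key));
    }

    public static void main(String[] args) {
        DeprecatedKeySketch conf = new DeprecatedKeySketch();
        conf.set(OLD_KEY, "hadoop-admins");
        System.out.println(conf.get(NEW_KEY));
    }
}
```

 With this shape, existing configs that still set the old key keep working 
 while all readers consult the new key.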

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-1595) LinuxTaskController is too strict on the initial ownership of files/dir.

2010-03-12 Thread Vinod K V (JIRA)
LinuxTaskController is too strict on the initial ownership of files/dir.


 Key: MAPREDUCE-1595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1595
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security, task-controller
Reporter: Vinod K V


The Linux task controller is currently too strict about the initial ownership 
of the files and directories that it tries to secure. For example, it expects 
mapred-local/tasktracker/user-dir to be both user-owned and group-owned by the 
TT. This leads to unrecoverable failures in some corner cases.

It could instead allow the files/dirs to be owned by either the TT *or* the 
jobOwner.




[jira] Commented: (MAPREDUCE-1595) LinuxTaskController is too strict on the initial ownership of files/dir.

2010-03-12 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844452#action_12844452
 ] 

Vinod K V commented on MAPREDUCE-1595:
--

Consider this failure case:
 - Two mapred-local-dirs /a/b/c/ML1 and /a/b/c/ML2.
 - TT first creates userdir Alice in both the dirs, i.e. 
/a/b/c/ML1/taskTracker/Alice and /a/b/c/ML2/taskTracker/Alice
 - TT then launches LinuxTaskController to secure the user-dirs (see 
MAPREDUCE-856).
 - LinuxTaskController then successfully changes ownership of 
/a/b/c/ML1/taskTracker/Alice to Alice:tt_group but then fails on changing 
ownership of /a/b/c/ML2/taskTracker/Alice due to a transitory disk problem.

The above results in the failure of the current task because of a failed 
INITIALIZE_USER operation.
Not just that, every other task of Alice that ever comes to this TT will try 
INITIALIZE_USER and fail, because LinuxTaskController sees Alice:tt_group on 
/a/b/c/ML1/taskTracker/Alice and fails, saying the file is not owned by TT.

Otherwise, this can only be rectified when the TT restarts/reinits and moves 
and deletes the old mapred-local-dirs.
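
The failure case above can be sketched in a few lines. This is only an 
illustration, not the task-controller's actual code: the class 
OwnershipCheckSketch, its methods and the user name "tt" are hypothetical, and 
the check is simplified to the owning user (group ownership is ignored).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the strict and the relaxed ownership rules. */
class OwnershipCheckSketch {
    /** Rule as described in the issue: the directory must be owned by the TT user. */
    static boolean strictCheck(String dirOwner, String ttUser) {
        return dirOwner.equals(ttUser);
    }

    /** Proposed rule: accept either the TT user or the job owner. */
    static boolean relaxedCheck(String dirOwner, String ttUser, String jobOwner) {
        return dirOwner.equals(ttUser) || dirOwner.equals(jobOwner);
    }

    public static void main(String[] args) {
        // State after the partial failure: ML1 was chowned to Alice, ML2 was not.
        Map<String, String> owners = new LinkedHashMap<>();
        owners.put("/a/b/c/ML1/taskTracker/Alice", "Alice");
        owners.put("/a/b/c/ML2/taskTracker/Alice", "tt");
        for (Map.Entry<String, String> e : owners.entrySet()) {
            System.out.println(e.getKey()
                + " strict=" + strictCheck(e.getValue(), "tt")
                + " relaxed=" + relaxedCheck(e.getValue(), "tt", "Alice"));
        }
    }
}
```

Under the relaxed rule a retried INITIALIZE_USER would accept the ML1 
directory that is already owned by Alice, instead of failing on every later 
task.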

 LinuxTaskController is too strict on the initial ownership of files/dir.
 

 Key: MAPREDUCE-1595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1595
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security, task-controller
Reporter: Vinod K V
 Fix For: 0.22.0


 Linux task controller is too strict now w.r.t the initial ownership of the 
 files/dir that it tries to make as secure as possible. Currently, it expects, 
 for e.g, the mapred-local/tasktracker/user-dir to be both user-owned and 
 group-owned by TT. This leads to unrecoverable failures in some corner cases.
 It can instead allow the files/dirs to be owned either by TT *or* by the 
 jobOwner.




[jira] Updated: (MAPREDUCE-1595) LinuxTaskController is too strict on the initial ownership of files/dir.

2010-03-12 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-1595:
-

Fix Version/s: 0.22.0

 LinuxTaskController is too strict on the initial ownership of files/dir.
 

 Key: MAPREDUCE-1595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1595
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security, task-controller
Reporter: Vinod K V
 Fix For: 0.22.0


 Linux task controller is too strict now w.r.t the initial ownership of the 
 files/dir that it tries to make as secure as possible. Currently, it expects, 
 for e.g, the mapred-local/tasktracker/user-dir to be both user-owned and 
 group-owned by TT. This leads to unrecoverable failures in some corner cases.
 It can instead allow the files/dirs to be owned either by TT *or* by the 
 jobOwner.




[jira] Commented: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844462#action_12844462
 ] 

Hadoop QA commented on MAPREDUCE-1593:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438552/MAPREDUCE-1593-20100311.patch
  against trunk revision 922047.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/522/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/522/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/522/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/522/console

This message is automatically generated.

 [Rumen] Improvements to random seed generation 
 ---

 Key: MAPREDUCE-1593
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0, 0.22.0
Reporter: Tamas Sarlos
Assignee: Tamas Sarlos
Priority: Trivial
 Fix For: 0.21.0, 0.22.0

 Attachments: MAPREDUCE-1593-20100311.patch


 RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
 reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
 the call thread safe as well. Neither of these is an issue with the current 
 client, the mumak simulator, but the changes are small and make the code more 
 useful in the future. Thanks to Chris Douglas for the suggestion.
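
 A minimal sketch of the idea described above: keep one MessageDigest per 
 thread in a ThreadLocal and reuse it across calls. This is not the attached 
 patch; the class SeedSketch, the method getSeed and the way the digest is 
 folded into a long are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: derive seeds from an MD5 digest that is reused per thread. */
class SeedSketch {
    // One MessageDigest per thread: avoids repeated getInstance() calls and
    // keeps the non-thread-safe digest object confined to a single thread.
    private static final ThreadLocal<MessageDigest> MD5 = ThreadLocal.withInitial(() -> {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    });

    /** Derives a deterministic seed from a stream name and a master seed. */
    static long getSeed(String streamId, long masterSeed) {
        MessageDigest md5 = MD5.get();
        md5.reset(); // clear any state left over from an earlier call
        byte[] digest = md5.digest(
            (streamId + '/' + masterSeed).getBytes(StandardCharsets.UTF_8));
        long seed = 0;
        for (int i = 0; i < 8; i++) { // fold the first 8 digest bytes into a long
            seed = (seed << 8) | (digest[i] & 0xFF);
        }
        return seed;
    }

    public static void main(String[] args) {
        System.out.println(getSeed("example-stream", 42L));
    }
}
```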




[jira] Commented: (MAPREDUCE-1429) New ant target to run all and only the linux task-controller related tests

2010-03-12 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844463#action_12844463
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1429:


Can ant test-task-controller be added to this target?

 New ant target to run all and only the linux task-controller related tests
 --

 Key: MAPREDUCE-1429
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1429
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build, task-controller, test
Reporter: Vinod K V
Assignee: Vinod K V
 Attachments: MAPREDUCE-1429-20100120.txt


 The LinuxTaskController tests cannot be run automatically by Hudson, so 
 we've missed several bugs in the past by not running some of these 
 tests explicitly ourselves. It's a real pain to run them manually one by one; 
 we should have an ant target to run them all in one go.




[jira] Commented: (MAPREDUCE-1589) Need streaming examples in mapred/src/examples/streaming

2010-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844522#action_12844522
 ] 

Hadoop QA commented on MAPREDUCE-1589:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438557/streaming_example_bigrams.patch
  against trunk revision 922047.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/523/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/523/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/523/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/523/console

This message is automatically generated.

 Need streaming examples in mapred/src/examples/streaming
 

 Key: MAPREDUCE-1589
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1589
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: examples
Reporter: Alok Singh
Priority: Minor
 Fix For: 0.20.1, 0.20.2, 0.20.3

 Attachments: streaming_example_bigrams.patch, 
 streaming_example_bigrams.patch


 Hi,
  The examples directory contains the examples for pipes, java mapred but not 
 for the streaming.
 We are planning to add the test cases for the streaming in the examples 
 repository
 Alok




[jira] Updated: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-03-12 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-1594:
-

Attachment: 1594-yhadoop-20-1xx.patch

Attaching the first-cut patch for 1594 for the yhadoop 20.1xx branch. 
This patch depends on the 1376 yhadoop 20.1xx patch. 

To make it work, first apply 
https://issues.apache.org/jira/secure/attachment/12438576/1376-2-yhadoop-security.patch
 and then apply this patch on top of it.

 Support for Sleep Jobs in gridmix
 -

 Key: MAPREDUCE-1594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/gridmix
Reporter: rahul k singh
 Attachments: 1594-yhadoop-20-1xx.patch


 Support for Sleep jobs in gridmix




[jira] Updated: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-03-12 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-1594:
-

Attachment: 1594-yhadoop-20-1xx-1.patch

The earlier patch was incomplete; attaching the correct patch.

 Support for Sleep Jobs in gridmix
 -

 Key: MAPREDUCE-1594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/gridmix
Reporter: rahul k singh
 Attachments: 1594-yhadoop-20-1xx-1.patch, 1594-yhadoop-20-1xx.patch


 Support for Sleep jobs in gridmix




[jira] Updated: (MAPREDUCE-889) binary communication formats added to Streaming by HADOOP-1722 should be documented

2010-03-12 Thread Klaas Bosteels (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaas Bosteels updated MAPREDUCE-889:
-

Attachment: MAPREDUCE-889.patch

Does the attached patch provide the documentation you had in mind, Amareshwari?

 binary communication formats added to Streaming by HADOOP-1722 should be 
 documented
 ---

 Key: MAPREDUCE-889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-889
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Reporter: Amareshwari Sriramadasu
Assignee: Klaas Bosteels
Priority: Blocker
 Fix For: 0.21.0

 Attachments: MAPREDUCE-889.patch


 binary communication formats added to Streaming by HADOOP-1722 should be 
 documented in forrest




[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844586#action_12844586
 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1579:
---

 One minor thing: Do you really want to create a new test source instead of 
 using the existing TestHarFileSystem.java? Wouldn't it be better to merge all 
 Har-related tests into the same .java file?

I actually began by adding the new tests to TestHarFileSystem.  However, there 
are subtle differences between the new tests and the existing ones, so I 
created a new file.

 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar
 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.
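
 The two requirements above can be sketched as follows. This is a hedged 
 illustration rather than the committed change: the class HarSpaceSketch, the 
 method checkPaths and the convention that a null replacement means "reject" 
 are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: detect spaces in source paths and optionally replace them. */
class HarSpaceSketch {
    static boolean containsSpace(String path) {
        return path.indexOf(' ') >= 0;
    }

    /**
     * Returns the paths with spaces replaced. When replacement is null, a path
     * containing a space is rejected instead of being rewritten.
     */
    static List<String> checkPaths(List<String> paths, String replacement) {
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            if (!containsSpace(path)) {
                result.add(path);
            } else if (replacement == null) {
                throw new IllegalArgumentException("Path contains a space: " + path);
            } else {
                result.add(path.replace(" ", replacement));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(checkPaths(Arrays.asList("/user/a/my file", "/user/a/plain"), "_"));
    }
}
```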




[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844590#action_12844590
 ] 

Hadoop QA commented on MAPREDUCE-1579:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438566/m1579_20100311.patch
  against trunk revision 922047.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/524/console

This message is automatically generated.

 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar
 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844599#action_12844599
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1579:


+1

Everything looks good to me.

 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar
 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Created: (MAPREDUCE-1596) MapReduce trunk snapshot is not being published to maven

2010-03-12 Thread Aaron Kimball (JIRA)
MapReduce trunk snapshot is not being published to maven


 Key: MAPREDUCE-1596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Aaron Kimball
Priority: Critical


The hadoop-core and hadoop-hdfs artifacts are pushed to maven on a regular 
basis (daily?), but hadoop-mapreduce has not been updated since 2/18/10. Is 
there something automatic in Hudson that is configured for core and hdfs 
but not for mapred?

Downstream projects that try to build against Hadoop's trunk (via Ivy or Maven) 
cannot compile because of the resulting API inconsistency.






[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
--

 Priority: Blocker  (was: Major)
Fix Version/s: 0.22.0
   0.21.0
   0.20.3

 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar
Priority: Blocker
 Fix For: 0.20.3, 0.21.0, 0.22.0

 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844631#action_12844631
 ] 

Hudson commented on MAPREDUCE-1579:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #275 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/275/])
. archive: check and possibly replace the space character in source paths.


 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mahadev konar
Priority: Blocker
 Fix For: 0.20.3, 0.21.0, 0.22.0

 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Updated: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1579:
--

   Resolution: Fixed
Fix Version/s: (was: 0.20.3)
   (was: 0.21.0)
 Assignee: Tsz Wo (Nicholas), SZE  (was: Mahadev konar)
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I have committed this.

 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
 Fix For: 0.22.0

 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Updated: (MAPREDUCE-1589) Need streaming examples in mapred/src/examples/streaming

2010-03-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1589:
-

  Description: 
The examples directory contains the examples for pipes, java mapred but not for 
the streaming.

We are planning to add the test cases for the streaming in the examples 
repository

  was:
Hi,

 The examples directory contains the examples for pipes, java mapred but not 
for the streaming.

We are planning to add the test cases for the streaming in the examples 
repository

Alok

Fix Version/s: (was: 0.20.3)
   (was: 0.20.2)
   (was: 0.20.1)
   0.22.0
 Assignee: Alok Singh
   Issue Type: Improvement  (was: New Feature)

 Need streaming examples in mapred/src/examples/streaming
 

 Key: MAPREDUCE-1589
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1589
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Reporter: Alok Singh
Assignee: Alok Singh
Priority: Minor
 Fix For: 0.22.0

 Attachments: streaming_example_bigrams.patch, 
 streaming_example_bigrams.patch


 The examples directory contains the examples for pipes, java mapred but not 
 for the streaming.
 We are planning to add the test cases for the streaming in the examples 
 repository




[jira] Commented: (MAPREDUCE-1579) archive: check and possibly replace the space character in paths

2010-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844664#action_12844664
 ] 

Hudson commented on MAPREDUCE-1579:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #276 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/276/])
Move  from 0.21 to trunk in CHANGES.txt.


 archive: check and possibly replace the space character in paths
 ---

 Key: MAPREDUCE-1579
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1579
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
 Fix For: 0.22.0

 Attachments: m1579_20100310.patch, m1579_20100310b.patch, 
 m1579_20100311.patch


 Since the space character is used as a separator in the index files, it won't 
 work if there are spaces in the path (see also HADOOP-6591).  The archive 
 tools should 
 # detect if there are spaces in the paths and 
 # provide an option to replace it with some other characters.




[jira] Created: (MAPREDUCE-1597) combinefileinputformat does not work with non-splittable files

2010-03-12 Thread Namit Jain (JIRA)
combinefileinputformat does not work with non-splittable files
--

 Key: MAPREDUCE-1597
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1597
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Namit Jain


CombineFileInputFormat.getSplits() does not take into account whether a file is 
splittable.
This can lead to a problem for compressed text files: for example, getSplits() 
may return more than one split depending on the size of the compressed file, 
and the RecordReader for each of those splits will then read the complete file.

I ran into this problem while using Hive on Hadoop 0.20.
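
One way to picture the expected behaviour is the sketch below, which 
produces a single split for a non-splittable file and size-bounded splits 
otherwise. It is not CombineFileInputFormat's code; the class SplitSketch, 
the method computeSplits and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split computation that respects splittability. */
class SplitSketch {
    /** Returns {offset, length} pairs covering a file of the given length. */
    static List<long[]> computeSplits(long fileLength, long maxSplitSize, boolean splittable) {
        List<long[]> splits = new ArrayList<>();
        if (!splittable || maxSplitSize <= 0) {
            // A non-splittable (e.g. compressed) file must be read as one unit.
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        for (long offset = 0; offset < fileLength; offset += maxSplitSize) {
            splits.add(new long[] {offset, Math.min(maxSplitSize, fileLength - offset)});
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(computeSplits(250, 100, true).size());  // splittable file
        System.out.println(computeSplits(250, 100, false).size()); // non-splittable file
    }
}
```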





[jira] Commented: (MAPREDUCE-1376) Support for varied user submission in Gridmix

2010-03-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844723#action_12844723
 ] 

Chris Douglas commented on MAPREDUCE-1376:
--

The attached patch, even with security disabled, did not work for me using 
{{RoundRobinUserResolver}} (user from trace replaced with {{traceuser}}, user 
from userlist replaced with {{targetuser}}):
{noformat}
10/03/12 20:20:52 WARN gridmix.JobSubmitter: Failed to submit GRIDMIX00106 as targetuser via traceuser
org.apache.hadoop.ipc.RemoteException: User: traceuser is not allowed to impersonate targetuser
    at org.apache.hadoop.ipc.Client.call(Client.java:873)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:222)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:360)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:443)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:437)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:422)
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:477)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:766)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:475)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:464)
    at org.apache.hadoop.mapred.gridmix.GridmixJob.call(GridmixJob.java:230)
    at org.apache.hadoop.mapred.gridmix.JobSubmitter$SubmitTask.run(JobSubmitter.java:119)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
{noformat}

Other feedback:
* The following should be removed:
** This method is not used:
{noformat}
+  private void throwException(Throwable exception) throws Throwable {
+    throw exception;
+  }
{noformat}
** GridmixJob contains a few lines with empty statements (see {{doAs}} blocks)
** {{GridmixTestUtils}} is not a useful abstraction. Its functionality should 
remain in/be added to {{TestGridmixSubmission}}
* Consider {{Collections.emptyList()}} instead of creating and returning new, 
empty collections (e.g. {{EchoUserResolver}})
* {{EchoUserResolver}} only needs to extend {{ShellBasedUnixGroupsMapping}} for 
the unit test, right? The group mapping isn't important for the type. A 
separate group mapping class in test would be appropriate (presumably one 
already exists)
* {{RoundRobinUserResolver::parseUserList}} should be protected so subclasses 
may override it
* Since {{UserResolver}} can remain an abstract class (no need to extend any 
groups mapping), {{parseUserList}} can remain there.
* The default policy should be {{REPLAY}}, not {{STRESS}}
* {{JobSubmitter}} should not re-resolve the resolved UGI before calling 
{{buildSplits}}.
* Fixing this alone will not resolve the failure above, but note that the job 
is not submitted in a {{doAs}} block.
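
To make the round-robin mapping discussed above concrete, here is a small 
standalone sketch. It is not the Gridmix {{RoundRobinUserResolver}}: the class 
RoundRobinResolverSketch and its resolve method are hypothetical, and the 
assumption that a trace user keeps the target user it was first given is mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: map trace users onto a fixed list of target users. */
class RoundRobinResolverSketch {
    private final List<String> targetUsers;
    private final Map<String, String> assigned = new HashMap<>();
    private int next = 0;

    RoundRobinResolverSketch(List<String> targetUsers) {
        if (targetUsers.isEmpty()) {
            throw new IllegalArgumentException("user list must not be empty");
        }
        this.targetUsers = new ArrayList<>(targetUsers);
    }

    /** Returns the target user for a trace user, assigning the next one on first sight. */
    synchronized String resolve(String traceUser) {
        String target = assigned.get(traceUser);
        if (target == null) {
            target = targetUsers.get(next);
            next = (next + 1) % targetUsers.size(); // wrap around the user list
            assigned.put(traceUser, target);
        }
        return target;
    }

    public static void main(String[] args) {
        RoundRobinResolverSketch resolver =
            new RoundRobinResolverSketch(Arrays.asList("user1", "user2"));
        for (String traceUser : Arrays.asList("alice", "bob", "carol", "alice")) {
            System.out.println(traceUser + " -> " + resolver.resolve(traceUser));
        }
    }
}
```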

 Support for varied user submission in Gridmix
 -

 Key: MAPREDUCE-1376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1376
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/gridmix
Reporter: Chris Douglas
Assignee: Chris Douglas
 Attachments: 1376-2-yhadoop-security.patch, 
 1376-yhadoop-security.patch, M1376-0.patch, M1376-1.patch, M1376-2.patch, 
 M1376-3.patch, M1376-4.patch


 Gridmix currently submits all synthetic jobs as the client user. It should be 
 possible to map users in the trace to a set of users appropriate for the 
 target cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-12 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1480:
-

Attachment: MAPREDUCE-1480.4.patch

Test case caught an off-by-one error in progress calculation. Fixed with patch 
#4.

 CombineFileRecordReader does not properly initialize child RecordReader
 ---

 Key: MAPREDUCE-1480
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
 MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch


 CombineFileRecordReader instantiates child RecordReader instances but never 
 calls their initialize() method to give them the proper TaskAttemptContext.
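
 A minimal sketch of the pattern described above, using hypothetical stand-in types rather than the real Hadoop classes: a wrapping reader that creates child readers should pass its own context on to each child before the child is used. This illustrates the idea only and is not the code in the attached patches.

```java
final class ChildInitSketch {

  // Hypothetical stand-in for a task context; not the Hadoop class.
  static final class Context {
    final String taskId;
    Context(String taskId) { this.taskId = taskId; }
  }

  // Hypothetical child reader that is unusable until initialize() has run.
  static final class ChildReader {
    private Context context;

    void initialize(Context context) { this.context = context; }

    String describe() {
      if (context == null) {
        throw new IllegalStateException("initialize() was never called");
      }
      return "reader for " + context.taskId;
    }
  }

  // Hypothetical wrapper that creates children and forwards its context.
  static final class CombiningReader {
    private Context context;

    void initialize(Context context) { this.context = context; }

    ChildReader nextChild() {
      ChildReader child = new ChildReader();
      child.initialize(context); // the step the issue reports as missing
      return child;
    }
  }

  public static void main(String[] args) {
    CombiningReader reader = new CombiningReader();
    reader.initialize(new Context("attempt_0"));
    System.out.println(reader.nextChild().describe());
  }
}
```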




[jira] Updated: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-12 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1480:
-

Status: Open  (was: Patch Available)

 CombineFileRecordReader does not properly initialize child RecordReader
 ---

 Key: MAPREDUCE-1480
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
 MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch


 CombineFileRecordReader instantiates child RecordReader instances but never 
 calls their initialize() method to give them the proper TaskAttemptContext.




[jira] Updated: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-12 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1480:
-

Status: Patch Available  (was: Open)

 CombineFileRecordReader does not properly initialize child RecordReader
 ---

 Key: MAPREDUCE-1480
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
 MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch


 CombineFileRecordReader instantiates child RecordReader instances but never 
 calls their initialize() method to give them the proper TaskAttemptContext.




[jira] Commented: (MAPREDUCE-1542) Deprecate mapred.permissions.supergroup in favor of hadoop.cluster.administrators

2010-03-12 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844797#action_12844797
 ] 

Ravi Gummadi commented on MAPREDUCE-1542:
-

Thanks Hemanth for the review. It is fine to remove HDFS related stuff from 
this patch.

# I think it makes sense to add checks for queue ACLs refresh, service refresh 
and user-to-group mapping refresh also similar to node refresh. I don't see a 
semantic difference between these operations and node refresh and they should 
be implemented in the same way, I think. But I am OK if these are taken up in a 
following JIRA, as their addition may not be in the scope of this JIRA.

Right. Let us do it in a follow-up JIRA.

# Regarding tests, I was expecting to see tests that set up users and groups in 
the hadoop.cluster.administrators ACL and then checks that various operations 
of view and modify succeed with combinations of both allowed and disallowed 
users. Exhaustive tests need not be end-to-end I think - i.e. they need not run 
real jobs. Since most of the code goes through JobInProgress.checkAccess, can 
we just create fake objects and test the checkAccess method ? And maybe have 
some end-to-end tests for very important functions like killJob, killTask and 
view job details in a JSP.

The TestWebUIAuthorization changes in the patch already check
  (a) view-job as admin through almost all JSPs,
  (b) modify-job as admin through almost all JSPs, including killJob, 
killTask and setJobPriority.
TestNodeRefresh checks refreshNodes() as admin.
I guess these cover the tests we want?


Will incorporate other comments given above and upload new patch.
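
For the kind of fake-object test discussed above, a rough sketch follows. It assumes an ACL string of comma-separated users, then a space, then comma-separated groups. {{AdminAclSketch}} and {{isAllowed}} are hypothetical names for illustration; this is not {{JobInProgress.checkAccess}} or any class in the patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified administrators ACL used only for illustration.
final class AdminAclSketch {
  private final Set<String> users = new HashSet<String>();
  private final Set<String> groups = new HashSet<String>();

  // Assumed format: "user1,user2 group1,group2" (users, a space, groups).
  AdminAclSketch(String spec) {
    String[] parts = spec.split(" ", 2);
    addAll(users, parts[0]);
    if (parts.length > 1) {
      addAll(groups, parts[1]);
    }
  }

  private static void addAll(Set<String> target, String csv) {
    for (String name : csv.split(",")) {
      String trimmed = name.trim();
      if (!trimmed.isEmpty()) {
        target.add(trimmed);
      }
    }
  }

  // Allowed if the user is listed or belongs to any listed group.
  boolean isAllowed(String user, Set<String> userGroups) {
    if (users.contains(user)) {
      return true;
    }
    for (String group : userGroups) {
      if (groups.contains(group)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    AdminAclSketch acl = new AdminAclSketch("alice,bob admins");
    Set<String> none = Collections.<String>emptySet();
    Set<String> admins = new HashSet<String>(Arrays.asList("admins"));
    Set<String> plain = new HashSet<String>(Arrays.asList("users"));
    System.out.println("alice: " + acl.isAllowed("alice", none));
    System.out.println("carol (admins): " + acl.isAllowed("carol", admins));
    System.out.println("dave (users): " + acl.isAllowed("dave", plain));
  }
}
```

A test along these lines could cover combinations of allowed and disallowed users without running real jobs.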

 Deprecate mapred.permissions.supergroup in favor of 
 hadoop.cluster.administrators
 -

 Key: MAPREDUCE-1542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Vinod K V
Assignee: Ravi Gummadi
 Fix For: 0.22.0

 Attachments: 1542.20S.patch, 1542.patch, 1542.v1.patch, 
 mapreduce-1542-y20s.patch


 HADOOP-6568 added the configuration {{hadoop.cluster.administrators}} through 
 which admins can configure who the superusers/supergroups for the cluster 
 are. MAPREDUCE itself already has {{mapred.permissions.supergroup}} (which is 
 just a single group). As agreed upon at HADOOP-6568, this should be 
 deprecated in favor of {{hadoop.cluster.administrators}}.




[jira] Assigned: (MAPREDUCE-1596) MapReduce trunk snapshot is not being published to maven

2010-03-12 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan reassigned MAPREDUCE-1596:
-

Assignee: Giridharan Kesavan

 MapReduce trunk snapshot is not being published to maven
 

 Key: MAPREDUCE-1596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Aaron Kimball
Assignee: Giridharan Kesavan
Priority: Critical

 The hadoop-core and hadoop-hdfs artifacts are pushed to maven on a regular 
 basis (daily?), but hadoop-mapreduce has not been updated since 2/18/10. Is 
 there something automatic in Hudson that is configured for these core and 
 hdfs, but not mapred?
 Downstream projects that try to build against Hadoop's trunk (via Ivy or 
 Maven) cannot compile due to API inconsistency here.




[jira] Resolved: (MAPREDUCE-1596) MapReduce trunk snapshot is not being published to maven

2010-03-12 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan resolved MAPREDUCE-1596.
---

   Resolution: Fixed
Fix Version/s: 0.22.0

Fixed the publishing configuration in Hudson and triggered the build; MR 
snapshots are now published to the snapshots repository.
hadoop-mapred-0.22.0-20100313.072356-32.jar 

 MapReduce trunk snapshot is not being published to maven
 

 Key: MAPREDUCE-1596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Reporter: Aaron Kimball
Assignee: Giridharan Kesavan
Priority: Critical
 Fix For: 0.22.0


 The hadoop-core and hadoop-hdfs artifacts are pushed to maven on a regular 
 basis (daily?), but hadoop-mapreduce has not been updated since 2/18/10. Is 
 there something automatic in Hudson that is configured for these core and 
 hdfs, but not mapred?
 Downstream projects that try to build against Hadoop's trunk (via Ivy or 
 Maven) cannot compile due to API inconsistency here.
