[jira] [Updated] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP

2011-04-05 Thread Boris Shkolnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Shkolnik updated MAPREDUCE-2420:
--

Attachment: MR-2420.patch

 JobTracker should be able to renew delegation token over HTTP
 -

 Key: MAPREDUCE-2420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.4
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: MR-2420.patch


 If the JobTracker has to talk to a NameNode running a different version (RPC 
 version mismatch), the JobTracker should be able to fall back to HTTP renewal.
 Example: running distcp between different versions using hftp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2317) HadoopArchives throwing NullPointerException while creating hadoop archives (.har files)

2011-04-05 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015996#comment-13015996
 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-2317:
---

Hi Devaraj, if the problem is not in the latest trunk, is it already fixed by 
some other JIRAs?  Do you know which JIRAs have fixed it?

 HadoopArchives throwing NullPointerException while creating hadoop archives 
 (.har files)
 

 Key: MAPREDUCE-2317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: harchive
Affects Versions: 0.20.1, 0.23.0
 Environment: windows
Reporter: Devaraj K
Priority: Minor
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2317.patch


 When we run the hadoop archive tool on Windows as shown below, it throws the 
 following exception.
 java org.apache.hadoop.tools.HadoopArchives -archiveName temp.har D:/test/in 
 E:/temp
 {code:xml} 
 java.lang.NullPointerException
   at org.apache.hadoop.tools.HadoopArchives.writeTopLevelDirs(HadoopArchives.java:320)
   at org.apache.hadoop.tools.HadoopArchives.archive(HadoopArchives.java:386)
   at org.apache.hadoop.tools.HadoopArchives.run(HadoopArchives.java:725)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
   at org.apache.hadoop.tools.HadoopArchives.main(HadoopArchives.java:739)
 {code} 
 I see the code flow to handle this feature in windows also, 
 {code:title=Path.java|borderStyle=solid}
   /** Returns the parent of a path or null if at root. */
   public Path getParent() {
     String path = uri.getPath();
     int lastSlash = path.lastIndexOf('/');
     int start = hasWindowsDrive(path, true) ? 3 : 0;
     if ((path.length() == start) ||                          // empty path
         (lastSlash == start && path.length() == start+1)) {  // at root
       return null;
     }
     String parent;
     if (lastSlash==-1) {
       parent = CUR_DIR;
     } else {
       int end = hasWindowsDrive(path, true) ? 3 : 0;
       parent = path.substring(0, lastSlash==end?end+1:lastSlash);
     }
     return new Path(uri.getScheme(), uri.getAuthority(), parent);
   }
 {code} 
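
 The stack trace is consistent with a caller dereferencing the null that 
 getParent() returns at the root, although the report does not say which 
 variable was null. As a rough illustration only, the sketch below walks up a 
 path and stops on null. It uses java.nio.file.Path, which has the same 
 "null when there is no parent" contract, rather than Hadoop's Path, and the 
 class and method names are made up for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not taken from HadoopArchives or the attached patch. */
class ParentWalkSketch {
  /** Collects every ancestor of p, nearest first, stopping when getParent() returns null. */
  static List<String> ancestors(Path p) {
    List<String> out = new ArrayList<>();
    // The loop condition is the null guard: without it, calling a method
    // on the result of getParent() at the top of the path would throw an NPE.
    for (Path cur = p.getParent(); cur != null; cur = cur.getParent()) {
      out.add(cur.toString());
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(ancestors(Paths.get("a", "b", "c")));
  }
}
```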



[jira] [Commented] (MAPREDUCE-2257) distcp can copy blocks in parallel

2011-04-05 Thread Rosie Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016023#comment-13016023
 ] 

Rosie Li commented on MAPREDUCE-2257:
-

The failure of the contrib test is not related to the new distcp.

 distcp can copy blocks in parallel
 --

 Key: MAPREDUCE-2257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2257
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: 0.21.0
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: MAPREDUCE-2257.patch


 The minimum unit of work for a distcp task is a file. We have files that are 
 larger than 1 TB with a block size of 1 GB. If we use distcp to copy these 
 files, the tasks either take a very long time or eventually fail. A better 
 approach for distcp would be to copy all the source blocks in parallel, and 
 then stitch the blocks back into files at the destination via the HDFS Concat 
 API (HDFS-222).
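
 The idea of copying block-sized ranges in parallel and then stitching them 
 together in order can be sketched in memory as follows. This is only an 
 illustration under simplifying assumptions: a byte array stands in for the 
 source file, an in-order append stands in for the HDFS concat call, and the 
 class and method names are hypothetical, not part of distcp.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical in-memory sketch; it does not use any HDFS or distcp API. */
class ParallelBlockCopySketch {
  static byte[] copy(byte[] source, int blockSize, int threads) throws Exception {
    if (blockSize <= 0 || threads <= 0) {
      throw new IllegalArgumentException("blockSize and threads must be positive");
    }
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // One task per block-sized range, so the unit of work is a block, not a file.
      List<Future<byte[]>> parts = new ArrayList<>();
      for (int off = 0; off < source.length; off += blockSize) {
        final int start = off;
        final int end = Math.min(source.length, off + blockSize);
        Callable<byte[]> task = () -> Arrays.copyOfRange(source, start, end);
        parts.add(pool.submit(task));
      }
      // Stitch the copied blocks back together in their original order.
      ByteArrayOutputStream dest = new ByteArrayOutputStream(source.length);
      for (Future<byte[]> part : parts) {
        byte[] block = part.get();
        dest.write(block, 0, block.length);
      }
      return dest.toByteArray();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "0123456789".getBytes("UTF-8");
    System.out.println(Arrays.equals(data, copy(data, 3, 4)));
  }
}
```

 The copies may finish in any order; correctness depends only on the stitch 
 step reading the results back in block order.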



[jira] [Commented] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP

2011-04-05 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016060#comment-13016060
 ] 

Devaraj Das commented on MAPREDUCE-2420:


This patch handles the case of HDFS token renewals, when the JobTracker's HDFS 
is running a different version of HDFS than the one the job is trying to use. 
An example of such a job is distcp (where it uses hftp to talk to a different 
source cluster to pull data to the cluster where distcp is running).

When the job is submitted, the client requests a delegation token over hftp and 
stuffs it in the job. Today, the NameNode doesn't distinguish between hftp and 
hdfs accesses, and issues HDFS tokens for both (and the token-kind field in the 
token has the value as 'HDFS'). Ideally, that should be fixed to have the 
token-kind as HFTP for hftp accesses. We should have the JobTracker handle all 
sorts of token renewals, and have a way in which it can look at a token and 
decide which protocol to use to talk to the server in question. This includes 
HDFS, HFTP, and also HIVE (where the protocol is thrift). 

I think this patch is okay for the short term: the JobTracker falls back to 
hftp if it couldn't renew a token over hdfs. The patch contains a number of 
whitespace changes that aren't required. The string comparisons on exception 
messages, followed by instantiating a concrete exception, could probably be 
replaced with a Class.forName() on the exception class name.
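
One possible reading of the forName suggestion is sketched below: rebuild the 
exception reflectively from a class name and message instead of comparing 
strings one by one. The class and method names are hypothetical and this is 
not the code in the patch.

```java
import java.io.IOException;

/** Hypothetical sketch; not the code from MR-2420.patch. */
class RemoteExceptionSketch {
  /**
   * Rebuilds an IOException from a class name and message received as text.
   * Falls back to a plain IOException if the class cannot be loaded, is not
   * an IOException, or has no public (String) constructor.
   */
  static IOException rebuild(String className, String message) {
    try {
      Class<?> cls = Class.forName(className);
      if (IOException.class.isAssignableFrom(cls)) {
        return (IOException) cls.getConstructor(String.class).newInstance(message);
      }
    } catch (ReflectiveOperationException | LinkageError e) {
      // Unknown or unusable class: use the generic fallback below.
    }
    return new IOException(className + ": " + message);
  }

  public static void main(String[] args) {
    System.out.println(rebuild("java.io.FileNotFoundException", "missing").getClass().getName());
  }
}
```

Restricting the result to IOException subclasses keeps a remote peer from 
causing arbitrary classes to be instantiated.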

When we fix this issue in trunk, please make it more generic, along the lines 
described above.
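
To make the two approaches in this comment concrete, here is a rough sketch of 
both: the short-term fallback from one protocol to another, and a more generic 
lookup keyed on the token-kind field. All types and names here are 
hypothetical stand-ins, not Hadoop classes, and the renewers are plain 
lambdas rather than real RPC or HTTP clients.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in Hadoop. */
class TokenRenewalSketch {
  /** Renews a token and returns its new expiry time in milliseconds. */
  interface Renewer {
    long renew(String token) throws IOException;
  }

  private final Map<String, Renewer> renewersByKind = new HashMap<>();

  void register(String kind, Renewer renewer) {
    renewersByKind.put(kind, renewer);
  }

  /** Generic approach: choose the protocol from the token-kind field. */
  long renewByKind(String kind, String token) throws IOException {
    Renewer renewer = renewersByKind.get(kind);
    if (renewer == null) {
      throw new IOException("No renewer registered for token kind " + kind);
    }
    return renewer.renew(token);
  }

  /** Short-term approach: try the primary protocol, then fall back. */
  static long renewWithFallback(Renewer primary, Renewer fallback, String token)
      throws IOException {
    try {
      return primary.renew(token);
    } catch (IOException primaryFailure) {
      try {
        return fallback.renew(token);
      } catch (IOException fallbackFailure) {
        // Keep the first failure attached so neither cause is lost.
        fallbackFailure.addSuppressed(primaryFailure);
        throw fallbackFailure;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Renewer rpc = token -> { throw new IOException("RPC version mismatch"); };
    Renewer http = token -> 1000L;
    System.out.println(renewWithFallback(rpc, http, "example-token"));
  }
}
```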

 JobTracker should be able to renew delegation token over HTTP
 -

 Key: MAPREDUCE-2420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.4
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: MR-2420.patch


 If the JobTracker has to talk to a NameNode running a different version (RPC 
 version mismatch), the JobTracker should be able to fall back to HTTP renewal.
 Example: running distcp between different versions using hftp.



[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2011-04-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016061#comment-13016061
 ] 

Tom White commented on MAPREDUCE-2374:
--

+1

 Should not use PrintWriter to write taskjvm.sh
 --

 Key: MAPREDUCE-2374
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: mapreduce-2374-on-20sec.txt


 Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
 class swallows all IO exceptions. We're not currently checking for errors, 
 which I'm seeing result in occasional task failures with the message "Text 
 file busy", presumably because the close() call is failing silently for some 
 reason.
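
 The behaviour described above can be reproduced with a small standalone 
 sketch. The FailingWriter below is a made-up stand-in for a stream whose 
 write and close calls fail; it is not code from TaskController. With 
 PrintWriter the failure only shows up through checkError(), while a plain 
 Writer surfaces it as an IOException.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

/** Hypothetical sketch; FailingWriter simulates an underlying I/O failure. */
class PrintWriterErrorSketch {
  static final class FailingWriter extends Writer {
    @Override public void write(char[] buf, int off, int len) throws IOException {
      throw new IOException("simulated write failure");
    }
    @Override public void flush() throws IOException {
      throw new IOException("simulated flush failure");
    }
    @Override public void close() throws IOException {
      throw new IOException("simulated close failure");
    }
  }

  /** PrintWriter never throws; the failure is visible only via checkError(). */
  static boolean printWriterHidesFailure() {
    PrintWriter pw = new PrintWriter(new FailingWriter());
    pw.println("echo placeholder command");  // no exception reaches the caller
    pw.close();                              // no exception here either
    return pw.checkError();
  }

  /** A plain Writer propagates the same failure as an IOException. */
  static boolean plainWriterThrows() {
    try (Writer w = new FailingWriter()) {
      w.write("echo placeholder command");
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(printWriterHidesFailure());
    System.out.println(plainWriterThrows());
  }
}
```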



[jira] [Commented] (MAPREDUCE-2395) TestBlockFixer timing out on trunk

2011-04-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016069#comment-13016069
 ] 

Tom White commented on MAPREDUCE-2395:
--

Is it worth committing this as it stands and fixing the intermittent failure in 
a follow up JIRA?

 TestBlockFixer timing out on trunk
 --

 Key: MAPREDUCE-2395
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2395
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Ramkumar Vadali
Priority: Critical
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2395.patch


 In recent Hudson builds, TestBlockFixer has been timing out. Not clear how 
 long it has been broken since MAPREDUCE-2394 was hiding the RAID tests from 
 Hudson's test result parsing.



[jira] [Updated] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP

2011-04-05 Thread Boris Shkolnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Shkolnik updated MAPREDUCE-2420:
--

Attachment: MR-2420-1.patch

Implemented some of Devaraj's comments.

Also set the stack trace to empty for artificially created exceptions (on the 
client side).

Agree with Devaraj that we need a more generic solution for all types of token.

Please open a Jira on this and put these requirements in.

 JobTracker should be able to renew delegation token over HTTP
 -

 Key: MAPREDUCE-2420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.4
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: MR-2420-1.patch, MR-2420.patch


 If the JobTracker has to talk to a NameNode running a different version (RPC 
 version mismatch), the JobTracker should be able to fall back to HTTP renewal.
 Example: running distcp between different versions using hftp.



[jira] [Commented] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP

2011-04-05 Thread Boris Shkolnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016076#comment-13016076
 ] 

Boris Shkolnik commented on MAPREDUCE-2420:
---

committed to branch-20-security.

 JobTracker should be able to renew delegation token over HTTP
 -

 Key: MAPREDUCE-2420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.4
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: MR-2420-1.patch, MR-2420.patch


 If the JobTracker has to talk to a NameNode running a different version (RPC 
 version mismatch), the JobTracker should be able to fall back to HTTP renewal.
 Example: running distcp between different versions using hftp.



[jira] [Updated] (MAPREDUCE-2307) Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.

2011-04-05 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-2307:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I've committed this. Thanks Devaraj!

 Exception thrown in Jobtracker logs, when the Scheduler configured is 
 FairScheduler.
 

 Key: MAPREDUCE-2307
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2307
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 0.23.0
Reporter: Devaraj K
Priority: Minor
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2307.patch


 If we try to start the JobTracker with the fair scheduler using the default 
 configuration, it throws the exception below.
 {code:xml} 
 2010-07-03 10:18:27,142 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
 2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
 2010-07-03 10:18:28,037 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux172.site
 2010-07-03 10:18:28,090 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux177.site
 2010-07-03 10:18:40,074 ERROR org.apache.hadoop.mapred.PoolManager: Failed to reload allocations file - will use existing allocations.
 java.lang.NullPointerException
   at java.io.File.<init>(File.java:222)
   at org.apache.hadoop.mapred.PoolManager.reloadAllocsIfNecessary(PoolManager.java:127)
   at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:234)
   at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2785)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:513)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:984)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:980)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:978)
 {code} 
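
 The top frame of the trace is the java.io.File constructor, which throws a 
 NullPointerException when given a null pathname. That would fit an 
 allocation file path that is unset in the default configuration, though the 
 report does not state this. A minimal sketch of guarding such a path is shown 
 below; the class and method names are hypothetical and this is not 
 necessarily what the attached patch does.

```java
import java.io.File;

/** Hypothetical sketch; not the PoolManager code or the committed fix. */
class AllocFileGuardSketch {
  /**
   * Returns the allocation file's last-modified time, or -1 if no file is
   * configured, so the caller can skip the reload instead of failing.
   */
  static long lastModifiedOrSkip(String allocFilePath) {
    if (allocFilePath == null) {
      // new File((String) null) would throw NullPointerException here.
      return -1L;
    }
    return new File(allocFilePath).lastModified();
  }

  public static void main(String[] args) {
    System.out.println(lastModifiedOrSkip(null));
  }
}
```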



[jira] [Commented] (MAPREDUCE-2307) Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.

2011-04-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016155#comment-13016155
 ] 

Hudson commented on MAPREDUCE-2307:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #637 (See 
[https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/637/])
MAPREDUCE-2307. Exception thrown in Jobtracker logs, when the Scheduler
configured is FairScheduler. Contributed by Devaraj K.


 Exception thrown in Jobtracker logs, when the Scheduler configured is 
 FairScheduler.
 

 Key: MAPREDUCE-2307
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2307
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 0.23.0
Reporter: Devaraj K
Priority: Minor
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2307.patch


