[jira] [Updated] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP
[ https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Shkolnik updated MAPREDUCE-2420:

    Attachment: MR-2420.patch

JobTracker should be able to renew delegation token over HTTP

    Key: MAPREDUCE-2420
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2420
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Affects Versions: 0.20.4
    Reporter: Boris Shkolnik
    Assignee: Boris Shkolnik
    Attachments: MR-2420.patch

In case the JobTracker has to talk to a NameNode running a different version (RPC version mismatch), the JobTracker should be able to fall back to HTTP renewal. Example: running distcp between different versions using hftp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2317) HadoopArchives throwing NullPointerException while creating hadoop archives (.har files)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015996#comment-13015996 ]

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-2317:

Hi Devaraj, if the problem does not occur in the latest trunk, has it already been fixed by some other JIRAs? Do you know which JIRAs fixed it?

HadoopArchives throwing NullPointerException while creating hadoop archives (.har files)

    Key: MAPREDUCE-2317
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2317
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: harchive
    Affects Versions: 0.20.1, 0.23.0
    Environment: windows
    Reporter: Devaraj K
    Priority: Minor
    Fix For: 0.23.0
    Attachments: MAPREDUCE-2317.patch

When we run the hadoop archive tool on Windows as shown below, it throws the following exception.

java org.apache.hadoop.tools.HadoopArchives -archiveName temp.har D:/test/in E:/temp

{code:xml}
java.lang.NullPointerException
    at org.apache.hadoop.tools.HadoopArchives.writeTopLevelDirs(HadoopArchives.java:320)
    at org.apache.hadoop.tools.HadoopArchives.archive(HadoopArchives.java:386)
    at org.apache.hadoop.tools.HadoopArchives.run(HadoopArchives.java:725)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.HadoopArchives.main(HadoopArchives.java:739)
{code}

I see that the code flow is meant to handle this feature on Windows as well:

{code:title=Path.java|borderStyle=solid}
  /** Returns the parent of a path or null if at root. */
  public Path getParent() {
    String path = uri.getPath();
    int lastSlash = path.lastIndexOf('/');
    int start = hasWindowsDrive(path, true) ? 3 : 0;
    if ((path.length() == start) ||               // empty path
        (lastSlash == start && path.length() == start+1)) { // at root
      return null;
    }
    String parent;
    if (lastSlash==-1) {
      parent = CUR_DIR;
    } else {
      int end = hasWindowsDrive(path, true) ? 3 : 0;
      parent = path.substring(0, lastSlash==end?end+1:lastSlash);
    }
    return new Path(uri.getScheme(), uri.getAuthority(), parent);
  }
{code}
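The quoted getParent() returns null once a path is already a root, so any caller that walks upward has to check for null. The thread does not confirm that this is the cause of the NullPointerException; the sketch below only illustrates the behaviour on plain strings. `ParentWalk`, `parentOf` and `depth` are hypothetical names, and the drive check is a simplified assumption that ignores the host OS.

```java
/** Standalone sketch of the getParent() logic quoted above; not the Hadoop Path class. */
class ParentWalk {
    // Assumption: simplified drive check that ignores the host OS so the sketch runs anywhere.
    static boolean hasWindowsDrive(String path) {
        return path.length() >= 3
            && path.charAt(0) == '/'
            && Character.isLetter(path.charAt(1))
            && path.charAt(2) == ':';
    }

    /** Returns the parent path string, or null when the path is empty or already a root. */
    static String parentOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        int start = hasWindowsDrive(path) ? 3 : 0;
        if (path.length() == start
            || (lastSlash == start && path.length() == start + 1)) {
            return null; // empty path or root
        }
        if (lastSlash == -1) {
            return "."; // relative name without a slash: parent is the current directory
        }
        return path.substring(0, lastSlash == start ? start + 1 : lastSlash);
    }

    /** Counts ancestors, stopping at the null that marks the root. */
    static int depth(String path) {
        int depth = 0;
        for (String p = parentOf(path); p != null; p = parentOf(p)) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(parentOf("/D:/test/in")); // /D:/test
        System.out.println(parentOf("/D:/"));        // null
        System.out.println(depth("/D:/test/in"));    // 2
    }
}
```

A loop written as `while (true) { p = parentOf(p); p.length(); }` would fail at the root, which is why the walk above tests for null before using the result.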
[jira] [Commented] (MAPREDUCE-2257) distcp can copy blocks in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016023#comment-13016023 ]

Rosie Li commented on MAPREDUCE-2257:

The failure of the contrib test is not related to the new distcp.

distcp can copy blocks in parallel

    Key: MAPREDUCE-2257
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2257
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: distcp
    Affects Versions: 0.21.0
    Reporter: dhruba borthakur
    Assignee: dhruba borthakur
    Attachments: MAPREDUCE-2257.patch

The minimum unit of work for a distcp task is a file. We have files that are greater than 1 TB with a block size of 1 GB. If we use distcp to copy these files, the tasks either take a very long time or eventually fail. A better way for distcp would be to copy all the source blocks in parallel, and then stitch the blocks back into files at the destination via the HDFS Concat API (HDFS-222).
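The idea of copying blocks in parallel and then stitching them back together in order can be sketched on an in-memory byte array. This is only an illustration of the technique under simplifying assumptions: it does not use distcp or the HDFS Concat API, and `ParallelBlockCopy` and its parameters are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** In-memory sketch: copy fixed-size blocks concurrently, then concatenate them in order. */
class ParallelBlockCopy {
    static byte[] copy(byte[] source, int blockSize, int threads)
            throws InterruptedException, ExecutionException {
        if (blockSize <= 0 || threads <= 0) {
            throw new IllegalArgumentException("blockSize and threads must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per block; each task copies only its own byte range.
            List<Future<byte[]>> parts = new ArrayList<>();
            for (int offset = 0; offset < source.length; offset += blockSize) {
                final int from = offset;
                final int to = Math.min(offset + blockSize, source.length);
                Callable<byte[]> task = () -> Arrays.copyOfRange(source, from, to);
                parts.add(pool.submit(task));
            }
            // "Concat" step: append the copied blocks in their original order.
            ByteArrayOutputStream out = new ByteArrayOutputStream(source.length);
            for (Future<byte[]> part : parts) {
                byte[] block = part.get();
                out.write(block, 0, block.length);
            }
            return out.toByteArray();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(Arrays.equals(data, copy(data, 4, 3))); // true
    }
}
```

Because the futures are collected in submission order, blocks may finish in any order while the result still preserves the original byte order.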
[jira] [Commented] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP
[ https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016060#comment-13016060 ]

Devaraj Das commented on MAPREDUCE-2420:

This patch handles the case of HDFS token renewals when the JobTracker's HDFS is running a different version of HDFS than the one the job is trying to use. An example of such a job is distcp (where it uses hftp to talk to a different source cluster to pull data to the cluster where distcp is running). When the job is submitted, the client requests a delegation token over hftp and stuffs it in the job.

Today, the NameNode doesn't distinguish between hftp and hdfs accesses, and issues HDFS tokens for both (and the token-kind field in the token has the value 'HDFS'). Ideally, that should be fixed to have the token-kind as HFTP for hftp accesses. We should have the JobTracker handle all sorts of token renewals, and have a way in which it can look at a token and decide which protocol to use to talk to the server in question. This includes HDFS, HFTP, and also HIVE (where the protocol is thrift).

I think this patch is okay for the short term: the JobTracker falls back to hftp if it couldn't renew a token over hdfs.

In the patch, there are a bunch of whitespace changes that aren't required. The string comparisons on exception messages, followed by instantiating a concrete exception, could probably be replaced with a Class.forName() on the exception class name string.

When we fix this issue in trunk, please make it more generic along the lines described above.
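The short-term behaviour discussed in this thread (try the renewal over RPC first, then fall back to HTTP if that fails) might look roughly like the sketch below. This is not the code in MR-2420.patch: `TokenRenewer` and `FallbackRenewer` are hypothetical names, and the token is reduced to a plain string for illustration.

```java
import java.io.IOException;

/** Hypothetical renewer abstraction; not a Hadoop interface. */
interface TokenRenewer {
    /** Renews the token and returns its new expiry time. */
    long renew(String token) throws IOException;
}

/** Tries the primary renewer (e.g. RPC) first and falls back to a secondary one (e.g. HTTP). */
class FallbackRenewer implements TokenRenewer {
    private final TokenRenewer primary;
    private final TokenRenewer fallback;

    FallbackRenewer(TokenRenewer primary, TokenRenewer fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    @Override
    public long renew(String token) throws IOException {
        try {
            return primary.renew(token);
        } catch (IOException primaryFailure) {
            try {
                return fallback.renew(token);
            } catch (IOException fallbackFailure) {
                // Keep the first failure attached so neither error is lost.
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }
}

class FallbackRenewerDemo {
    public static void main(String[] args) throws IOException {
        TokenRenewer rpc = token -> { throw new IOException("RPC version mismatch"); };
        TokenRenewer http = token -> 1234L;
        System.out.println(new FallbackRenewer(rpc, http).renew("token")); // 1234
    }
}
```

A more generic design, as suggested in the comment, would choose the renewer from the token-kind field instead of relying on a failure of the first attempt.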
[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016061#comment-13016061 ]

Tom White commented on MAPREDUCE-2374:

+1

Should not use PrintWriter to write taskjvm.sh

    Key: MAPREDUCE-2374
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Affects Versions: 0.22.0
    Reporter: Todd Lipcon
    Assignee: Todd Lipcon
    Fix For: 0.22.0
    Attachments: mapreduce-2374-on-20sec.txt

Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message "Text file busy", presumably because the close() call is failing silently for some reason.
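The difference between PrintWriter and a plain Writer can be shown with a writer that always fails. The sketch below is not the TaskController code; `PrintWriterErrors` and `FailingWriter` are hypothetical names standing in for a file that cannot be written or closed.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

class PrintWriterErrors {
    /** A Writer whose every operation fails, standing in for a broken file. */
    static final class FailingWriter extends Writer {
        @Override public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }
        @Override public void flush() throws IOException {
            throw new IOException("simulated flush failure");
        }
        @Override public void close() throws IOException {
            throw new IOException("simulated close failure");
        }
    }

    /** PrintWriter never throws; the failure is visible only through checkError(). */
    static boolean printWriterReportsError() {
        PrintWriter pw = new PrintWriter(new FailingWriter());
        pw.println("echo hello"); // no exception reaches the caller
        pw.close();               // no exception reaches the caller either
        return pw.checkError();
    }

    /** A plain Writer propagates the IOException to the caller. */
    static boolean plainWriterThrows() {
        try (Writer w = new FailingWriter()) {
            w.write("echo hello");
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(printWriterReportsError()); // true
        System.out.println(plainWriterThrows());       // true
    }
}
```

With PrintWriter the caller must remember to call checkError(); with a plain Writer the failure cannot be ignored by accident.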
[jira] [Commented] (MAPREDUCE-2395) TestBlockFixer timing out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016069#comment-13016069 ]

Tom White commented on MAPREDUCE-2395:

Is it worth committing this as it stands and fixing the intermittent failure in a follow-up JIRA?

TestBlockFixer timing out on trunk

    Key: MAPREDUCE-2395
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2395
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: contrib/raid
    Affects Versions: 0.23.0
    Reporter: Todd Lipcon
    Assignee: Ramkumar Vadali
    Priority: Critical
    Fix For: 0.23.0
    Attachments: MAPREDUCE-2395.patch

In recent Hudson builds, TestBlockFixer has been timing out. It is not clear how long it has been broken, since MAPREDUCE-2394 was hiding the RAID tests from Hudson's test result parsing.
[jira] [Updated] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP
[ https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Shkolnik updated MAPREDUCE-2420:

    Attachment: MR-2420-1.patch

Implemented some of Devaraj's comments. Also set the stack trace to empty for artificially created exceptions (on the client side).
Agree with Devaraj that we need a more generic solution for all types of tokens. Please open a JIRA on this and put these requirements in.
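Recreating an exception on the client from its class name, and giving it an empty stack trace because it was never actually thrown locally, could be sketched as follows. This is an illustration only and not the code in MR-2420-1.patch; `RemoteExceptionRebuilder` and `rebuild` are hypothetical names, and the sketch assumes the exception type has a public constructor taking a single String.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

class RemoteExceptionRebuilder {
    /**
     * Builds an IOException of the named class carrying the given message.
     * Falls back to a plain IOException when the class cannot be loaded,
     * is not an IOException, or lacks a (String) constructor.
     */
    static IOException rebuild(String className, String message) {
        try {
            Class<? extends IOException> type =
                Class.forName(className).asSubclass(IOException.class);
            Constructor<? extends IOException> ctor = type.getConstructor(String.class);
            IOException rebuilt = ctor.newInstance(message);
            // The local stack trace would be misleading, so clear it.
            rebuilt.setStackTrace(new StackTraceElement[0]);
            return rebuilt;
        } catch (ReflectiveOperationException | ClassCastException e) {
            return new IOException(className + ": " + message);
        }
    }

    public static void main(String[] args) {
        IOException e = rebuild("java.io.FileNotFoundException", "no such token");
        System.out.println(e.getClass().getName());   // java.io.FileNotFoundException
        System.out.println(e.getStackTrace().length); // 0
    }
}
```

Compared with comparing message strings and instantiating each concrete exception by hand, the lookup by class name handles any exception type on the classpath with one code path.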
[jira] [Commented] (MAPREDUCE-2420) JobTracker should be able to renew delegation token over HTTP
[ https://issues.apache.org/jira/browse/MAPREDUCE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016076#comment-13016076 ]

Boris Shkolnik commented on MAPREDUCE-2420:

Committed to branch-20-security.
[jira] [Updated] (MAPREDUCE-2307) Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated MAPREDUCE-2307:

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

I've committed this. Thanks Devaraj!

Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.

    Key: MAPREDUCE-2307
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2307
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: contrib/fair-share
    Affects Versions: 0.23.0
    Reporter: Devaraj K
    Priority: Minor
    Fix For: 0.23.0
    Attachments: MAPREDUCE-2307.patch

If we try to start the job tracker with the fair scheduler using the default configuration, it throws the exception below.

{code:xml}
2010-07-03 10:18:27,142 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
2010-07-03 10:18:27,143 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
2010-07-03 10:18:27,143 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
2010-07-03 10:18:28,037 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux172.site
2010-07-03 10:18:28,090 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/linux177.site
2010-07-03 10:18:40,074 ERROR org.apache.hadoop.mapred.PoolManager: Failed to reload allocations file - will use existing allocations.
java.lang.NullPointerException
    at java.io.File.<init>(File.java:222)
    at org.apache.hadoop.mapred.PoolManager.reloadAllocsIfNecessary(PoolManager.java:127)
    at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:234)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2785)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:513)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:984)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:980)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:978)
{code}
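A NullPointerException raised inside the java.io.File constructor is what `new File((String) null)` produces, so one plausible reading of the trace is that no allocations file path is set under the default configuration. That reading is an assumption, not something stated in the thread, and the sketch below is not the committed fix. `AllocationFileGuard` and `resolve` are hypothetical names.

```java
import java.io.File;

class AllocationFileGuard {
    /** Returns the allocations file, or null when no path is configured. */
    static File resolve(String configuredPath) {
        if (configuredPath == null) {
            // Nothing configured: let the caller skip the reload
            // instead of calling new File(null), which throws.
            return null;
        }
        return new File(configuredPath);
    }

    /** Shows that constructing a File from a null pathname throws NullPointerException. */
    static boolean unguardedThrows() {
        try {
            new File((String) null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unguardedThrows());       // true
        System.out.println(resolve(null) == null);   // true
    }
}
```

A caller would then reload allocations only when `resolve(...)` returns a non-null file.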
[jira] [Commented] (MAPREDUCE-2307) Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016155#comment-13016155 ]

Hudson commented on MAPREDUCE-2307:

Integrated in Hadoop-Mapreduce-trunk-Commit #637 (See [https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/637/])
MAPREDUCE-2307. Exception thrown in Jobtracker logs, when the Scheduler configured is FairScheduler. Contributed by Devaraj K.