[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130382#comment-13130382 ]

Devaraj K commented on MAPREDUCE-3193:
--

bq. This isn't something we should change lightly, it's probably going to break user apps.

If the input path contains a nested directory, that directory is treated as a file; the task then tries to read it and fails with the error below. Failing the whole job because the input path contains a nested directory might not be correct.

{code:xml}
Caused by: java.io.FileNotFoundException: File does not exist: /r1/r2
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:736)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:699)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:671)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:315)
	at org.apache.hadoop.hdfs.protocolR23Compatible.ClientNamenodeProtocolServerSideTranslatorR23.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorR23.java:130)
	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:632)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1513)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1511)
	at org.apache.hadoop.ipc.Client.call(Client.java:1085)
	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:244)
	at $Proxy8.getBlockLocations(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:130)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:81)
	at $Proxy8.getBlockLocations(Unknown Source)
	at org.apache.hadoop.hdfs.protocolR23Compatible.ClientNamenodeProtocolTranslatorR23.getBlockLocations(ClientNamenodeProtocolTranslatorR23.java:150)
	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:566)
	... 14 more
{code}

NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
--

Key: MAPREDUCE-3193
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Devaraj K
Attachments: MAPREDUCE-3193.patch

java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails.
Example: the input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130384#comment-13130384 ]

Arun C Murthy commented on MAPREDUCE-3193:
--

bq. If the input path contains a nested directory, that directory is treated as a file; the task then tries to read it and fails with the error below. Failing the whole job because the input path contains a nested directory might not be correct.

Yes, I realize that. FileInputFormat has behaved this way for a long time, and changing it now (we definitely shouldn't do this for hadoop-0.23.0) will probably affect a lot of apps. Hence, at the very least, this needs to be off by default.
[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129630#comment-13129630 ]

Devaraj K commented on MAPREDUCE-3193:
--

Here is the problem. FileInputFormat.listStatus only looks one level into each input directory and treats everything it finds there as a file. As a result, it creates splits for directories and the task fails.

{code:title=FileInputFormat.java|borderStyle=solid}
for (int i = 0; i < dirs.length; ++i) {
  Path p = dirs[i];
  FileSystem fs = p.getFileSystem(job.getConfiguration());
  FileStatus[] matches = fs.globStatus(p, inputFilter);
  if (matches == null) {
    errors.add(new IOException("Input path does not exist: " + p));
  } else if (matches.length == 0) {
    errors.add(new IOException("Input Pattern " + p + " matches 0 files"));
  } else {
    for (FileStatus globStat : matches) {
      if (globStat.isDirectory()) {
        // Only one level is expanded here; a nested directory is added as if it were a file.
        for (FileStatus stat : fs.listStatus(globStat.getPath(), inputFilter)) {
          result.add(stat);
        }
      } else {
        result.add(globStat);
      }
    }
  }
}
{code}
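To make the difference concrete, here is a minimal sketch of one-level versus recursive listing. It is not the attached patch and does not use the Hadoop API: it uses java.nio.file on the local file system, and the class name RecursiveListingSketch, the method listInputs and the boolean recursive flag are all hypothetical names chosen for illustration. The flag is meant to stay off unless explicitly enabled, in line with the compatibility concern raised in this thread.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real code would work on Hadoop FileStatus objects.
class RecursiveListingSketch {

    /**
     * Lists the inputs found under dir.
     * recursive == false: immediate children are returned as-is, even when a
     * child is itself a directory (the one-level behaviour described above).
     * recursive == true: directories are descended into and only
     * non-directory entries are returned.
     */
    static List<Path> listInputs(Path dir, boolean recursive) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (recursive && Files.isDirectory(child)) {
                    // Expand the nested directory instead of treating it as a file.
                    result.addAll(listInputs(child, true));
                } else {
                    result.add(child);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Recreate the layout from the report: <tmp>/r2/input.txt
        Path r1 = Files.createTempDirectory("r1");
        Path r2 = Files.createDirectory(r1.resolve("r2"));
        Files.createFile(r2.resolve("input.txt"));

        System.out.println("one level: " + listInputs(r1, false));
        System.out.println("recursive: " + listInputs(r1, true));
    }
}
```

With the flag off, the nested directory r2 is returned as an input, which corresponds to the split that later fails; with it on, only input.txt is returned.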
[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130372#comment-13130372 ]

Arun C Murthy commented on MAPREDUCE-3193:
--

This isn't something we should change lightly, it's probably going to break user apps.

At least, we need a config to turn this on, and it should be off by default.
[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129001#comment-13129001 ]

Mahadev konar commented on MAPREDUCE-3193:
--

Ramgopal, what input format are you using?
[jira] [Commented] (MAPREDUCE-3193) NextGen Mapreduce framework is not able to read the job input recursively.Input is read only for one folder level deep
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129481#comment-13129481 ]

Ramgopal N commented on MAPREDUCE-3193:
---

I ran the WordCount job from examples.jar. It uses FileInputFormat.