[jira] [Commented] (HADOOP-10521) FsShell commands for extended attributes.
[ https://issues.apache.org/jira/browse/HADOOP-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982201#comment-13982201 ] Uma Maheswara Rao G commented on HADOOP-10521: -- Regarding Enum#valueOf, how does the code piece below look? The Guava library does the IllegalArgumentException handling for us and returns null if no such enum value exists.
{code}
private final static Function<String, ENCODE> encodeValueOfFunc =
    Enums.valueOfFunction(ENCODE.class);
...
if (en != null) {
  encode = encodeValueOfFunc.apply(en);
}
Preconditions.checkArgument(encode != null,
    "Invalid/unsupported encoding option specified: en=" + en);
{code}
Test:
{code}
runCommand(new String[] { "-getfattr", "-e", "invalid", "-n", "xattrname", "/file1" });
{code}
Result: -getfattr: Invalid/unsupported encoding option specified: en=invalid
FsShell commands for extended attributes. - Key: HADOOP-10521 URL: https://issues.apache.org/jira/browse/HADOOP-10521 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: HDFS XAttrs (HDFS-2006) Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-10521.1.patch, HADOOP-10521.2.patch, HADOOP-10521.3.patch, HADOOP-10521.patch “setfattr” and “getfattr” commands are added to FsShell for XAttr, and these are the same as in Linux. -- This message was sent by Atlassian JIRA (v6.2#6252)
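As an aside for readers without Guava on the classpath, the same return-null-on-invalid-name lookup can be sketched with plain Enum#valueOf. The enum constants and helper name below are illustrative assumptions, not the patch's actual code:

```java
import java.util.Locale;

public class EnumParse {
    // Hypothetical stand-in for the patch's ENCODE enum.
    public enum Encode { TEXT, HEX, BASE64 }

    // Returns the matching constant, or null when no such enum value exists,
    // mirroring the null-returning behavior Guava's Enums.valueOfFunction gives.
    public static Encode parseEncode(String name) {
        if (name == null) {
            return null;
        }
        try {
            return Encode.valueOf(name.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseEncode("hex"));      // prints HEX
        System.out.println(parseEncode("invalid"));  // prints null
    }
}
```

The caller can then reject null with a Preconditions-style check, exactly as the comment above does.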
[jira] [Comment Edited] (HADOOP-10521) FsShell commands for extended attributes.
[ https://issues.apache.org/jira/browse/HADOOP-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982201#comment-13982201 ] Uma Maheswara Rao G edited comment on HADOOP-10521 at 4/27/14 6:27 AM: --- Regarding Enum#valueOf, how does the code piece below look? The Guava library does the IllegalArgumentException handling for us and returns null if no such enum value exists.
{code}
private final static Function<String, ENCODE> encodeValueOfFunc =
    Enums.valueOfFunction(ENCODE.class);
...
if (en != null) {
  encode = encodeValueOfFunc.apply(en);
}
Preconditions.checkArgument(encode != null,
    "Invalid/unsupported encoding option specified: en=" + en);
{code}
Test:
{code}
runCommand(new String[] { "-getfattr", "-e", "invalid", "-n", "xattrname", "/file1" });
{code}
Result: -getfattr: Invalid/unsupported encoding option specified: en=invalid
tiny nit: Can not specify both '-n name' and '-x name' *option.* -- Can not specify both '-n name' and '-x name' *options.* ?
[jira] [Comment Edited] (HADOOP-10521) FsShell commands for extended attributes.
[ https://issues.apache.org/jira/browse/HADOOP-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982201#comment-13982201 ] Uma Maheswara Rao G edited comment on HADOOP-10521 at 4/27/14 6:32 AM: --- Regarding Enum#valueOf, how does the code piece below look? The Guava library does the IllegalArgumentException handling for us and returns null if no such enum value exists.
{code}
private final static Function<String, ENCODE> encodeValueOfFunc =
    Enums.valueOfFunction(ENCODE.class);
...
if (en != null) {
  encode = encodeValueOfFunc.apply(en);
}
Preconditions.checkArgument(encode != null,
    "Invalid/unsupported encoding option specified: en=" + en);
{code}
Test:
{code}
runCommand(new String[] { "-getfattr", "-e", "invalid", "-n", "xattrname", "/file1" });
{code}
Result: -getfattr: Invalid/unsupported encoding option specified: en=invalid
Also please cover these validation cases in your tests.
tiny nit: Can not specify both '-n name' and '-x name' *option.* -- Can not specify both '-n name' and '-x name' *options.* ?
[jira] [Commented] (HADOOP-10433) Key Management Server based on KeyProvider API
[ https://issues.apache.org/jira/browse/HADOOP-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982273#comment-13982273 ] Hadoop QA commented on HADOOP-10433: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642105/HADOOP-10433.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-dist hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-openstack hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: org.apache.hadoop.crypto.key.kms.server.TestKMSServer org.apache.hadoop.mapreduce.lib.db.TestDBJob org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer The following test timeouts occurred in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-dist hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-openstack hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: org.apache.hadoop.mapred.pipes.TestPipeApplication {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3859//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3859//console This message is automatically generated. Key Management Server based on KeyProvider API -- Key: HADOOP-10433 URL: https://issues.apache.org/jira/browse/HADOOP-10433 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HadoopKMSDocsv2.pdf, KMS-doc.pdf (from HDFS-6134 proposal) Hadoop KMS is the gateway, for Hadoop and Hadoop clients, to the underlying KMS. It provides an interface that works with existing Hadoop security components (authentication, confidentiality).
Hadoop KMS will be implemented leveraging the work being done in HADOOP-10141 and HADOOP-10177. Hadoop KMS will provide an additional implementation of the Hadoop KeyProvider class. This implementation will be a client-server implementation. The client-server protocol will be secure: * Kerberos HTTP SPNEGO (authentication) * HTTPS for transport (confidentiality and integrity) * Hadoop ACLs (authorization) The Hadoop KMS implementation will not provide additional ACLs to access encrypted files. For sophisticated access control requirements, HDFS ACLs (HDFS-4685) should be used. Basic key administration will be supported by the Hadoop KMS via the already available Hadoop KeyShell command line tool. There are minor changes that must be done in Hadoop KeyProvider functionality: * The KeyProvider contract, and the existing implementations, must be thread-safe. * The KeyProvider API should have an API to generate the key material internally. * JavaKeyStoreProvider should use, if present, a password provided via configuration. * KeyProvider Options and Metadata should include a label (for easier cross-referencing). To avoid overloading the underlying KeyProvider implementation, the Hadoop KMS will cache keys using a TTL policy. Scalability and High Availability of the Hadoop KMS can be achieved by running multiple instances behind a VIP/Load-Balancer. For High Availability, the underlying KeyProvider implementation used by the Hadoop KMS must be Highly Available. -- This message was sent by Atlassian JIRA (v6.2#6252)
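The TTL key cache proposed in the description above can be sketched as follows; the class, its generics, and its expiry policy are illustrative assumptions, not the actual KMS implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a TTL cache like the one the KMS description proposes for
// key material, so the underlying KeyProvider is not hit on every request.
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    // Returns the cached value, or null if absent or past its TTL.
    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;
        }
        if (System.currentTimeMillis() > e.expiresAtMillis) {
            map.remove(key);  // evict lazily on read
            return null;
        }
        return e.value;
    }
}
```

A cache miss (null return) is the caller's cue to fetch from the real KeyProvider and re-populate the cache.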
[jira] [Commented] (HADOOP-10433) Key Management Server based on KeyProvider API
[ https://issues.apache.org/jira/browse/HADOOP-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982282#comment-13982282 ] Hadoop QA commented on HADOOP-10433: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642105/HADOOP-10433.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-dist hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-openstack hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: org.apache.hadoop.ha.TestZKFailoverControllerStress org.apache.hadoop.mapreduce.lib.db.TestDBJob org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer The following test timeouts occurred in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms hadoop-dist hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-openstack hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: org.apache.hadoop.mapred.pipes.TestPipeApplication {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3858//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3858//console This message is automatically generated.
[jira] [Created] (HADOOP-10541) InputStream in MiniKdc#initKDCServer for minikdc.ldiff is not closed
Ted Yu created HADOOP-10541: --- Summary: InputStream in MiniKdc#initKDCServer for minikdc.ldiff is not closed Key: HADOOP-10541 URL: https://issues.apache.org/jira/browse/HADOOP-10541 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor The same InputStream variable is used for minikdc.ldiff and minikdc-krb5.conf:
{code}
InputStream is = cl.getResourceAsStream("minikdc.ldiff");
...
is = cl.getResourceAsStream("minikdc-krb5.conf");
{code}
Before the second assignment, {{is}} should be closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
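A minimal sketch of the kind of fix the report suggests - one try-with-resources per resource instead of a reused stream variable - assuming nothing about MiniKdc's actual code (the helper name and error message below are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ResourceLoad {
    // Reads a classpath resource fully and closes the stream even on error.
    // Loading two resources becomes two independent calls, so neither stream
    // can be leaked by reassigning a shared variable.
    public static byte[] readResource(String name) throws IOException {
        ClassLoader cl = ResourceLoad.class.getClassLoader();
        try (InputStream is = cl.getResourceAsStream(name)) {
            if (is == null) {
                throw new IOException("resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int numRead;
            while ((numRead = is.read(buf)) >= 0) {
                out.write(buf, 0, numRead);
            }
            return out.toByteArray();
        }
    }
}
```

With this shape, `readResource("minikdc.ldiff")` and `readResource("minikdc-krb5.conf")` each close their own stream.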
[jira] [Created] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
Ted Yu created HADOOP-10542: --- Summary: Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock() Key: HADOOP-10542 URL: https://issues.apache.org/jira/browse/HADOOP-10542 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor
{code}
in = get(blockToKey(block), byteRangeStart);
out = new BufferedOutputStream(new FileOutputStream(fileBlock));
byte[] buf = new byte[bufferSize];
int numRead;
while ((numRead = in.read(buf)) >= 0) {
{code}
get() may return null. The while loop dereferences {{in}} without a null check. -- This message was sent by Atlassian JIRA (v6.2#6252)
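A hedged sketch of the null-safe version of that copy loop; the method and its error message are illustrative, not Jets3tFileSystemStore's actual code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBlock {
    // Fail fast with a clear error instead of dereferencing a null stream.
    // A null 'in' here corresponds to get() returning null for a missing key.
    public static void copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        if (in == null) {
            throw new IOException("block not found");  // hypothetical message
        }
        byte[] buf = new byte[bufferSize];
        int numRead;
        while ((numRead = in.read(buf)) >= 0) {
            out.write(buf, 0, numRead);
        }
    }
}
```

Whether the right fix is this null check or (as discussed below) not mapping the missing key to null in the first place is the design question the JIRA raises.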
[jira] [Commented] (HADOOP-10521) FsShell commands for extended attributes.
[ https://issues.apache.org/jira/browse/HADOOP-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982387#comment-13982387 ] Chris Nauroth commented on HADOOP-10521: That looks good to me for the enum handling. I didn't know Guava had this. Thanks for the tip, Uma.
[jira] [Created] (HADOOP-10543) RemoteException's unwrapRemoteException method failed for PathIOException
Yongjun Zhang created HADOOP-10543: -- Summary: RemoteException's unwrapRemoteException method failed for PathIOException Key: HADOOP-10543 URL: https://issues.apache.org/jira/browse/HADOOP-10543 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang If the cause of a RemoteException is a PathIOException, RemoteException's unwrapRemoteException methods fail, because PathIOException overwrites the cause to null, which makes Throwable throw an exception at
{code}
public synchronized Throwable initCause(Throwable cause) {
    if (this.cause != this)
        throw new IllegalStateException("Can't overwrite cause");
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
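The underlying Throwable#initCause behavior is easy to reproduce with a plain IOException standing in for PathIOException: once any constructor has set the cause field (even to null), a later initCause call throws IllegalStateException.

```java
import java.io.IOException;

public class InitCauseDemo {
    // Returns true if initCause succeeded, false if Throwable rejected it.
    public static boolean canInitCause(Throwable t, Throwable cause) {
        try {
            t.initCause(cause);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Cause never initialized: initCause succeeds once.
        System.out.println(canInitCause(new IOException("a"), new Exception()));       // prints true
        // Cause already initialized to null by the constructor: initCause fails.
        System.out.println(canInitCause(new IOException("a", null), new Exception())); // prints false
    }
}
```

This is why unwrapRemoteException, which relies on initCause to attach the RemoteException as the cause, breaks for exception classes whose constructors pre-initialize the cause.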
[jira] [Updated] (HADOOP-10543) RemoteException's unwrapRemoteException method failed for PathIOException
[ https://issues.apache.org/jira/browse/HADOOP-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HADOOP-10543: --- Description: If the cause of a RemoteException is a PathIOException, RemoteException's unwrapRemoteException methods fail, because some PathIOException constructors initialize the cause to null, which makes Throwable throw an exception at
{code}
public synchronized Throwable initCause(Throwable cause) {
    if (this.cause != this)
        throw new IllegalStateException("Can't overwrite cause");
{code}
[jira] [Updated] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
[ https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10542: Component/s: fs/s3
[jira] [Commented] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
[ https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982411#comment-13982411 ] Steve Loughran commented on HADOOP-10542: - I think we need to look at why the NoSuchKey error is being mapped to {{null}} rather than raising an exception, and at the selective reporting in {{handleServiceException()}}. Do that and provide a useful error, and the deref here goes away. HADOOP-10533 seems related.
[jira] [Updated] (HADOOP-8989) hadoop dfs -find feature
[ https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Allen updated HADOOP-8989: --- Attachment: HADOOP-8989.patch Patch stripped back to the minimum - command plus the name, print, and 'and' operations. I'll add the rest back in as separate patches. hadoop dfs -find feature Key: HADOOP-8989 URL: https://issues.apache.org/jira/browse/HADOOP-8989 Project: Hadoop Common Issue Type: New Feature Reporter: Marco Nicosia Assignee: Jonathan Allen Attachments: HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch Both sysadmins and users make frequent use of the unix 'find' command, but Hadoop has no correlate. Without this, users are writing scripts which make heavy use of hadoop dfs -lsr, and implementing find one-offs. I think hdfs -lsr is somewhat taxing on the NameNode, and a really slow experience on the client side. Possibly an in-NameNode find operation would be only a bit more taxing on the NameNode, but significantly faster from the client's point of view? The minimum set of options I can think of which would make a Hadoop find command generally useful is (in priority order): * -type (file or directory, for now) * -atime/-ctime/-mtime (... and -creationtime?) (both + and - arguments) * -print0 (for piping to xargs -0) * -depth * -owner/-group (and -nouser/-nogroup) * -name (allowing for shell pattern, or even regex?) * -perm * -size One possible special case, but could possibly be really cool if it ran from within the NameNode: * -delete The hadoop dfs -lsr | hadoop dfs -rm cycle is really, really slow.
Lower priority, some people do use operators, mostly to execute -or searches such as: * find / \(-nouser -or -nogroup\) Finally, I thought I'd include a link to the [Posix spec for find|http://www.opengroup.org/onlinepubs/009695399/utilities/find.html] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-8989) hadoop dfs -find feature
[ https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Allen updated HADOOP-8989: --- Attachment: HADOOP-8989.patch
[jira] [Created] (HADOOP-10544) Find command - add operator functions to find command
Jonathan Allen created HADOOP-10544: --- Summary: Find command - add operator functions to find command Key: HADOOP-10544 URL: https://issues.apache.org/jira/browse/HADOOP-10544 Project: Hadoop Common Issue Type: New Feature Reporter: Jonathan Allen Assignee: Jonathan Allen Priority: Minor Add operator functions (OR, NOT) to the find command created under HADOOP-8989. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10544) Find command - add operator functions to find command
[ https://issues.apache.org/jira/browse/HADOOP-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Allen updated HADOOP-10544: Issue Type: Sub-task (was: New Feature) Parent: HADOOP-8989
[jira] [Commented] (HADOOP-8989) hadoop dfs -find feature
[ https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982438#comment-13982438 ] Hadoop QA commented on HADOOP-8989: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642147/HADOOP-8989.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3860//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3860//console This message is automatically generated. 
[jira] [Updated] (HADOOP-10544) Find command - add operator functions to find command
[ https://issues.apache.org/jira/browse/HADOOP-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Allen updated HADOOP-10544: Attachment: HADOOP-10544.patch
[jira] [Updated] (HADOOP-10544) Find command - add operator functions to find command
[ https://issues.apache.org/jira/browse/HADOOP-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Allen updated HADOOP-10544: Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-10543) RemoteException's unwrapRemoteException method failed for PathIOException
[ https://issues.apache.org/jira/browse/HADOOP-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HADOOP-10543: --- Attachment: HADOOP-10543.001.patch RemoteException's unwrapRemoteException method failed for PathIOException - Key: HADOOP-10543 URL: https://issues.apache.org/jira/browse/HADOOP-10543 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HADOOP-10543.001.patch If the cause of a RemoteException is a PathIOException, RemoteException's unwrapRemoteException methods would fail, because some PathIOException constructors explicitly initialize the cause to null, which makes Throwable throw an exception at:
{code}
public synchronized Throwable initCause(Throwable cause) {
    if (this.cause != this)
        throw new IllegalStateException("Can't overwrite cause");
    ...
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10543) RemoteException's unwrapRemoteException method failed for PathIOException
[ https://issues.apache.org/jira/browse/HADOOP-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HADOOP-10543: --- Status: Patch Available (was: Open) Submitted initial patch to address the problem. RemoteException's unwrapRemoteException method failed for PathIOException - Key: HADOOP-10543 URL: https://issues.apache.org/jira/browse/HADOOP-10543 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HADOOP-10543.001.patch If the cause of a RemoteException is a PathIOException, RemoteException's unwrapRemoteException methods would fail, because some PathIOException constructors explicitly initialize the cause to null, which makes Throwable throw an exception at:
{code}
public synchronized Throwable initCause(Throwable cause) {
    if (this.cause != this)
        throw new IllegalStateException("Can't overwrite cause");
    ...
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
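The failure mode described above follows from Throwable's contract: a cause set through the constructor, even when it is set explicitly to null, can never be overwritten by a later initCause() call. A standalone demonstration of that mechanism ({{InitCauseDemo}} is a hypothetical class for this comment, independent of the Hadoop code):

```java
import java.io.IOException;

// Demonstrates the Throwable contract behind HADOOP-10543: once a cause
// has been set via a constructor (even explicitly to null), initCause()
// throws IllegalStateException("Can't overwrite cause").
// Standalone illustration, not Hadoop code.
public class InitCauseDemo {

    // Returns true if initCause() succeeds on the given exception.
    public static boolean initCauseSucceeds(IOException e) {
        try {
            e.initCause(new RuntimeException("wrapped"));
            return true;
        } catch (IllegalStateException ise) {
            return false; // "Can't overwrite cause"
        }
    }
}
```

new IOException("msg") leaves the cause unset, so initCause() works; new IOException("msg", null) pins the cause to null, which is exactly the state some PathIOException constructors leave behind and why unwrapRemoteException then fails.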
[jira] [Commented] (HADOOP-10544) Find command - add operator functions to find command
[ https://issues.apache.org/jira/browse/HADOOP-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982503#comment-13982503 ] Hadoop QA commented on HADOOP-10544: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642156/HADOOP-10544.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3861//console This message is automatically generated. Find command - add operator functions to find command - Key: HADOOP-10544 URL: https://issues.apache.org/jira/browse/HADOOP-10544 Project: Hadoop Common Issue Type: Sub-task Reporter: Jonathan Allen Assignee: Jonathan Allen Priority: Minor Attachments: HADOOP-10544.patch Add operator functions (OR, NOT) to the find command created under HADOOP-8989. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10543) RemoteException's unwrapRemoteException method failed for PathIOException
[ https://issues.apache.org/jira/browse/HADOOP-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13982669#comment-13982669 ] Hadoop QA commented on HADOOP-10543: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642163/HADOOP-10543.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3862//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3862//console This message is automatically generated. 
RemoteException's unwrapRemoteException method failed for PathIOException - Key: HADOOP-10543 URL: https://issues.apache.org/jira/browse/HADOOP-10543 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HADOOP-10543.001.patch If the cause of a RemoteException is a PathIOException, RemoteException's unwrapRemoteException methods would fail, because some PathIOException constructors explicitly initialize the cause to null, which makes Throwable throw an exception at:
{code}
public synchronized Throwable initCause(Throwable cause) {
    if (this.cause != this)
        throw new IllegalStateException("Can't overwrite cause");
    ...
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)