[ https://issues.apache.org/jira/browse/PIG-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419964#comment-13419964 ]
Cheolsoo Park commented on PIG-2492:
------------------------------------

Attached [^PIG-2492-4.patch] is the newest patch.

There is one thing I'd like to mention, although I already discussed it on the review board. I changed the type of the first parameter of AvroStorageUtils.getAllSubDirs() from URI to hadoop.fs.Path. This is needed because '{' and '}' are not allowed in a URI, so URI.create() fails on a glob pattern that contains those characters (it throws an IllegalArgumentException that wraps a URISyntaxException). These characters are escaped automatically when a Path is constructed, so I construct a Path from the given glob pattern string and get a URI from that Path with Path.toUri().

In fact, this reverts some of the changes made by PIG-2540 (https://issues.apache.org/jira/browse/PIG-2540). However, it does not break S3 support, because inside AvroStorageUtils.getAllSubDirs() the file system is still constructed from the given URI, and globStatus() is called on that file system.
{code}
FileSystem fs = FileSystem.get(path.toUri(), job.getConfiguration());
FileStatus[] matchedFiles = fs.globStatus(path);
{code}
So if path is an S3 URI, the S3 file system will be used.

Please let me know if I am wrong.

Thanks!

> AvroStorage should recognize globs and commas
> ---------------------------------------------
>
>                 Key: PIG-2492
>                 URL: https://issues.apache.org/jira/browse/PIG-2492
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>    Affects Versions: 0.9.1, 0.10.0
>            Reporter: Stan Rosenberg
>            Assignee: Cheolsoo Park
>         Attachments: AvroStorage.patch, AvroStorageUtils.patch, PIG-2492-2.patch, PIG-2492-3.patch, PIG-2492-4.patch, PIG-2492.patch, avro_test_files-2.tar.gz, avro_test_files.tar.gz
>
>
> I've patched AvroStorage and AvroStorageUtils to support the same file input syntax as currently supported by Hadoop's FileInputFormat. Specifically, globs and commas are supported.
> Somebody should write some unit tests for these changes; I am currently pressed for time.
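
The URI-versus-Path point in the comment above can be checked without Hadoop on the classpath. The following is a minimal sketch using only java.net.URI; the class GlobUriSketch, its method names, and the sample bucket and paths are hypothetical and are not part of the patch. It assumes that building a URI from separate components (which percent-encodes illegal path characters) is a fair stand-in for what constructing a Path and calling Path.toUri() achieves.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustration only: GlobUriSketch and its methods are hypothetical names,
// not code from PIG-2492 or from Hadoop.
class GlobUriSketch {

    // Returns true if URI.create() accepts the string unchanged.
    static boolean parsesWithUriCreate(String s) {
        try {
            URI.create(s);
            return true;
        } catch (IllegalArgumentException e) {
            // URI.create() wraps the underlying URISyntaxException.
            return false;
        }
    }

    // Builds a URI from separate components. This multi-argument constructor
    // percent-encodes characters that are illegal in the path, such as '{' and '}'.
    static URI fromComponents(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String glob = "/data/{2012,2013}/*.avro";          // sample glob, assumed
        System.out.println(parsesWithUriCreate(glob));     // rejected because of '{' and '}'

        URI uri = fromComponents("s3n", "bucket", glob);   // sample scheme and bucket, assumed
        System.out.println(uri);                           // braces appear percent-encoded
        System.out.println(uri.getScheme());               // scheme survives for file system lookup
        System.out.println(uri.getPath());                 // decoded path is the original glob
    }
}
```

The scheme is preserved in the resulting URI, which is consistent with the comment's claim that FileSystem.get(path.toUri(), ...) can still select the S3 file system.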