[ 
https://issues.apache.org/jira/browse/PIG-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2539:
----------------------------

    Attachment: PIG-2539-0.patch

Pig does not support "cd" to any s3 directory, not just those whose names 
contain "_". In fact, s3 no longer allows "_" in bucket names. 

However, I did find a bug in s3 processing. For example:

a = load 's3n://pig-test1/';
dump a;

gives me:
Caused by: java.lang.IllegalArgumentException: Path must be absolute: 
s3n://pig-test1

The problem is that the trailing "/" is significant to s3, but Pig always 
strips it away. 

I have attached a patch to illustrate the fix. I haven't thought through the 
side effects of retaining the trailing "/", so more validation is needed.
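
To illustrate why the trailing "/" matters, here is a minimal sketch using 
plain java.net.URI. It assumes that the location is parsed roughly the way 
java.net.URI parses it; it does not use Pig or Hadoop classes, and the class 
and method names (TrailingSlashDemo, pathOf) are made up for this example. 
Without the trailing "/", the URI has an empty path component, which is 
consistent with the "Path must be absolute" complaint above.

```java
import java.net.URI;

public class TrailingSlashDemo {
    // Returns the path component that java.net.URI extracts from a location.
    // Hypothetical helper, not part of Pig or Hadoop.
    static String pathOf(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        // Bucket root with the trailing "/" stripped: the path is empty,
        // so there is nothing that could count as an absolute path.
        System.out.println("[" + pathOf("s3n://pig-test1") + "]");

        // Bucket root with the trailing "/" kept: the path is the root "/".
        System.out.println("[" + pathOf("s3n://pig-test1/") + "]");
    }
}
```

The brackets in the printed output only make the empty string visible.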
                
> Cannot use s3 buckets with '_' in the name
> ------------------------------------------
>
>                 Key: PIG-2539
>                 URL: https://issues.apache.org/jira/browse/PIG-2539
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt, parser
>    Affects Versions: 0.9.1
>         Environment: Amazon Elastic Map Reduce
>            Reporter: Russell Jurney
>            Priority: Critical
>              Labels: amazon, fun, happy, me, no, pants, pig, sad, work
>         Attachments: PIG-2539-0.patch
>
>
> grunt> cd s3://agile_data
> 2012-02-16 22:05:59,461 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2999: Unexpected internal error. Invalid hostname in URI s3://agile_data
> Details at logfile: /home/hadoop/pig_1329429351155.log
> I think the next behavior is already documented/bug filed:
> grunt> cd 's3://agile_data'
> 2012-02-16 22:02:28,489 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2999: Unexpected internal error. java.net.URISyntaxException: Illegal 
> character in scheme name at index 0: 's3://agile_data'
> Details at logfile: /home/hadoop/pig_1329429351155.log
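
The two errors quoted above can be reproduced in isolation with 
java.net.URI. This is only a sketch: the assumption is that the grunt "cd" 
command ends up handing the argument to java.net.URI (or something that 
behaves like it), and the class and method names (BucketUriDemo, hostOf, 
parses) are invented for the example. An authority containing "_" is not a 
valid hostname, so getHost() returns null, and an argument that still carries 
its single quotes cannot be parsed at all.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BucketUriDemo {
    // Returns the host java.net.URI extracts, or null when the authority
    // is not a syntactically valid hostname. Hypothetical helper.
    static String hostOf(String location) throws URISyntaxException {
        return new URI(location).getHost();
    }

    // Reports whether java.net.URI accepts the string at all.
    static boolean parses(String location) {
        try {
            new URI(location);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        // A bucket name without "_" is a valid hostname.
        System.out.println(hostOf("s3://pig-test1"));
        // "_" is not allowed in a hostname, so no host is reported.
        System.out.println(hostOf("s3://agile_data"));
        // The leading quote is not a legal scheme character.
        System.out.println(parses("'s3://agile_data'"));
    }
}
```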
