[ https://issues.apache.org/jira/browse/PIG-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236048#comment-13236048 ]
Russell Jurney commented on PIG-2540:
-------------------------------------

I don't understand how to do this. I did this:

russell-jurneys-macbook-pro:newpig rjurney$ git remote -v
origin  https://github.com/apache/pig.git (fetch)
origin  https://github.com/apache/pig.git (push)
russell-jurneys-macbook-pro:newpig rjurney$ git branch -v
* branch-0.10 14f4606 [ahead 5] Merge branch 'branch-0.10' of https://github.com/apache/pig into branch-0.10
  trunk       cb49401 [behind 7] PIG-2589: Additional e2e test for 0.10 new features
russell-jurneys-macbook-pro:newpig rjurney$ git pull
remote: Counting objects: 77, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 39 (delta 17), reused 39 (delta 17)
Unpacking objects: 100% (39/39), done.
From https://github.com/apache/pig
   b8ce196..d1f6cb1  branch-0.10 -> origin/branch-0.10
   8b21cc4..841f336  trunk       -> origin/trunk
Merge made by recursive.

git diff --no-prefix 73bb67f8cc3974d76e034d09da96995e887b4c30 contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/ > PIG-2540.tests_fail.patch.3

This patch is identical to the other one. What am I missing?

> AvroStorage can't read schema on amazon s3 in elastic mapreduce
> ---------------------------------------------------------------
>
>                 Key: PIG-2540
>                 URL: https://issues.apache.org/jira/browse/PIG-2540
>             Project: Pig
>          Issue Type: Bug
>          Components: data, piggybank
>    Affects Versions: 0.9.1, 0.10
>         Environment: Amazon Elastic MapReduce
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>            Priority: Critical
>              Labels: avro, avro_udf, aws, emr, pants, pig, s3, sad, unhappy
>             Fix For: 0.10
>
>         Attachments: PIG-2540.tests_fail.patch, PIG-2540.tests_fail.patch.2, PIG-2540_almost_there.patch, TEST-org.apache.pig.piggybank.test.storage.avro.TestAvroStorage.txt
>
>
> grunt> emails = load 's3://agile.data/again_inbox' using AvroStorage();
> grunt> describe emails
> Schema for emails unknown.
> grunt> a = limit emails 10;
> grunt> dump a
> 2012-02-16 22:15:58,347 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: LIMIT
> 2012-02-16 22:15:58,483 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2012-02-16 22:15:58,542 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> 2012-02-16 22:15:58,542 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> 2012-02-16 22:15:58,632 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2012-02-16 22:15:58,658 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-02-16 22:15:58,665 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
> 2012-02-16 22:15:58,665 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a
>     at org.apache.pig.PigServer.openIterator(PigServer.java:901)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:652)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
>     at org.apache.pig.Main.run(Main.java:497)
>     at org.apache.pig.Main.main(Main.java:111)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias a
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1000)
>     at org.apache.pig.PigServer.store(PigServer.java:963)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:876)
>     ... 12 more
> Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:731)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:263)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:149)
>     at org.apache.pig.PigServer.launchPlan(PigServer.java:1314)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1299)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:996)
>     ... 14 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:352)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.setLocation(AvroStorage.java:138)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:387)
>     ... 19 more