[ https://issues.apache.org/jira/browse/PIG-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3267:
----------------------------

    Attachment: PIG-3267-1.patch

This happens because, when we add a Limit job and convert the intermediate 
storer to a temporary storer, we do not nullify the cached storer in POStore. 
As a result, the intermediate job still uses HCatStorer. Patch attached.
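
To illustrate the stale-cache pattern described above, here is a minimal sketch. All names in it (StoreOperator, Storer, setStorerSpec, setStorerSpecBuggy) are hypothetical stand-ins chosen for illustration; this is not the actual POStore code and not the contents of PIG-3267-1.patch. It only shows why changing the storer without clearing a cached instance leaves the old storer in use.

```java
// Hypothetical stand-in for a store function; NOT the real Pig interface.
interface Storer {
    String name();
}

// Hypothetical stand-in for a store operator that lazily caches its storer.
final class StoreOperator {
    private String storerSpec;   // which storer this operator is configured to use
    private Storer cachedStorer; // instance built from storerSpec on first use

    StoreOperator(String storerSpec) {
        this.storerSpec = storerSpec;
    }

    // Buggy variant: updates the spec but keeps the previously cached instance.
    void setStorerSpecBuggy(String spec) {
        this.storerSpec = spec;
    }

    // Fixed variant: also drops the cached instance so it is rebuilt on next use.
    void setStorerSpec(String spec) {
        this.storerSpec = spec;
        this.cachedStorer = null;
    }

    Storer getStorer() {
        if (cachedStorer == null) {
            final String spec = storerSpec;
            cachedStorer = () -> spec;
        }
        return cachedStorer;
    }
}

class StaleStorerDemo {
    public static void main(String[] args) {
        StoreOperator buggy = new StoreOperator("HCatStorer");
        buggy.getStorer();                      // an instance is cached before the rewrite
        buggy.setStorerSpecBuggy("TempStorer"); // rewrite for the intermediate job
        System.out.println(buggy.getStorer().name()); // prints "HCatStorer" (stale)

        StoreOperator fixed = new StoreOperator("HCatStorer");
        fixed.getStorer();
        fixed.setStorerSpec("TempStorer");
        System.out.println(fixed.getStorer().name()); // prints "TempStorer"
    }
}
```

In the buggy variant the intermediate job would still write through the original storer, which is consistent with the symptom reported below: the final output location is already populated when the second job checks its output specs.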
                
> HCatStorer fail in limit query
> ------------------------------
>
>                 Key: PIG-3267
>                 URL: https://issues.apache.org/jira/browse/PIG-3267
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2, 0.10.1, 0.11.1
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.12
>
>         Attachments: PIG-3267-1.patch
>
>
> The following query fails:
> {code}
> data = LOAD 'student.txt' as (name:chararray, age:int, gpa:double);
> data_limited = limit data 10;
> samples = foreach data_limited generate age as number;
> store samples into 'samples' using org.apache.hcatalog.pig.HCatStorer('part_dt=20130101T010000T36');
> {code}
> The error occurs before the second job is launched. Error message:
> {code}
> Message: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/user/hive/warehouse/samples/part_dt=20130101T010000T36 already exists
>       at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:121)
>       at org.apache.hcatalog.mapreduce.FileOutputFormatContainer.checkOutputSpecs(FileOutputFormatContainer.java:135)
>       at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:72)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:188)
>       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
>       at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>       at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>       at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>       at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>       at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
>       at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
>       at java.lang.Thread.run(Thread.java:680)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
> {code}

