[ 
https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060218#comment-13060218
 ] 

Dmitriy V. Ryaboy commented on PIG-1890:
----------------------------------------

I've been a bit out of the loop on this. Are you doing your own directory 
traversal? You shouldn't need to do that in the Pig layer; it should be done 
in your InputFormat. In Elephant-Bird I had to write a wrapper to emulate what 
MAPREDUCE-1501 does, and I believe Pig does the same thing (but without 
caring about the mapred.input.dir.recursive config).
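
To illustrate what "traversal in the InputFormat" could look like, here is a rough, stdlib-only sketch. It is not Hadoop's or Elephant-Bird's code: InputListingSketch and listInputFiles are hypothetical names, java.nio stands in for the Hadoop FileSystem API, and the boolean parameter plays the role that a setting like mapred.input.dir.recursive would play.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; an illustration only, not part of Pig or Hadoop.
class InputListingSketch {

    // Returns the plain files under dir. Subdirectories are descended into
    // only when recursive is true; otherwise they are skipped.
    static List<Path> listInputFiles(Path dir, boolean recursive) {
        List<Path> children;
        try (Stream<Path> entries = Files.list(dir)) {
            // Sort so the result order is deterministic.
            children = entries.sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<Path> files = new ArrayList<>();
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                if (recursive) {
                    files.addAll(listInputFiles(child, true));
                }
            } else {
                files.add(child);
            }
        }
        return files;
    }
}
```

With the listing kept in one place like this, the loader only has to hand over the root location and never walks directories itself.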

As for setLocation, yes. Making it idempotent is "fun".  
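
For what it's worth, one way to keep a setLocation-style callback idempotent is to remember which locations have already been handled and do the one-time work only on first sight. The sketch below is a simplified assumption, not AvroStorage's code: IdempotentLoaderSketch is a hypothetical class that does not extend Pig's LoadFunc, and its setLocation takes only the location string.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a loader; it does NOT extend Pig's LoadFunc.
class IdempotentLoaderSketch {

    // Insertion-ordered set: adding a location that is already present is a no-op.
    private final Set<String> inputPaths = new LinkedHashSet<>();

    // Counts how many times the one-time setup actually ran.
    private int setupRuns = 0;

    // May be called any number of times with the same location; only the
    // first call for a given location triggers the setup work.
    void setLocation(String location) {
        if (inputPaths.add(location)) {
            setupRuns++;
        }
    }

    // Defensive copy so callers cannot mutate the internal state.
    List<String> getInputPaths() {
        return new ArrayList<>(inputPaths);
    }

    int getSetupRuns() {
        return setupRuns;
    }
}
```

This only covers repeated calls with the same location; whether calls with different locations on one instance should accumulate (as here) or replace the earlier value is a separate decision.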

I am curious about setLocation being called with different files on the same 
instance within the same job. Patrick, can you show some debug output that 
has the sequence of calls? 

> Fix piggybank unit test TestAvroStorage
> ---------------------------------------
>
>                 Key: PIG-1890
>                 URL: https://issues.apache.org/jira/browse/PIG-1890
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Jakob Homan
>         Attachments: PIG-1890-1.patch, PIG-1890-2.patch
>
>
> TestAvroStorage fails on trunk. There are two reasons:
> 1. After PIG-1680, we call LoadFunc.setLocation one more time.
> 2. The schema for AvroStorage seems to be wrong. For example, in the first 
> test case, testArrayDefault, the schema for "in" is set to "PIG_WRAPPER: 
> (FIELD: {PIG_WRAPPER: (ARRAY_ELEM: float)})". It seems PIG_WRAPPER is 
> redundant. This issue was hidden until PIG-1188 was checked in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
