[ https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060213#comment-13060213 ]
Patrick Hunt commented on PIG-1890: ----------------------------------- @ken (and @mads) thanks, I figured something like that. Could this possibly be an issue in pig itself? I do see this {noformat} LoadFunc.setLocation: * This method will be called in the backend multiple times. Implementations * should bear in mind that this method is called multiple times and should * ensure there are no inconsistent side effects due to the multiple calls. {noformat} But what I'm seeing in this UNION case is that setLocation is being called multiple times on the same AvroStorage instance, for the same job, with different files. This results (current avrostorage code with pig-1890-2.patch applied) in the duplication - 2 files are added rather than one (my patch fixes this by only taking the most recent argument to setLocation, which is consistent with existing loader funcs, whereas avrostorage keeps adding). If you check the debugging output you'll see this (I might have added a bit more debugging to setLocation to capture this event...) Regards. > Fix piggybank unit test TestAvroStorage > --------------------------------------- > > Key: PIG-1890 > URL: https://issues.apache.org/jira/browse/PIG-1890 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: 0.9.0 > Reporter: Daniel Dai > Assignee: Jakob Homan > Attachments: PIG-1890-1.patch, PIG-1890-2.patch > > > TestAvroStorage fail on trunk. There are two reasons: > 1. After PIG-1680, we call LoadFunc.setLocation one more time. > 2. The schema for AvroStorage seems to be wrong. For example, in first test > case testArrayDefault, the schema for "in" is set to "PIG_WRAPPER: (FIELD: > {PIG_WRAPPER: (ARRAY_ELEM: float)})". It seems PIG_WRAPPER is redundant. This > issue is hidden until PIG-1188 checked in. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira