[ 
https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060178#comment-13060178
 ] 

Mads Moeller commented on PIG-1890:
-----------------------------------

Hi Ken,

I have the same use case as you and am encountering the same behavior as 
Patrick. I made a few modifications to the methods "addInputPaths" and 
"addAllSubDirs" from your patch, which seem to solve the UNION issue. 

{code}
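    public static boolean addInputPaths(String pathString, Job job)
        throws IOException {

        // A set is used so that a file reachable more than once is only added once.
        Set<Path> pathSet = new HashSet<Path>();

        if (addAllSubDirs(new Path(pathString), job, pathSet)) {
            Path[] paths = pathSet.toArray(new Path[pathSet.size()]);
            // Assumption: the statement that hands the collected paths to the job
            // is missing from this listing; presumably it is something like:
            FileInputFormat.setInputPaths(job, paths);
            return true;
        }
        return false;
    }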

    /**
     * Adds all non-hidden files under the given path, recursing into
     * subdirectories, to the paths set.
     *
     * @return true if the path passes PATH_FILTER and exists
     * @throws IOException
     */
    private static boolean addAllSubDirs(Path path, Job job, Set<Path> paths)
        throws IOException {
        FileSystem fs = FileSystem.get(job.getConfiguration());

        if (PATH_FILTER.accept(path)) {
            try {
                FileStatus file = fs.getFileStatus(path);
                if (file.isDir()) {
                    for (FileStatus sub : fs.listStatus(path)) {
                        addAllSubDirs(sub.getPath(), job, paths);
                    }
                } else {
                    AvroStorageLog.details("Add input file:" + file);
                    paths.add(file.getPath());
                }
            } catch (FileNotFoundException e) {
                AvroStorageLog.details("Input path does not exist: " + path);
                return false;
            }
            return true;
        }
        return false;
    }
{code}
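
For anyone who wants to try the traversal logic without a Hadoop cluster, here is a minimal standalone sketch of the same idea using java.nio.file instead of the Hadoop FileSystem API. The class and method names (InputPathCollector, isVisible, collectFiles) are hypothetical and not part of the patch. PATH_FILTER is not shown above, so the sketch assumes it rejects names starting with "_" or ".", which may not match the real filter.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the patched methods; uses the local file system.
final class InputPathCollector {

    // Assumption: "hidden" means the name starts with '_' or '.'.
    static boolean isVisible(Path path) {
        Path name = path.getFileName();
        if (name == null) {
            return true; // a file system root has no name component
        }
        String s = name.toString();
        return !s.startsWith("_") && !s.startsWith(".");
    }

    // Recursively adds every visible regular file under 'path' to 'paths'.
    // Returns false if 'path' is hidden or does not exist, true otherwise.
    static boolean collectFiles(Path path, Set<Path> paths) throws IOException {
        if (!isVisible(path)) {
            return false;
        }
        if (Files.isDirectory(path)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    // As in the patch, the result for a child is ignored.
                    collectFiles(child, paths);
                }
            }
            return true;
        }
        if (Files.exists(path)) {
            paths.add(path);
            return true;
        }
        return false;
    }
}
```

Because the results go into a Set, a file is recorded only once even if the same location is collected more than once into that set.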

> Fix piggybank unit test TestAvroStorage
> ---------------------------------------
>
>                 Key: PIG-1890
>                 URL: https://issues.apache.org/jira/browse/PIG-1890
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Jakob Homan
>         Attachments: PIG-1890-1.patch, PIG-1890-2.patch
>
>
> TestAvroStorage fails on trunk. There are two reasons:
> 1. After PIG-1680, we call LoadFunc.setLocation one more time.
> 2. The schema for AvroStorage seems to be wrong. For example, in the first test 
> case, testArrayDefault, the schema for "in" is set to "PIG_WRAPPER: (FIELD: 
> {PIG_WRAPPER: (ARRAY_ELEM: float)})". It seems PIG_WRAPPER is redundant. This 
> issue was hidden until PIG-1188 was checked in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira