[ https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060178#comment-13060178 ]
Mads Moeller commented on PIG-1890:
-----------------------------------
Hi Ken,
I have the same use case as you and am encountering the same behavior as
Patrick. I made a few modifications to the methods "addInputPaths" and
"addAllSubDirs" from your patch, which seem to solve the UNION issue.
{code}
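public static boolean addInputPaths(String pathString, Job job)
        throws IOException {
    Set<Path> pathSet = new HashSet<Path>();
    if (addAllSubDirs(new Path(pathString), job, pathSet)) {
        Path[] paths = pathSet.toArray(new Path[pathSet.size()]);
        // Assumed missing step: the array was built but never used, so
        // register the collected paths with the job.
        FileInputFormat.setInputPaths(job, paths);
        return true;
    }
    return false;
}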
/**
 * Recursively adds all files under non-hidden directories and
 * subdirectories to the paths set.
 *
 * @return false if the path is filtered out or does not exist
 * @throws IOException
 */
private static boolean addAllSubDirs(Path path, Job job, Set<Path> paths)
        throws IOException {
    FileSystem fs = FileSystem.get(job.getConfiguration());
    if (PATH_FILTER.accept(path)) {
        try {
            FileStatus file = fs.getFileStatus(path);
            if (file.isDir()) {
                // Descend into each child; hidden children are skipped
                // by the PATH_FILTER check in the recursive call.
                for (FileStatus sub : fs.listStatus(path)) {
                    addAllSubDirs(sub.getPath(), job, paths);
                }
            } else {
                AvroStorageLog.details("Add input file:" + file);
                paths.add(file.getPath());
            }
        } catch (FileNotFoundException e) {
            AvroStorageLog.details("Input path does not exist: " + path);
            return false;
        }
        return true;
    }
    return false;
}
{code}
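For anyone who wants to try the recursion without a Hadoop setup, here is a
rough local-filesystem analogue using only java.nio.file. It is a sketch, not
part of the patch: the class name SubDirWalk and the methods isVisible and
collectFiles are made up for illustration, and I am assuming that "hidden"
means a name starting with "_" or "." since PATH_FILTER is not shown here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

// Hypothetical local-filesystem analogue of addAllSubDirs; not part of the patch.
final class SubDirWalk {

    // Assumption: "hidden" means a name starting with '_' or '.'.
    static boolean isVisible(Path path) {
        Path name = path.getFileName();
        if (name == null) {
            return true; // a filesystem root has no name component
        }
        String s = name.toString();
        return !s.startsWith("_") && !s.startsWith(".");
    }

    // Adds every visible regular file under root to files.
    // Returns false if root itself is hidden or does not exist.
    static boolean collectFiles(Path root, Set<Path> files) throws IOException {
        if (!isVisible(root)) {
            return false;
        }
        if (Files.isDirectory(root)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
                for (Path child : children) {
                    // The result is ignored, as in the snippet above:
                    // hidden children are simply skipped.
                    collectFiles(child, files);
                }
            }
        } else if (Files.exists(root)) {
            files.add(root);
        } else {
            return false;
        }
        return true;
    }
}
```

Collecting into a Set rather than a List means a file reached twice (for
example, when the same location is processed more than once) is only
recorded once.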
> Fix piggybank unit test TestAvroStorage
> ---------------------------------------
>
> Key: PIG-1890
> URL: https://issues.apache.org/jira/browse/PIG-1890
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.9.0
> Reporter: Daniel Dai
> Assignee: Jakob Homan
> Attachments: PIG-1890-1.patch, PIG-1890-2.patch
>
>
> TestAvroStorage fails on trunk. There are two reasons:
> 1. After PIG-1680, we call LoadFunc.setLocation one more time.
> 2. The schema for AvroStorage seems to be wrong. For example, in the first
> test case, testArrayDefault, the schema for "in" is set to "PIG_WRAPPER: (FIELD:
> {PIG_WRAPPER: (ARRAY_ELEM: float)})". It seems PIG_WRAPPER is redundant. This
> issue was hidden until PIG-1188 was checked in.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira