Rohini Palaniswamy created PIG-3796:
---------------------------------------
Summary: PigStats output bytes written not collected for relative
paths
Key: PIG-3796
URL: https://issues.apache.org/jira/browse/PIG-3796
Project: Pig
Issue Type: Bug
Reporter: Rohini Palaniswamy
PIG-2924 added support for custom stats reader. But the
FileBasedOutputSizeReader only checks for
public static boolean isHDFSFileOrLocalOrS3N(String uri){
if(uri == null)
return false;
if(uri.startsWith("/") || uri.matches("[A-Za-z]:.*") ||
uri.startsWith("hdfs:")
|| uri.startsWith("viewfs:") || uri.startsWith("file:") ||
uri.startsWith("s3n:")) {
return true;
}
return false;
}
Better to change this to UriUtil.hasFileSystemImpl which will automatically
filter out hbase://. This would still not solve cases like HCatStorer which
does not have a scheme. Will also write a default stats reader that checks for
known StoreFuncInterface implementations that are not file based like
HCatStorer. More standard ones can be added later. AccumuloStorage should not
be a problem as it has scheme accumulo://.
--
This message was sent by Atlassian JIRA
(v6.2#6252)