Rohini Palaniswamy created PIG-3891:
---------------------------------------
Summary: FileBasedOutputSizeReader does not calculate size of
files in sub-directories
Key: PIG-3891
URL: https://issues.apache.org/jira/browse/PIG-3891
Project: Pig
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Rohini Palaniswamy
FileBasedOutputSizeReader only counts files in the top-level output
directory. If files are stored under subdirectories (e.g. by MultiStorage),
the reported number of bytes written is incorrect.
0.11 reports the correct total number of bytes written, so this is a regression.
A quick look at the code shows that JobStats.addOneOutputStats() in 0.11
also does not iterate recursively, and its code is the same as
FileBasedOutputSizeReader. We need to investigate where the correct value comes
from in 0.11 and fix this in 0.12.1/0.13.
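
To illustrate the difference between a top-level and a recursive size
calculation, here is a minimal standalone sketch. It is not Pig's code: the
class and method names (OutputSizeSketch, topLevelSize, recursiveSize) are
hypothetical, and it uses java.nio.file on a local directory as a stand-in
for the Hadoop filesystem API that Pig works against, so that it runs
without any dependencies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of Pig.
final class OutputSizeSketch {

    // Sums only the regular files directly under dir. Files in
    // subdirectories are ignored, which mirrors the behavior reported here.
    static long topLevelSize(Path dir) throws IOException {
        try (Stream<Path> children = Files.list(dir)) {
            return children.filter(Files::isRegularFile)
                           .mapToLong(OutputSizeSketch::sizeOf)
                           .sum();
        }
    }

    // Sums the regular files at every depth under dir, so output written
    // into subdirectories is included in the total.
    static long recursiveSize(Path dir) throws IOException {
        try (Stream<Path> tree = Files.walk(dir)) {
            return tree.filter(Files::isRegularFile)
                       .mapToLong(OutputSizeSketch::sizeOf)
                       .sum();
        }
    }

    // Wraps the checked IOException so the method can be used in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For an output directory holding one 10-byte file at the top level and one
25-byte file in a subdirectory, topLevelSize returns 10 while recursiveSize
returns 35. An actual fix would presumably need the equivalent traversal
through the filesystem abstraction Pig uses.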
--
This message was sent by Atlassian JIRA
(v6.2#6252)