Accumulo Output Format can create numerous empty files
------------------------------------------------------
Key: ACCUMULO-55
URL: https://issues.apache.org/jira/browse/ACCUMULO-55
Project: Accumulo
Issue Type: Bug
Components: client
Affects Versions: 1.3.5, 1.4.0, 1.5.0
Reporter: John Vines
Assignee: John Vines
Fix For: 1.4.0
In conjunction with ACCUMULO-52, large numbers of empty files can cause
problems. In short, when a reducer receives no records because of the
partitioner used, its output file is still created. We do not want empty files
lingering around, and we especially do not want them bulk imported. The fix
should be as simple as either not creating the file until a write to it is
attempted (more complex) or deleting the file at close time if no records were
written (simpler, but with more overhead from creating and then deleting the
file).
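The first option (deferring file creation until the first write) might look
roughly like the sketch below. It uses plain java.nio files rather than the
Hadoop FileSystem API, and the class name LazyFileWriter and its methods are
hypothetical, not part of Accumulo or of any patch for this issue.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only: the file is not created until the first record arrives. */
final class LazyFileWriter implements AutoCloseable {
    private final Path path;
    private BufferedWriter out; // stays null until the first write

    LazyFileWriter(Path path) {
        this.path = path;
    }

    void write(String record) throws IOException {
        if (out == null) {
            // The first write creates the file; an idle writer never touches the disk.
            out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
        }
        out.write(record);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        if (out != null) {
            out.close();
        }
    }
}
```

The extra complexity is that every code path touching the stream has to handle
the case where it has not been opened yet.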
Due to the complexity of the patch, I do not think it should be applied to
versions before 1.4. The patch should simply delete the file after closing it
if nothing was written to it.
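A minimal sketch of the delete-on-close approach, assuming a simple record
counter, follows. It is written against java.nio rather than the Hadoop
FileSystem API, and DeleteIfEmptyWriter is a hypothetical name used only for
illustration, not the class touched by the actual patch.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only: removes its own file on close if nothing was written. */
final class DeleteIfEmptyWriter implements AutoCloseable {
    private final Path path;
    private final BufferedWriter out;
    private long recordsWritten = 0;

    DeleteIfEmptyWriter(Path path) throws IOException {
        this.path = path;
        // The file is created eagerly, matching the behavior described in this issue.
        this.out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
    }

    void write(String record) throws IOException {
        out.write(record);
        out.newLine();
        recordsWritten++;
    }

    @Override
    public void close() throws IOException {
        out.close();
        if (recordsWritten == 0) {
            // No records were written, so do not leave an empty file behind.
            Files.deleteIfExists(path);
        }
    }
}
```

This keeps the write path unchanged and confines the new logic to close(), at
the cost of one create and one delete for every empty output.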