Accumulo Output Format can create numerous empty files
------------------------------------------------------

                 Key: ACCUMULO-55
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-55
             Project: Accumulo
          Issue Type: Bug
          Components: client
    Affects Versions: 1.3.5, 1.4.0, 1.5.0
            Reporter: John Vines
            Assignee: John Vines
             Fix For: 1.4.0


In conjunction with ACCUMULO-52, large numbers of empty files can cause 
problems. In short, when a reducer receives no data because of the partitioner 
used, its output file is still created. We do not want empty files lingering 
around, and we especially do not want them bulk imported. The fix should be as 
simple as either not creating the file until a write to it is attempted (more 
complex) or deleting the file at close time if no records were written 
(simpler, but with more overhead from creating and then deleting the file).
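
As a rough illustration of the first option (deferring file creation until 
the first write), here is a minimal sketch. It is not the Accumulo code: the 
class name LazyRecordWriter and its write(key, value) method are hypothetical, 
and it uses local files through java.nio rather than the Hadoop filesystem 
API so that it stays self-contained.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Accumulo: the output file is only
// created when the first record is written.
final class LazyRecordWriter implements AutoCloseable {
    private final Path path;
    private BufferedWriter out; // stays null until the first record arrives

    LazyRecordWriter(Path path) {
        this.path = path;
    }

    void write(String key, String value) throws IOException {
        if (out == null) {
            // First write: only now does the file come into existence.
            out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
        }
        out.write(key);
        out.write('\t');
        out.write(value);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        // An empty reducer never opened the stream, so there is nothing
        // to close and no file on disk.
        if (out != null) {
            out.close();
        }
    }
}
```

The extra complexity comes from every code path that touches the stream 
having to cope with it not being open yet.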

Due to the complexity of the patch, I do not think it should be applied to 
versions before 1.4. The patch should simply delete the file after closing it 
if nothing was written to it.
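
The delete-at-close behaviour could look roughly like the sketch below. This 
is an assumption-laden illustration rather than the actual patch: the class 
name DeleteIfEmptyWriter and the recordsWritten counter are hypothetical, and 
local java.nio files stand in for the Hadoop filesystem API.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Accumulo: the file is created up front
// and removed again at close time if no record was ever written.
final class DeleteIfEmptyWriter implements AutoCloseable {
    private final Path path;
    private final BufferedWriter out;
    private long recordsWritten = 0;

    DeleteIfEmptyWriter(Path path) throws IOException {
        this.path = path;
        // Eager creation: the file exists as soon as the writer does.
        this.out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
    }

    void write(String key, String value) throws IOException {
        out.write(key);
        out.write('\t');
        out.write(value);
        out.newLine();
        recordsWritten++;
    }

    @Override
    public void close() throws IOException {
        out.close();
        if (recordsWritten == 0) {
            // Nothing was written, so do not leave an empty file behind.
            Files.deleteIfExists(path);
        }
    }
}
```

The write path stays untouched apart from the counter, at the cost of one 
file creation and one deletion for each empty reducer.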


        
