keith-turner opened a new pull request, #5104:
URL: https://github.com/apache/accumulo/pull/5104

   Bulk imports can add files to a tablet faster than compactions can shrink 
the number of files.  The following are some of the situations that can cause 
this.
   
    * Compactors are all busy when new bulk imports arrive.
    * Many processes bulk import a few files to a single tablet at around the 
same time.
    * A single process bulk imports a lot of files to a single tablet.
   
   When a tablet has too many files, it can eventually cause cascading 
problems for compactions and scans.  This change adds two properties to help 
avoid this problem.
   
   The first property is `table.file.pause`.  It pauses bulk imports, and 
eventually minor compactions, when a tablet's current file count exceeds the 
value of this property.  The default is unlimited, so by default nothing is 
ever paused.
   
   The second property is `table.bulk.max.tablet.files`.  This property 
determines the maximum number of files a bulk import can add to a single 
tablet.  When this limit is exceeded, the bulk import operation fails without 
making changes to any tablets.
   
   Below is an example of how these properties behave.
   
    1. Set table.file.pause=30
    2. Set table.bulk.max.tablet.files=100
    3. Import 20 files into tablet A. Tablet A now has 20 files.
    4. Import 20 more files into tablet A. Tablet A now has 40 files.
    5. Import 20 files into tablet A. Because the tablet currently has 40 files 
and the pause limit is 30, this bulk import pauses.
    6. Tablet A compacts 10 files into 1, leaving it with 31 files. That is 
still above the pause limit, so the bulk import does not progress.
    7. Tablet A compacts 10 files into 1, leaving it with 22 files.
    8. The paused bulk import proceeds, bringing tablet A to 42 files.
    9. Import 200 files into tablet B and one file into tablet A.  This 
operation fails without changing tablet A or B because 200 exceeds the value 
of table.bulk.max.tablet.files.
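
The walkthrough above can be replayed with a small toy model. This is only a sketch and not Accumulo code: the class `BulkLimitsSketch` and its methods are hypothetical, it assumes each compaction writes a single output file, and it assumes the per-tablet maximum is checked before the pause limit (which matches step 9, where the import fails even though tablet A is above the pause limit).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the two limits described above; hypothetical, not Accumulo code. */
public class BulkLimitsSketch {
    enum Result { IMPORTED, PAUSED, FAILED }

    final int pauseLimit;   // stands in for table.file.pause
    final int maxBulkFiles; // stands in for table.bulk.max.tablet.files
    final Map<String, Integer> fileCounts = new LinkedHashMap<>();

    BulkLimitsSketch(int pauseLimit, int maxBulkFiles) {
        this.pauseLimit = pauseLimit;
        this.maxBulkFiles = maxBulkFiles;
    }

    /** Current number of files in a tablet (0 if never written to). */
    int files(String tablet) {
        return fileCounts.getOrDefault(tablet, 0);
    }

    /** Attempts one bulk import, given as tablet name -> number of new files. */
    Result bulkImport(Map<String, Integer> plan) {
        // Assumption: the per-tablet maximum is checked first, and exceeding it
        // fails the whole operation without changing any tablet.
        for (int newFiles : plan.values()) {
            if (newFiles > maxBulkFiles) {
                return Result.FAILED;
            }
        }
        // Pause while any destination tablet's current count exceeds the limit.
        for (String tablet : plan.keySet()) {
            if (files(tablet) > pauseLimit) {
                return Result.PAUSED;
            }
        }
        plan.forEach((tablet, newFiles) -> fileCounts.merge(tablet, newFiles, Integer::sum));
        return Result.IMPORTED;
    }

    /** Assumption: a compaction replaces n input files with one output file. */
    void compact(String tablet, int n) {
        fileCounts.put(tablet, files(tablet) - n + 1);
    }

    public static void main(String[] args) {
        BulkLimitsSketch s = new BulkLimitsSketch(30, 100);
        System.out.println(s.bulkImport(Map.of("A", 20)) + " A=" + s.files("A")); // step 3
        System.out.println(s.bulkImport(Map.of("A", 20)) + " A=" + s.files("A")); // step 4
        System.out.println(s.bulkImport(Map.of("A", 20)) + " A=" + s.files("A")); // step 5
        s.compact("A", 10);                                                       // step 6
        System.out.println(s.bulkImport(Map.of("A", 20)) + " A=" + s.files("A")); // retry
        s.compact("A", 10);                                                       // step 7
        System.out.println(s.bulkImport(Map.of("A", 20)) + " A=" + s.files("A")); // step 8
        System.out.println(s.bulkImport(Map.of("B", 200, "A", 1))
            + " A=" + s.files("A") + " B=" + s.files("B"));                       // step 9
    }
}
```

Note that the pause check looks at the tablet's current file count, not the count it would have after the import, which is why step 4 succeeds even though it takes tablet A past 30 files.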
   
   While making this change I ran into two preexisting problems.  The first 
was with bulk import setting time.  When multiple files were imported, the 
behavior of setting time was incorrect and inconsistent, depending on the 
table time type and on whether the tablet was hosted.  The code used to 
allocate a different number of timestamps in each of the four cases.  Now a 
single timestamp is allocated for all files in all cases, so the behavior is 
consistent for hosted and unhosted tablets and for both table time types.  The 
old behavior was causing tablet refresh to fail and these changes to fail, so 
I fixed it here since progress could not be made on these changes without 
fixing it.  The new test in this PR, which adds lots of files to a single 
tablet and requests that the bulk import set time, uncovered the problem.
   
   The second problem was that the existing code had handling for the case of 
a subset of files being added to a tablet by bulk import.  This should never 
happen because files are added via a mutation, and either the entire mutation 
goes through or nothing does.  Removed this handling for a subset and changed 
the code to throw an exception if a subset other than the empty set is seen.  
This change greatly simplified implementing this feature.
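
As an illustration of the all-or-nothing expectation described above, here is a minimal sketch. It is not the code in this PR: the class `AllOrNothingSketch`, the method `alreadyLoaded`, and the use of plain string file names are all hypothetical.

```java
import java.util.Set;

/** Toy all-or-nothing check; hypothetical names, not Accumulo code. */
public class AllOrNothingSketch {

    /**
     * Returns true if every requested file is already in the tablet and false
     * if none are. Throws if only some are, since a mutation is expected to
     * apply entirely or not at all.
     */
    static boolean alreadyLoaded(Set<String> requested, Set<String> inTablet) {
        long present = requested.stream().filter(inTablet::contains).count();
        if (present == 0) {
            return false; // empty subset: the mutation has not been applied
        }
        if (present == requested.size()) {
            return true; // full set: the mutation went through
        }
        throw new IllegalStateException(
            "Only " + present + " of " + requested.size() + " files present in tablet");
    }

    public static void main(String[] args) {
        Set<String> requested = Set.of("f1.rf", "f2.rf");
        System.out.println(alreadyLoaded(requested, Set.of("old.rf")));
        System.out.println(alreadyLoaded(requested, Set.of("old.rf", "f1.rf", "f2.rf")));
        try {
            alreadyLoaded(requested, Set.of("old.rf", "f1.rf"));
        } catch (IllegalStateException e) {
            System.out.println("partial subset rejected: " + e.getMessage());
        }
    }
}
```

Treating the partial case as an error rather than trying to recover from it means the calling code only has two states to reason about.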
   
   fixes #5023

