ctubbsii commented on PR #3511:
URL: https://github.com/apache/accumulo/pull/3511#issuecomment-1601646574

   Need to be sure this validation method only checks prefixes for filenames 
that we've stored... and this code path doesn't get executed when a user 
provides their own file names as input (to rfile-info or as a file that is 
being bulk imported, for example), because users can name their RFiles 
anything they like.
   
   I'm also concerned generally about doing this... for a long time, when HDFS 
or another issue caused an RFile to be corrupt or missing, we have recommended 
users perform metadata surgery to place an RFile directly. I can imagine other 
scenarios where a user has done that for surgical/maintenance reasons (like 
maybe manually compacting some files that are inconvenient for them to compact 
using our built-in compactors), and may have their own naming convention for 
the file they place (and for which they add an entry to the metadata table).
   
   This validation could break things for users who have done that... and not 
for any great reason. These file names are just conventions... they aren't a 
strict requirement. By default, Accumulo should work regardless of the file 
name, and these names would only matter for custom compaction strategies, 
trash policies, etc. that the user chose to deploy.
   
   Maybe a warning could be logged for unexpected name prefixes instead of a 
hard failure?
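
   As a rough sketch of the warn-instead-of-fail idea (not the code in this 
PR): the class name `RFileNameCheck`, both method names, and the set of 
single-letter prefixes below are assumptions made up for illustration, and 
`java.util.logging` stands in for whatever logger the real code uses.

```java
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of Accumulo.
final class RFileNameCheck {
  private static final Logger LOG = Logger.getLogger(RFileNameCheck.class.getName());

  // Assumed set of conventional single-letter prefixes (illustrative, not authoritative).
  private static final Set<Character> EXPECTED_PREFIXES = Set.of('A', 'C', 'F', 'I', 'M');

  // Returns true when the name is non-empty and starts with one of the assumed prefixes.
  static boolean hasExpectedPrefix(String fileName) {
    return fileName != null && !fileName.isEmpty()
        && EXPECTED_PREFIXES.contains(fileName.charAt(0));
  }

  // Logs a warning for an unexpected prefix but never throws, so a file that
  // a user placed by hand under their own naming convention stays usable.
  static boolean warnIfUnexpected(String fileName) {
    boolean expected = hasExpectedPrefix(fileName);
    if (!expected) {
      LOG.warning("Unexpected file name prefix (continuing anyway): " + fileName);
    }
    return expected;
  }

  public static void main(String[] args) {
    warnIfUnexpected("C00001ab.rf");        // conventional name: no warning
    warnIfUnexpected("restored-by-hand.rf"); // user-chosen name: warning only
  }
}
```

   Returning a boolean rather than throwing leaves it to the caller to decide 
whether an unexpected prefix matters, which would only be the case for things 
like custom compaction strategies that rely on the convention.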

