Github user ctubbsii commented on the issue:
https://github.com/apache/accumulo/pull/253
I spoke with Matt, who wrote the original code for this PR, to get a sense
of his design intentions. He clarified that the overall goal was to try to
avoid `RandomVolumeChooser` as much as possible, no matter what went wrong
(typos, etc.) when a user mis-typed a per-table chooser class or the class path
was not set up correctly to find the user's chooser class.
Currently (without this PR), Accumulo has hard-coded defaults on
`RandomVolumeChooser` in several places. This can be problematic, because it
can result in filling up volumes reserved for certain tables or the WALs.
Further, it seems to select the `RandomVolumeChooser` mostly silently.
This PR changes things so that if the user has configured the
`PerTableVolumeChooser`, then when that fails to load a particular table's
chooser (`table.volume.chooser` from a `TableConfiguration`), it will not fall
back to the `RandomVolumeChooser`, but instead fall back to the system-wide
setting (`SystemConfiguration`) for `table.volume.chooser` (for example,
appearing in `accumulo-site.xml`, outside of any per-table context). If the
user did not specify a `table.volume.chooser`, then then the
`PerTableVolumeChooser` would simply fail. This ensures that
`RandomVolumeChooser` is never used accidentally when using the
`PerTableVolumeChooser`. A consequence of the change in this PR is that the
defaults have changed, but this will only affect users who were relying on
`general.volume.chooser` being the default and overriding
`table.volume.chooser` (or vice-versa).
The following is how I'm thinking it should probably work:
Use the `RandomVolumeChooser` as the new default for
`general.volume.chooser` and empty string (or null) for the default
`table.volume.chooser`, as they are in this PR (see last sentence of previous
paragraph). Document *VERY WELL* (especially in the release notes) that if the
`PerTableVolumeChooser` is required by the user (e.g. they were relying on it
being the default), then the user **must** set the `general.volume.chooser` to
the `PerTableVolumeChooser` **AND** must either set a default value for
`table.volume.chooser` in the system-wide configuration (system-wide ZK
settings, or site file) **OR** ensure `table.volume.chooser` is set for *every*
table (or namespace). If loading a per-table chooser or the general chooser
fails at any point, we simply throw an exception, rather than trying to find a
fall-back chooser to use.
If we simply fail, rather than trying to fall back to some system-wide
working configuration, we can still run into problems loading the class defined
at that level, and it's not clear that is what the user wishes to occur. So, we
should just fail instead. This would simplify this PR a bit.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---