ctubbsii edited a comment on issue #1973:
URL: https://github.com/apache/accumulo/issues/1973#issuecomment-859241448


   I was looking into this, and I think I was wrong about the preventative 
measure. It looks like it is already the case that none of these properties 
will have an effect until the tserver is restarted anyway. The only one that 
could affect things is enabling or disabling the use of the cache, and that is 
a per-table setting that should be possible to turn on and off at runtime.
   
   Looking into the stack trace a bit more, I started wondering if the 
`cacheId` wasn't unique. Unfortunately, the exceptions don't include the 
`blockName`, which they should.
   
   Additionally, I saw that at least in one case, `SummaryReader.load()`, there 
is no `cacheId` being set at all, so it should be `null`, which seems prone to 
error, since that's clearly not unique. The `CacheableBuilder` object looks 
like a builder pattern, but doesn't enforce required fields, and doesn't have 
any validation function to ensure it was used correctly, so `cacheId` could 
easily be set to `null`. However, even if that is a problem for 
`SummaryReader.load()` (@keith-turner implemented summaries, so he might have 
some insight into that), it doesn't seem like it would cause the stack trace 
reported in this issue, which doesn't mention summaries at all.
   
   If the problem is the `cacheId` is not unique (`null` or otherwise), then 
the block name looked up could have lots of collisions for the same offset (in 
`CacheableBlockFile.Reader.getMetaBlock()`, the block name is determined with 
`String _lookup = this.cacheId + "R" + offset;`).
   
   There are a few things that can/should be done:
   
   1. Improve the exception messages to include the `blockName` to make 
debugging this easier; @keith-ratcliffe is this issue reproducible? If so, what 
blockName is printed with the IllegalStateExceptions? (Edit: #2163)
   2. Ensure `CacheableBuilder` enforces the requirement for a non-null 
`cacheId` by adding a `build()` function with validation, like a true builder 
pattern (might not fix the problem) (Edit: #2160)
   3. Can document these properties as "fixed" properties in ZK (would help 
with documentation only... would not help with this bug, since the code that 
sets up the cache manager is only executed once at startup). (Edit: #2164)
   
   While digging into this, I noticed there were a few classes in the stack 
trace that I don't have the source to to track down further.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to