Sergey Shelukhin created HBASE-8340:
---------------------------------------

             Summary: WAL log compression handling of seeks seems to be either 
inefficient or incorrect
                 Key: HBASE-8340
                 URL: https://issues.apache.org/jira/browse/HBASE-8340
             Project: HBase
          Issue Type: Bug
            Reporter: Sergey Shelukhin


In next(...):
{code}
    if (compressionContext != null && emptyCompressionContext) {
      emptyCompressionContext = false;
    }
    return ...
{code}
 
In seek():
{code}
    if (compressionContext != null && emptyCompressionContext) {
      while (next() != null) {
        if (getPosition() == pos) {
          emptyCompressionContext = false;
          break;
        }
      }
...
reader.seek(pos);
{code}

So seek() will seek the file directly (via reader.seek) as soon as either 
next() or seek() has been called at least once before.

I am not sure what this code is for, but my best guess is that it populates 
the dictionary used for compression.
If so, one next() call (or even one seek() call) would not seem to be enough: 
seek() must always go through next(), otherwise it is incorrect.
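To illustrate the concern, here is a minimal sketch of a dictionary-compressed 
log. It is a toy model, not HBase's actual WAL format: ToyDictionaryLog, Entry, 
Reader, rawSeek and replaySeek are all hypothetical names, and positions are 
entry indexes rather than byte offsets. It assumes the dictionary is filled only 
as a side effect of next(), which is my guess about the real code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a dictionary-compressed log; NOT HBase's actual WAL format. */
final class ToyDictionaryLog {
    /** An entry either introduces a new value or refers to an earlier one by index. */
    static final class Entry {
        final String literal; // non-null: new value, appended to the dictionary
        final int ref;        // used when literal == null: index into the dictionary
        Entry(String literal, int ref) { this.literal = literal; this.ref = ref; }
        static Entry literal(String s) { return new Entry(s, -1); }
        static Entry ref(int i) { return new Entry(null, i); }
    }

    /** Reader whose dictionary is populated only as a side effect of next(). */
    static final class Reader {
        private final List<Entry> entries;
        private final List<String> dictionary = new ArrayList<>();
        private int pos = 0;

        Reader(List<Entry> entries) { this.entries = entries; }

        /** Decodes the entry at the current position, or returns null at end of log. */
        String next() {
            if (pos >= entries.size()) return null;
            Entry e = entries.get(pos++);
            if (e.literal != null) {
                dictionary.add(e.literal);
                return e.literal;
            }
            // A reference is decodable only if its literal was read earlier.
            return e.ref < dictionary.size() ? dictionary.get(e.ref) : "<undecodable>";
        }

        /** Raw seek: moves the position without touching the dictionary. */
        void rawSeek(int target) { pos = target; }

        /** Replaying seek: rewinds and calls next() for every entry before target. */
        void replaySeek(int target) {
            pos = 0;
            dictionary.clear();
            while (pos < target && next() != null) { }
        }
    }

    static List<Entry> sampleLog() {
        List<Entry> log = new ArrayList<>();
        log.add(Entry.literal("rowA")); // becomes dictionary[0]
        log.add(Entry.literal("rowB")); // becomes dictionary[1]
        log.add(Entry.ref(1));          // refers to "rowB"
        return log;
    }

    public static void main(String[] args) {
        Reader r = new Reader(sampleLog());
        r.next();                      // a single next() populates only dictionary[0]
        r.rawSeek(2);
        System.out.println(r.next());  // reference to dictionary[1] cannot be resolved

        Reader r2 = new Reader(sampleLog());
        r2.replaySeek(2);
        System.out.println(r2.next()); // dictionary is complete up to position 2
    }
}
```

In this model a single next() followed by a raw seek leaves the dictionary 
incomplete, while replaying through next() does not. Whether the real reader 
behaves the same way depends on how its compression context is populated.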

If we instead assume that one next() is enough to make reader.seek safe, as 
the current code seems to imply, then there is no need for the first seek to 
call next() in a loop; it can call next() once and then do reader.seek.

Note: even if all of this works fine in practice because external callers 
create the object, do one seek before any next() calls, and never seek 
afterwards (the only bug-free pattern currently possible when both methods are 
used, if I'm not mistaken), the code still needs to be tightened to remove the 
potential for bugs.
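One possible way to tighten this, sketched under the assumption that the 
compression context must reflect exactly the entries before the current 
position: drop the one-shot flag and have seek() always replay through next(), 
rewinding only when the target is behind the current position. 
ReplayingSeekReader and its members are hypothetical names, positions are entry 
indexes rather than byte offsets, and the list called context is only a 
stand-in for the real compression context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical reader whose context always matches its position; not HBase's API. */
final class ReplayingSeekReader {
    private final List<String> records;                     // stand-in for the file
    private final List<String> context = new ArrayList<>(); // stand-in for the compression context
    private int pos = 0;                                    // index of the next record to read

    ReplayingSeekReader(List<String> records) { this.records = records; }

    /** Reads one record and feeds it to the context; returns null at end of file. */
    String next() {
        if (pos >= records.size()) return null;
        String record = records.get(pos++);
        context.add(record);
        return record;
    }

    /** Seeks by replaying records, so the context never skips an entry. */
    void seek(int target) {
        if (target < 0 || target > records.size()) {
            throw new IllegalArgumentException("bad position: " + target);
        }
        if (target < pos) {
            // Backward seek: start over so the context holds nothing past target.
            context.clear();
            pos = 0;
        }
        while (pos < target) {
            next();
        }
    }

    int position() { return pos; }

    int contextSize() { return context.size(); }

    public static void main(String[] args) {
        ReplayingSeekReader r = new ReplayingSeekReader(Arrays.asList("e0", "e1", "e2", "e3"));
        r.seek(3);
        System.out.println(r.contextSize() + " " + r.next());
        r.seek(1);
        System.out.println(r.contextSize() + " " + r.next());
    }
}
```

This trades speed for safety: every seek costs a replay, but any mix of next() 
and seek() calls leaves the context consistent with the position.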



