-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/574/#review872
-----------------------------------------------------------

@Pranav: Ryan is reviewing your v3. He knows hfile best. Should be up soon.

- stack


On 2010-08-10 17:58:43, Pranav Khaitan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/574/
> -----------------------------------------------------------
>
> (Updated 2010-08-10 17:58:43)
>
>
> Review request for hbase, stack, Jonathan Gray, Ryan Rawson, Karthik
> Ranganathan, and Kannan Muthukkaruppan.
>
>
> Summary
> -------
>
> What this patch includes:
> 1. Reseek framework. The ability to reseek to any position after having
> seeked to some point in the file. To add this utility, changes were required
> in all scanners.
> 2. The option for any filter to tell the scanner which key it wants to go to
> next. Filters can be easily customized for different use-cases without
> affecting the main read path. Since filters are optional, they do not add
> any overhead for users who do not use them.
> 3. ColumnPrefixFilter: This filter selects keys whose columns have a
> specified prefix. The filter takes advantage of the ability to pass keys to
> the scanner to tell it which key it should seek to next.
> 4. This also gives the option to seek directly to the required columns using
> the reseek mechanism (HBASE-2450). However, it needs to be decided whether
> that feature should be made optional using a filter or added to the read
> path to be used by everyone. It is not included in this patch since it
> requires further discussion and testing.
> 5. Small changes to ScanQueryMatcher to return more specific return codes.
>
> For HFile and reseek, the modifications were done after discussions with Ryan,
> who also wrote some code for this patch. For ScanQueryMatcher and
> Filters, discussions were held with Jonathan, Karthik and Kannan.
>
> This is big as it touches 21 files.
> It is important to closely review the
> reseek functions in HFile, StoreFileScanner, KeyValueHeap and
> HalfStoreFileReader, as these functions are slightly tricky and will probably
> be used in many future improvements.
>
>
> This addresses bugs HBASE-1517, HBASE-2903 and HBASE-2904.
> http://issues.apache.org/jira/browse/HBASE-1517
> http://issues.apache.org/jira/browse/HBASE-2903
> http://issues.apache.org/jira/browse/HBASE-2904
>
>
> Diffs
> -----
>
> trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java PRE-CREATION
> trunk/src/main/java/org/apache/hadoop/hbase/filter/Filter.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MinorCompactingStoreScanner.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 983321
> trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 983321
> trunk/src/test/java/org/apache/hadoop/hbase/filter/TestColumnPrefixFilter.java PRE-CREATION
> trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java PRE-CREATION
> trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java 983321
> trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java 983321
> trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 983321
>
> Diff: http://review.cloudera.org/r/574/diff
>
>
> Testing
> -------
>
> Added tests at HFileScanner and Filter/RegionScanner levels. These tests take
> very little time to run. All existing tests pass successfully.
> Performance benchmarking was done and showed significant gains for the
> corresponding use-cases.
>
>
> Thanks,
>
> Pranav
>
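
For readers following along, here is a rough, self-contained sketch of the idea in items 1 to 3 of the summary: a forward-only reseek on a sorted scanner, and a prefix filter that hands the scanner a key to jump to. It is not the code in the patch. Every name in it (SeekHintSketch, Verdict, HintingFilter, PrefixFilter, SortedScanner, ScanResult) is hypothetical, keys are plain Strings rather than KeyValues, and reseek is modeled as a binary search over an in-memory list rather than an HFile block lookup.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these types are HBase classes.
final class SeekHintSketch {

    /** What a filter tells the scanner to do with the current key. */
    public enum Verdict { INCLUDE, SKIP, SEEK_TO_HINT, DONE }

    /** A filter that may name the next key it is interested in. */
    public interface HintingFilter {
        Verdict check(String key);
        /** Only consulted after check() returned SEEK_TO_HINT. */
        String nextKeyHint(String key);
    }

    /** Selects keys that start with a given prefix. */
    public static final class PrefixFilter implements HintingFilter {
        private final String prefix;

        public PrefixFilter(String prefix) { this.prefix = prefix; }

        @Override public Verdict check(String key) {
            if (key.startsWith(prefix)) return Verdict.INCLUDE;
            // Key sorts before the prefix: ask the scanner to jump ahead.
            if (key.compareTo(prefix) < 0) return Verdict.SEEK_TO_HINT;
            // Key sorts after every key with this prefix: nothing left to find.
            return Verdict.DONE;
        }

        @Override public String nextKeyHint(String key) { return prefix; }
    }

    /** Forward-only scanner over an already sorted list of keys. */
    public static final class SortedScanner {
        private final List<String> keys;
        private int pos = 0;

        public SortedScanner(List<String> sortedKeys) { this.keys = sortedKeys; }

        public boolean hasCurrent() { return pos < keys.size(); }
        public String current() { return keys.get(pos); }
        public int position() { return pos; }
        public void next() { pos++; }

        /** Moves to the first key >= target at or after the current position. */
        public void reseek(String target) {
            int lo = pos, hi = keys.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (keys.get(mid).compareTo(target) < 0) lo = mid + 1; else hi = mid;
            }
            pos = lo;
        }
    }

    /** Matched keys plus the number of keys handed to the filter. */
    public static final class ScanResult {
        public final List<String> matched = new ArrayList<>();
        public int keysChecked = 0;
    }

    public static ScanResult scan(List<String> sortedKeys, HintingFilter filter) {
        ScanResult result = new ScanResult();
        SortedScanner scanner = new SortedScanner(sortedKeys);
        while (scanner.hasCurrent()) {
            String key = scanner.current();
            result.keysChecked++;
            switch (filter.check(key)) {
                case INCLUDE:
                    result.matched.add(key);
                    scanner.next();
                    break;
                case SKIP:
                    scanner.next();
                    break;
                case SEEK_TO_HINT: {
                    int before = scanner.position();
                    scanner.reseek(filter.nextKeyHint(key));
                    // Guard against a hint that does not move the scanner forward.
                    if (scanner.position() == before) scanner.next();
                    break;
                }
                case DONE:
                    return result;
            }
        }
        return result;
    }
}
```

With the sorted keys a1, a2, a3, b1, b2, c1, c2 and the prefix "b", the scan in this sketch checks a1, reseeks straight to b1, includes b1 and b2, and stops at c1, so the filter sees four keys instead of seven.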
