[ https://issues.apache.org/jira/browse/KUDU-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299194#comment-16299194 ]
Dan Burkert commented on KUDU-2243: ----------------------------------- > I'm on board with that but then I think we should also rename the internal > "blocks" of data to "pages" – ie have PageDecoder, PageEncoder, etc, to make > a more clear distinction between a "column block" and a page of data within > that column block. The current {{BlockBuilder}} and {{BlockDecoder}} classes appear to wrap an entire cblock, though. Any finer-grained organization internally isn't exposed in the interface. I may be misunderstanding what you mean - could you point out a specific item that could be renamed? > I seem to recall that this design was meant to allow the output of the Scan > to reference memory held by the prepared blocks. ie in the case of a > PLAIN-encoded string column, it could just have pointers directly to the data > in the block, rather than copying them into any output arena. Interesting. I think this may actually be happening, for instance I don't see anyplace in [{{PlainBlockDecoder::CopyNextValues}}|https://github.com/apache/kudu/blob/master/src/kudu/cfile/plain_block.h#L192-L209] which is copying the data. One thing I can't figure out is why cfile_reader has the {{needs_rewind_}} field, and all the checks that go along with it. I guess it's so you can call {{Scan}} multiple times in a row without seek/prepping, but that doesn't seem like something we'd actually want to do in the normal case of a scan. > CFile Reader improvements > ------------------------- > > Key: KUDU-2243 > URL: https://issues.apache.org/jira/browse/KUDU-2243 > Project: Kudu > Issue Type: Improvement > Components: cfile > Affects Versions: 1.6.0 > Reporter: Dan Burkert > > I've done a pretty thorough review of all the CFile reader code over the last > few days in order to make a targeted bug fix, and I've got some ideas for how > we can simplify it. I'd like to get others thoughts. > * To reduce confusion between CFile data blocks and FS manager blocks, I > think we should change all references in code and docs of CFile data blocks > to 'cblock'. > * Much of the complexity of the CFileIterator is due to it's complex public > API, which requires separate {{Seek(idx) -> Prepare(nrows) -> Scan(output > buf, predicates)}} calls. Additionally, the Prepare step can materialize > many blocks, which then need to be put in a queue. I think all of this could > be simplified by changing the API to be {{Seek(idx) -> Scan(nrows, output > buf, predicates)}}, and have the CFile iterator only cache the > most-recently-materialized block (instead of the queue). For really big scan > batches, this will change the internal scan/materialize pattern from > materializing all cblocks up front then copying, to materializing and copying > of cblocks being interleaved. Since in most cases cblocks are usually much > bigger (256kib) than scan batches (100 cells), I think it won't actually lead > to measurably different behavior. > * {{QueueCurrentDataBlock}} and {{ReadCurrentDataBlock}} should drop > {{Current}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)