Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11395 )

Change subject: KUDU-686 (part 2/2): use DeltaPreparer in DeltaFileIterator
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@7
PS2, Line 7: use DeltaPreparer
> Can you talk more about how this changes the logic? Or do you view this as
It's a refactor at heart, though it (obviously) affects the division of labor 
between PrepareBatch() and ApplyUpdates(). I mentioned that below where I talk 
about flame graphs. At the end of the day the DeltaFileIterator still does the 
same thing it always did: read deltas from a file, decode them, and apply the 
appropriate ones during a scan. It just does it more efficiently.

I actually don't want to mention diff scans in these two patches because they 
stand on their own merits: they improve scan performance when projecting 
multiple columns by reducing CPU load incurred via unnecessary delta decoding. 
Bringing diff scans into this will just muddy the waters. Instead, I'll tie 
back to these patches in the diff scan patch series.


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@9
PS2, Line 9: leverage DeltaPreparer
> why? just code reuse? get rid of Todd's TODO in the code?
See the JIRA and below: this refactor improves performance by getting rid of 
the decoding-heavy approach taken by the "visitor" pattern previously used by 
DeltaFileIterator.


http://gerrit.cloudera.org:8080/#/c/11395/2//COMMIT_MSG@25
PS2, Line 25: entralize as
> Can you be more specific about what we think these improvements are?
The rest of the sentence explains: it does away with a bunch of unnecessary 
delta decoding. There's more information in the JIRA.

I'll add some more color here but the commit message is already quite detailed 
(by my standards) and in the interest of brevity I don't want to duplicate too 
much from the JIRA.


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.h@258
PS2, Line 258: re irrelevan
> nit: doc this parameter
Done


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/delta_store.cc@57
PS2, Line 57: // Returns whether a mutation at 'ts' is relevant under 'snap'.
> Mind documenting the interface to this helper? It's not obvious what the me
Done


http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc
File src/kudu/tablet/deltamemstore.cc:

http://gerrit.cloudera.org:8080/#/c/11395/2/src/kudu/tablet/deltamemstore.cc@248
PS2, Line 248:       // TODO(adar): is this correct?
> Mind adding a comment with the thought process behind this bookkeeping / cl
Not sure what you mean. This isn't bookkeeping or cleanup; it's an optimization 
to skip any additional deltas for this row once we've established that we're 
done with the row.

I added the TODO because this seemed like a brain-dead optimization that should 
have been done previously, so I was wondering if perhaps it's an incorrect 
thing to do. See 
https://gerrit.cloudera.org/c/11394/2/src/kudu/tablet/deltamemstore.cc#b268 for 
the old code here.



--
To view, visit http://gerrit.cloudera.org:8080/11395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87de52092262c4b42c1bd5107f9139edfc3888b5
Gerrit-Change-Number: 11395
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 12 Sep 2018 19:40:18 +0000
Gerrit-HasComments: Yes

Reply via email to