Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
......................................................................


Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73: exceeds ![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
            : Therefore, in order to use skip scan performance benefits when possible and maintain a consistent performance with
> I think it's the number of rows in the CFileSet, which I think is also the
You are right! How about rewording this to "rows in tablet"?


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS5, Line 9: index skip scan (a.k.
> It's great that you found another reference to the same idea in the google'
Sounds good. Done.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13
PS5, Line 13: Let's b
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS5, Line 40:  an option
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38
PS5, Line 38:
            : Instead, a full table scan is done by default. Other databases may optimize such scans by building secondary indexes
            : (though it might be redundant to build one on one of the
> Let's stick with a single concrete example, say `tstamp`. Then we can point
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41
PS5, Line 41: given its lack of secondary index support.
            : 
            : The question is, can Kudu do better than a full table scan here?
            :
> nit: I think this would read better after L45. E.g.
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS5, Line 47: in the in
> nit: since this is a concrete example, we know there is only one column bef
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50
PS5, Line 50:
            : {% highlight SQL %}
> nit: reword as "to **skip** to the rows that have distinct prefix keys, and
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: ce, this metho
> nit: "Kudu tablet" or "tablet server" or "Kudu"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: as **skip scan optimization**[2-3].
            :
            : Performance
> Maybe reverse the order of **skip** and **scan**, since the name is "skip s
Done. You are correct; I have rephrased the sentence to clarify this point.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70
PS5, Line 70:
> nit: add "the" in front of "Lower" and "better"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71
PS5, Line 71: nts, on up to 10 million rows per t
> I seem to recall a plot that showed the performance without the dynamic dis
That's a good point. Unfortunately, I do not have a backup of that slide.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77:
> nit: skips? for consistency with the "skip" and "scan" terminology
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77: It will be an in
> nit: I think it's clear enough that this may refer to multiple, so maybe ju
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89
PS5, Line 89:
> nit: one (`host`)
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104
PS5, Line 104: [[1
> Do you feel good about adding one more reference?  I think https://www.sqli
Thanks so much for this suggestion. I have added this reference too.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105
PS5, Line 105: Geo-
> nit: usually in the reference section they use '[x]' where it's possible to
Thank you for pointing this out. Done.



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Comment-Date: Sun, 09 Sep 2018 19:07:51 +0000
Gerrit-HasComments: Yes
