Zihao Ye has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21039 )

Change subject: IMPALA-12786: Optimize count(*) for JSON scans
......................................................................


Patch Set 10:

(5 comments)

Thanks for the review!

http://gerrit.cloudera.org:8080/#/c/21039/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21039/10//COMMIT_MSG@53
PS10, Line 53: 
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+--------+-----------+
> Nice improvement! Do you also get chance to test on nested data set, e.g. t
I haven't tested on nested datasets yet, mainly because there isn't a 
sufficiently large JSON-formatted nested dataset available. The load_nested.py 
file currently does not support generating JSON format. Do we have any other 
available scripts that can generate it?


http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/exec/json/json-parser-test.cc
File be/src/exec/json/json-parser-test.cc:

http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/exec/json/json-parser-test.cc@103
PS10, Line 103:     }
              :     else {
> nit: please put "else" to the line of "}"
Done


http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/exec/json/json-parser-test.cc@193
PS10, Line 193:   TestSkip(R"({"a":null, "b":[1,true,false]})", TYPE_OBJECT);
> Can we add more nested cases? E.g.
Done, and that will fail due to the nested object missing key names.


http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/exec/json/json-parser-test.cc@247
PS10, Line 247:     EXPECT_OK(js.Scan(max_rows, &num_rows));
> Can we also check row count from this is the same as the final 'row_count'?
Did you referring to checking whether the row count results obtained from 
Scan() and Count() are consistent? If so, the EXPECT_EQ(row_count, 
js.row_count()) below has already checked this.


http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/21039/10/be/src/service/query-options.cc@1259
PS10, Line 1259:         
query_options->__set_enable_tuple_cache(enable_tuple_cache);
> missing "break" after this
Done



--
To view, visit http://gerrit.cloudera.org:8080/21039
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I97ff097661c3c577aeafeeb1518408ce7a8a255e
Gerrit-Change-Number: 21039
Gerrit-PatchSet: 10
Gerrit-Owner: Zihao Ye <eyiz...@163.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Zihao Ye <eyiz...@163.com>
Gerrit-Comment-Date: Tue, 14 May 2024 08:00:06 +0000
Gerrit-HasComments: Yes

Reply via email to