[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9857 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Additionally, build ASAN and ran the unit test. Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Reviewed-on: http://gerrit.cloudera.org:8080/9857 Tested-by: Impala Public Jenkins Reviewed-by: Zach Amsden --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 172 insertions(+), 86 deletions(-) Approvals: Impala Public Jenkins: Verified Zach Amsden: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9857 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 3: Code-Review+2 Yes, I did - forgot to carry the +2 after rebase -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 05 Apr 2018 18:27:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6776: Increase region move timeout.
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9892 ) Change subject: IMPALA-6776: Increase region move timeout. .. Patch Set 1: Code-Review+2 Seems fine; it seems weirdly coincidental that so far all of the failed moves have happened on the first part of the keyspace. -- To view, visit http://gerrit.cloudera.org:8080/9892 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic62719f1b1aad463bcdc18d0803e780ebb0f8b18 Gerrit-Change-Number: 9892 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Apr 2018 00:22:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9857 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9857/1/be/src/exec/delimited-text-parser-test.cc File be/src/exec/delimited-text-parser-test.cc: http://gerrit.cloudera.org:8080/#/c/9857/1/be/src/exec/delimited-text-parser-test.cc@161 PS1, Line 161: nul_delim_parser > would it make sense to call this nul_column_parser (analogous to nul_field_ Done -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 29 Mar 2018 23:31:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9857 to look at the new patch set (#2). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Additionally, build ASAN and ran the unit test. Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 172 insertions(+), 86 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/9857/2 -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9857 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 1: Here's the diff highlighting the issue: diff --git a/be/src/exec/delimited-text-parser-test.cc b/be/src/exec/delimited-text-parser-test.cc index d8e977d..7df9f06 100644 --- a/be/src/exec/delimited-text-parser-test.cc +++ b/be/src/exec/delimited-text-parser-test.cc @@ -150,16 +150,17 @@ TEST(DelimitedTextParser, SpecialDelimiters) { const char TUPLE_DELIM = '\n'; // implies '\r' and "\r\n" are also delimiters const char NUL_DELIM = '\0'; const int NUM_COLS = 1; + const int MAX_COLS = 2; - bool is_materialized_col[NUM_COLS]; - for (int i = 0; i < NUM_COLS; ++i) is_materialized_col[i] = true; + bool is_materialized_col[MAX_COLS]; + for (int i = 0; i < MAX_COLS; ++i) is_materialized_col[i] = true; TupleDelimitedTextParser tuple_delim_parser(NUM_COLS, 0, is_materialized_col, TUPLE_DELIM); TupleDelimitedTextParser nul_delim_parser(NUM_COLS, 0, is_materialized_col, NUL_DELIM); - TupleDelimitedTextParser nul_field_parser(2, 0, is_materialized_col, + TupleDelimitedTextParser nul_field_parser(MAX_COLS, 0, is_materialized_col, TUPLE_DELIM, NUL_DELIM); // Non-SSE case -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 29 Mar 2018 21:35:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9857 Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Additionally, build ASAN and ran the unit test. Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 172 insertions(+), 86 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/9857/1 -- To view, visit http://gerrit.cloudera.org:8080/9857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2190c57681f29f34ee1eb393e30dfdda5839098c Gerrit-Change-Number: 9857 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] Revert "IMPALA-6389: Make '\0' delimited text files work"
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9851 ) Change subject: Revert "IMPALA-6389: Make '\0' delimited text files work" .. Patch Set 2: Strict revert; verified to fix build. -- To view, visit http://gerrit.cloudera.org:8080/9851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If581311033de8c26e33316b19192c4579594f261 Gerrit-Change-Number: 9851 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 29 Mar 2018 05:00:27 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6389: Make '\0' delimited text files work"
Zach Amsden has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9851 ) Change subject: Revert "IMPALA-6389: Make '\0' delimited text files work" .. Revert "IMPALA-6389: Make '\0' delimited text files work" This reverts commit c2bdaf8af4cf35d3462595c2a341ed84dcf5d960. An ASAN issue and potentially other problem have been found; reverting to unbreak the build and tests. Change-Id: If581311033de8c26e33316b19192c4579594f261 Reviewed-on: http://gerrit.cloudera.org:8080/9851 Reviewed-by: Lars Volker Tested-by: Zach Amsden --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 84 insertions(+), 169 deletions(-) Approvals: Lars Volker: Looks good to me, approved Zach Amsden: Verified -- To view, visit http://gerrit.cloudera.org:8080/9851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If581311033de8c26e33316b19192c4579594f261 Gerrit-Change-Number: 9851 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] Revert "IMPALA-6389: Make '\0' delimited text files work"
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9851 ) Change subject: Revert "IMPALA-6389: Make '\0' delimited text files work" .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If581311033de8c26e33316b19192c4579594f261 Gerrit-Change-Number: 9851 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 29 Mar 2018 04:59:43 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6389: Make '\0' delimited text files work"
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9851 Change subject: Revert "IMPALA-6389: Make '\0' delimited text files work" .. Revert "IMPALA-6389: Make '\0' delimited text files work" This reverts commit c2bdaf8af4cf35d3462595c2a341ed84dcf5d960. An ASAN issue and potentially other problem have been found; reverting to unbreak the build and tests. Change-Id: If581311033de8c26e33316b19192c4579594f261 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 84 insertions(+), 169 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/9851/1 -- To view, visit http://gerrit.cloudera.org:8080/9851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If581311033de8c26e33316b19192c4579594f261 Gerrit-Change-Number: 9851 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 6: I think so; I'll let it run through GVO first. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 6 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 27 Mar 2018 19:32:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6670: refresh lib-cache entries from plan
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9697 ) Change subject: IMPALA-6670: refresh lib-cache entries from plan .. Patch Set 21: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Gerrit-Change-Number: 9697 Gerrit-PatchSet: 21 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 23 Mar 2018 03:25:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 4: Sigh. Release build failed. Fixed now. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 23 Mar 2018 01:59:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9525 to look at the new patch set (#6). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 169 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/6 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 6 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9525 to look at the new patch set (#5). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 169 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/5 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6670: refresh lib-cache entries from plan
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9697 ) Change subject: IMPALA-6670: refresh lib-cache entries from plan .. Patch Set 20: (2 comments) http://gerrit.cloudera.org:8080/#/c/9697/20/be/src/runtime/lib-cache.cc File be/src/runtime/lib-cache.cc: http://gerrit.cloudera.org:8080/#/c/9697/20/be/src/runtime/lib-cache.cc@410 PS20, Line 410: if (status.ok() && exp_mtime >= 0 && fs_last_modified_time != exp_mtime) { Would it be simpler to hoist this above line 404, then just set: status = Status(TErrorCode::LIB_VERSION_MISMATCH, ...) And let the !status.ok() body do all the work. http://gerrit.cloudera.org:8080/#/c/9697/20/fe/src/main/java/org/apache/impala/catalog/Function.java File fe/src/main/java/org/apache/impala/catalog/Function.java: http://gerrit.cloudera.org:8080/#/c/9697/20/fe/src/main/java/org/apache/impala/catalog/Function.java@57 PS20, Line 57: private final static Logger LOG = LoggerFactory.getLogger(Function.class); 1 more Logger here. -- To view, visit http://gerrit.cloudera.org:8080/9697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Gerrit-Change-Number: 9697 Gerrit-PatchSet: 20 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 23 Mar 2018 01:24:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6670: refresh lib-cache entries from plan
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9697 ) Change subject: IMPALA-6670: refresh lib-cache entries from plan .. Patch Set 19: (1 comment) http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.cc File be/src/runtime/lib-cache.cc: http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.cc@411 PS19, Line 411: return Status(TErrorCode::LIB_VERSION_MISMATCH, hdfs_lib_file, > Should you also RemoveEntryInternal in this case? Vuk and I discussed this and I think this would be the right thing to do in the event of a backwards leap of mtime; such a thing would be extremely unusual, but could happen in the event the HDFS namenode gets restored from a backup or fails and is manually replaced with a spare that has stale data. -- To view, visit http://gerrit.cloudera.org:8080/9697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Gerrit-Change-Number: 9697 Gerrit-PatchSet: 19 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 23 Mar 2018 00:40:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6670: refresh lib-cache entries from plan
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9697 ) Change subject: IMPALA-6670: refresh lib-cache entries from plan .. Patch Set 19: (10 comments) http://gerrit.cloudera.org:8080/#/c/9697/17/be/src/exec/external-data-source-executor.cc File be/src/exec/external-data-source-executor.cc: http://gerrit.cloudera.org:8080/#/c/9697/17/be/src/exec/external-data-source-executor.cc@140 PS17, Line 140: // TODO: pass the mtime from the coordinator. for now, skip the mtime check (-1). Is there a JIRA for this? If not, please create one and reference it here. http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/exec/external-data-source-executor.cc File be/src/exec/external-data-source-executor.cc: http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/exec/external-data-source-executor.cc@140 PS19, Line 140: // TODO: pass the mtime from the coordinator. for now, skip the mtime check (-1). Please file a JIRA for this and note it in the comment. http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.h File be/src/runtime/lib-cache.h: http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.h@174 PS19, Line 174: /// The entry is refreshed if it needs_refresh is set and its mtime is 'if it' -> 'if' http://gerrit.cloudera.org:8080/#/c/9697/17/be/src/runtime/lib-cache.h File be/src/runtime/lib-cache.h: http://gerrit.cloudera.org:8080/#/c/9697/17/be/src/runtime/lib-cache.h@172 PS17, Line 172: /// need to be refreshed. 'if it' -> 'if' http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.cc File be/src/runtime/lib-cache.cc: http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/runtime/lib-cache.cc@411 PS19, Line 411: return Status(TErrorCode::LIB_VERSION_MISMATCH, hdfs_lib_file, Should you also RemoveEntryInternal in this case? http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/9697/19/be/src/service/fe-support.cc@401 PS19, Line 401: // TODO: used for external data sources; add proper mtime. Same comment for JIRA http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/analysis/Expr.java@615 PS19, Line 615: thriftFn.setLast_modified_time(fn_.getLastModifiedTime()); If doing this unconditionally, why not just do this in Function's toThrift() method? http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java: http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@40 PS19, Line 40: private final static Logger LOG = LoggerFactory.getLogger(AggregateFunction.class); Appears unused - is this left over from testing? If so, delete the imports as well. http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java File fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java: http://gerrit.cloudera.org:8080/#/c/9697/19/fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java@47 PS19, Line 47: private final static Logger LOG = LoggerFactory.getLogger(ScalarFunction.class); Either I'm confused about how this gets used, or I'm seeing a lot of extra Logger stuff lying around. http://gerrit.cloudera.org:8080/#/c/9697/19/testdata/bin/copy-udfs-udas.sh File testdata/bin/copy-udfs-udas.sh: http://gerrit.cloudera.org:8080/#/c/9697/19/testdata/bin/copy-udfs-udas.sh@64 PS19, Line 64:> src/main/java/org/apache/impala/NewReplaceStringUdf.java This is a pretty ugly hack; does this mean you can't run the test directly without also manually performing this step? -- To view, visit http://gerrit.cloudera.org:8080/9697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Gerrit-Change-Number: 9697 Gerrit-PatchSet: 19 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 22 Mar 2018 22:47:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9525 to look at the new patch set (#4). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 167 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/4 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.h File be/src/exec/delimited-text-parser.h: http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.h@174 PS3, Line 174: field_delim_ != tuple_delim_ > hmm, okay. this seems a bit weird to me to handle cases (and have to think I'm not super happy about this, as I found the can of worms during testing. The tests now exercise pretty much all of the cases, both valid and invalid. I did not extend the tests to validate sequence file parsing (we have no tests for this). Mostly, I am concerned with just not crashing and having well defined behavior in all cases, no matter what, even if we get nonsensical metadata for the file. http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.cc File be/src/exec/delimited-text-parser.cc: http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.cc@182 PS3, Line 182: if (DELIMITED_TUPLES) unfinished_tuple_ = false; > I don't think we need to code it explicitly. My comment about DCHECK is to Done -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 21 Mar 2018 20:54:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.h File be/src/exec/delimited-text-parser.h: http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.h@174 PS3, Line 174: field_delim_ != tuple_delim_ > is it allowed to have field_delim_ same as tuple_delim_? what's the behavio It is not allowed to have tables created like this by the frontend. It is possible, although unlikely that tables like this already exist. I encountered this behavior in the test, when the default arguments for the constructor created such a parser, so it is tested. I thought the sanest behavior was to only recognize a field delimiter when it was not equal to a tuple delimiter, and tuple delimiters were enabled. Before this, the behavior was nuts, it would not recognize tuples, and instead treat the entire row as an infinite set of columns. http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.h@175 PS3, Line 175: ) > this would be a little easier to read without the extraneous parens around Oh yes, it is just a bunch of OR clauses; we can remove these. http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.cc File be/src/exec/delimited-text-parser.cc: http://gerrit.cloudera.org:8080/#/c/9525/3/be/src/exec/delimited-text-parser.cc@182 PS3, Line 182: if (DELIMITED_TUPLES) unfinished_tuple_ = false; > can we even get here if !DELIMITED_TUPLES? If not, maybe DCHECK(DELIMINTED Nope; I was just eliding all references to unfinished_tuple_ in the class variant that had DELIMITED_TUPLES as false. I would hope in this case the compiler can help out by proving that new_tuple is always false, but we can help it out by coding it explicitly if you think that would be better. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 21 Mar 2018 17:00:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9525 to look at the new patch set (#3). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 167 insertions(+), 85 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/3 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h File be/src/exec/delimited-text-parser.inline.h: http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h@163 PS1, Line 163: tuple_delim_ || (tuple_delim_ > I don't follow that. When DELIMITED_TUPLES is false, the comment says tuple I agree the logic here is kind of broken; if tuple_delim_ == field_delim_, this isn't correct for !DELIMITED_TUPLES; however, in practice tuple_delim_ can not be equal to field_delim_ except in the unit test; this is prohibited by the frontend. To compound the matter, the only place this matters, in hdfs-sequence-scanner.cc never checks this value, as by general design, the sequence scanner always knows where the tuple ends and never has to deal with unfinished tuples. This makes me a bit loathe to make the logic here more complicated. A good compromise might be to make this also include DELIMITED_TUPLES as a conjunct, and then add a DCHECK in HasUnfinishedTuples() that DELIMITED_TUPLES is true. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 15 Mar 2018 18:34:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser-test.cc File be/src/exec/delimited-text-parser-test.cc: http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser-test.cc@160 PS2, Line 160: TupleDelimitedTextParser nul_delim_parser(NUM_COLS, 0, is_materialized_col, NUL_DELIM); > should we also test nul tuple delim with a column delim? if not, why is tha Will add a test for this. http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h File be/src/exec/delimited-text-parser.h: http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@22 PS2, Line 22: #include Will delete this http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@41 PS2, Line 41: thme > them Done http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@55 PS2, Line 55: char field_delim_ = '\0', char collection_item_delim = '^', : char escape_char = '\0'); > do these default argument values still have special meanings? can the delim The only place these have interesting meanings is the unit test. I don't think the defaults are very useful, but I also don't think removing them and adding a bunch of extra unexplained fields to the unit test is helpful either. http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@171 PS2, Line 171: bool IsFieldOrCollectionItemDelimiter(char c) { > comment for this -- it's surprisingly non-trivial to understand what this i Will add a comment. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 15 Mar 2018 18:17:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9525 to look at the new patch set (#2). Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work Initially I didn't want to fully implement this, as the metadata for these tables can't even be fully stored in Postgres; however after digging into some older documentation, it appears that the ASCII NUL character actually has been used as a field separator in various vendors CSV implementation. Therefore, this patch attempts to make things as non-broken as possible and allows \0 as a field or tuple delimiter. Collection column delimiters are not allowed to be \0, as they genuinly may not exist and we don't want to force special escaping on an arbitrary character. Note that the field delimiter must be distinct from the tuple delimiter when they both exist; if it is not, the effect will be that there is no field delimiter (this is actually possible with single column tables). Testing: Created a zero delimited table as described in the JIRA, using MySQL backed Hive metastore; ran select * from tab_separated on the table, updated the unit test. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 8 files changed, 147 insertions(+), 78 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/2 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 ) Change subject: IMPALA-6389: Make '\0' delimited text files work .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.h File be/src/exec/delimited-text-parser.h: http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.h@51 PS1, Line 51: tuple_delim > what should be passed in here when !delimited_tuples? let's at least note Done http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.h@96 PS1, Line 96: /// This function is disabled for non-sequence file parsing. : template : typename std::enable_if::type > why do we need that? Not necessary. Initially I thought there was more split functionality in this class than there actually is and I wanted to conditionally get rid of functions that were invalid based on delimited_tuples. I think a DCHECK would be more than sufficient. http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.h@222 PS1, Line 222: tuple_delim_ > how about saying that's valid only when delimited_tuples is true? Done http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h File be/src/exec/delimited-text-parser.inline.h: http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h@55 PS1, Line 55: b > why isn't that delimited_tuples? is it something different will fix http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h@163 PS1, Line 163: tuple_delim_ || (tuple_delim_ > do those need to be guarded? Subtly, no. Any delimiter of '\0' will not be included in the xmm_delim_search field, so the corresponding bit will never be set. http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h@235 PS1, Line 235: enabled > what's that about? An extra fake template parameter defaulting to true is used in the class definition to eliminating this function by utilizing SFINAE because it isn't legal to use fail substitution on definitions only depending on a class template parameter. -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 08 Mar 2018 21:19:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6500: gracefully handle invalid sched getcpu() values
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9544 ) Change subject: IMPALA-6500: gracefully handle invalid sched_getcpu() values .. Patch Set 2: Code-Review+2 (1 comment) Looks good, up to you if you think we could reduce spamminess. http://gerrit.cloudera.org:8080/#/c/9544/2/be/src/util/cpu-info.cc File be/src/util/cpu-info.cc: http://gerrit.cloudera.org:8080/#/c/9544/2/be/src/util/cpu-info.cc@298 PS2, Line 298: const int MAX_WARNINGS = 100; I think we could cut this down to say 20 or so; if it happens, it is likely to immediately repeat and flood logs. -- To view, visit http://gerrit.cloudera.org:8080/9544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a118953c175b0aa13762ad11dccce01a4c70032 Gerrit-Change-Number: 9544 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 08 Mar 2018 20:25:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9525 Change subject: IMPALA-6389: Make '\0' delimited text files work .. IMPALA-6389: Make '\0' delimited text files work This is conflated a bit with the support work required to get MySQL to run in the mini-cluster to support HMS, as Postgresql simply rejects these table creations outright. The basic idea is to specialize the class on whether row/tuple delimiters are actually used, as the use case for sequence and text files is quite different. Unfortunately, that came out a bit more ugly than I would have liked, as the class template parameter infected all definitions. Testing: Created a zero delimited table as described in the JIRA, ran select * from tab_separated on the table. Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 --- M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M bin/create-test-configuration.sh M bin/impala-config.sh M buildall.sh M fe/pom.xml M testdata/bin/run-sentry-service.sh 12 files changed, 136 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/1 -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6230, IMPALA-6468: Fix the output type of round() and related fns
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9346 ) Change subject: IMPALA-6230, IMPALA-6468: Fix the output type of round() and related fns .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/9346/1/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/9346/1/be/src/exprs/math-functions-ir.cc@120 PS1, Line 120: return DoubleVal(trunc( > Do we have a test that shows the better behavior for trunc() versus the pre pow(10.0, scale.val) is likely expensive to compute and it isn't clear that the compiler will know the function is "pure". Should we instead do a table lookup and bounds check to make sure the scale parameter is sane? http://gerrit.cloudera.org:8080/#/c/9346/1/be/src/exprs/math-functions-ir.cc@121 PS1, Line 121: v.val * pow(10.0, scale.val) + ((v.val < 0) ? -0.5 : 0.5)) / pow(10.0, scale.val)); > Should we also check for overflows here , set FunctionContext with an error Can you undo the code movement here so that the change is more clear? http://gerrit.cloudera.org:8080/#/c/9346/1/be/src/exprs/math-functions.h File be/src/exprs/math-functions.h: http://gerrit.cloudera.org:8080/#/c/9346/1/be/src/exprs/math-functions.h@80 PS1, Line 80: static DoubleVal RoundUpTo(FunctionContext*, const DoubleVal&, const BigIntVal&); Why the ordering change? Also, why allow BigIntVal range here, isn't that just asking for trouble with overflow? -- To view, visit http://gerrit.cloudera.org:8080/9346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77541678012edab70b182378b11ca8753be53f97 Gerrit-Change-Number: 9346 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Mon, 05 Mar 2018 20:16:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6405: Error when string to decimal cast overflows
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9339 ) Change subject: IMPALA-6405: Error when string to decimal cast overflows .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db Gerrit-Change-Number: 9339 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 05 Mar 2018 19:46:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6405: Error when string to decimal cast overflows
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9339 ) Change subject: IMPALA-6405: Error when string to decimal cast overflows .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9339/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9339/1//COMMIT_MSG@10 PS1, Line 10: a decimal, a MULL was returned. In this patch, we change this behavior NULL http://gerrit.cloudera.org:8080/#/c/9339/1/fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java File fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java: http://gerrit.cloudera.org:8080/#/c/9339/1/fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java@85 PS1, Line 85: if (resultOfReverseCast != null && It seems weird (at least to me) that LiteralExpr.create would return null instead of NullLiteral, and probably simpler to just make this the case even if errors are raised during the cast. It also seems strange that we would only hit this with decimal. Don't other casts also raise errors? -- To view, visit http://gerrit.cloudera.org:8080/9339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db Gerrit-Change-Number: 9339 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 15 Feb 2018 23:19:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/9192/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9192/4//COMMIT_MSG@23 PS4, Line 23: CAUSED BY: UnsupportedOperationException: Eternal Data source table not supported. Eternal Data source? http://gerrit.cloudera.org:8080/#/c/9192/4//COMMIT_MSG@27 PS4, Line 27: For the most part, I deleted the unused code. In a few places, a renamed I renamed http://gerrit.cloudera.org:8080/#/c/9192/4/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/9192/4/common/thrift/CatalogObjects.thrift@38 PS4, Line 38: DEPRECATED_DATA_SOURCE, // removed in Impala 3.0 I'm not sure I see the point of keeping this around instead of just totally deleting it. http://gerrit.cloudera.org:8080/#/c/9192/4/common/thrift/CatalogObjects.thrift@48 PS4, Line 48: DEPRECATED_DATA_SOURCE_TABLE, // removed in Impala 3.0 Same comment. http://gerrit.cloudera.org:8080/#/c/9192/4/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/9192/4/common/thrift/PlanNodes.thrift@43 PS4, Line 43: DEPRECATED_DATA_SOURCE_NODE, // removed in Impala 3.0 Again, why do we need this? -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 4 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 05 Feb 2018 19:01:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6429: Fix decimal division
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9114 ) Change subject: IMPALA-6429: Fix decimal division .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h File be/src/util/decimal-util.h: http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h@58 PS3, Line 58: return MultiplyByScale(v, t.scale, false, nullptr); > SFINAE is why it compiled. Slight correction - I think it's even simpler than that - the template never even got instantiated. Please make sure that all the be tests still compile - some of them were doing weird things and creating their own ColumnTypes iirc. -- To view, visit http://gerrit.cloudera.org:8080/9114 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1075d9c78986cd975dd29c1125d71ba6560c23 Gerrit-Change-Number: 9114 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 01 Feb 2018 19:02:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6429: Fix decimal division
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9114 ) Change subject: IMPALA-6429: Fix decimal division .. Patch Set 4: Code-Review+1 Dan, Tim, any last comments? -- To view, visit http://gerrit.cloudera.org:8080/9114 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1075d9c78986cd975dd29c1125d71ba6560c23 Gerrit-Change-Number: 9114 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 01 Feb 2018 18:55:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6429: Fix decimal division
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9114 ) Change subject: IMPALA-6429: Fix decimal division .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h File be/src/util/decimal-util.h: http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h@58 PS3, Line 58: return MultiplyByScale(v, t.scale, false, nullptr); > Good catch. I have no idea. I'm not sure how it even got compiled. It looks SFINAE is why it compiled. -- To view, visit http://gerrit.cloudera.org:8080/9114 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1075d9c78986cd975dd29c1125d71ba6560c23 Gerrit-Change-Number: 9114 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 01 Feb 2018 18:54:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6429: Fix decimal division
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9114 ) Change subject: IMPALA-6429: Fix decimal division .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/exprs/expr-test.cc@2877 PS3, Line 2877: EXPECT_TRUE((-scaled_up_dividend) / scale_multiplier != I'm not sure how valuable this EXPECT_TRUE is - haven't we already verified that with the if condition? http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/runtime/decimal-value.inline.h@161 PS3, Line 161: return floor_log2[scale_by] + 1; Why not add the +1 into the floor function? Then you can actually omit it for the case of zero, where the formula is conservative (no increase in bits multiplying by 1). http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/runtime/decimal-value.inline.h@175 PS3, Line 175: int num_occupied = 128 - BitUtil::CountLeadingZeros(abs(num)); An optimization idea for later. Curious how efficient the abs() implementation is for int128_t. Given that we have to do a lot of these abs() transformations, might it make sense for us to do that up front for large multiply and divide, do everything in unsigned math, and correct the result after? This also simplifies getting the rounding correct. http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h File be/src/util/decimal-util.h: http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h@54 PS3, Line 54: /// TODO: do we need to handle overflow here or at a higher abstraction. TODO looks to be now done :) http://gerrit.cloudera.org:8080/#/c/9114/3/be/src/util/decimal-util.h@58 PS3, Line 58: return MultiplyByScale(v, t.scale, false, nullptr); Which function is this actually calling? -- To view, visit http://gerrit.cloudera.org:8080/9114 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1075d9c78986cd975dd29c1125d71ba6560c23 Gerrit-Change-Number: 9114 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 31 Jan 2018 20:00:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Zach Amsden has abandoned this change. ( http://gerrit.cloudera.org:8080/9124 ) Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. Abandoned Looks like the Hive change is going to be reverted; no longer necessary. -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9124 ) Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@450 PS1, Line 450: // the original table name, and if that fails, try looking for the > As Dimitris points out, this would be clearer if we said: I can't do this. HBASE_TABLE_NAME is defined in a Hive class which is thoroughly wedged into its public API. I can only rename HBASE_TABLE_NAME_COMPAT - and since I can't rename the first, I'd prefer to follow the naming convention and omit the _KEY. I agree it is a better name though. -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 25 Jan 2018 20:36:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9124 ) Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/9124/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9124/1//COMMIT_MSG@9 PS1, Line 9: Upstream Hive has changed the property name used to specify > I think also HBase? Can we pull in the JIRA numbers in the corresponding pr Done http://gerrit.cloudera.org:8080/#/c/9124/1//COMMIT_MSG@10 PS1, Line 10: HBase table names. It is relatively straitforward to make > nit: straightforward Done http://gerrit.cloudera.org:8080/#/c/9124/1//COMMIT_MSG@14 PS1, Line 14: Testing: With an Impala built with a local override to use > The documentation in docs/topics/impala_hbase.xml refers "hbase.table.name" I think I can "magically" fix this. We're going to need the property name in an environment variable anyway because tests. This should do it: # Make the HBASE_TABLE_NAME available to tests and scripts export HBASE_TABLE_NAME_PROPERTY=\ `javap -constants org.apache.hadoop.hive.hbase.HBaseSerDe | grep HBASE_TABLE_NAME |\ sed 's/[^"]*"\([^"]*\).*/\1/'` http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@450 PS1, Line 450: // the original table name, and if that fails, try looking for the > As Dimitris points out, this would be clearer if we said: Will use HBASE_TABLE_NAME_KEY -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 25 Jan 2018 00:00:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Hello Philip Zeyliger, Dimitris Tsirogiannis, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9124 to look at the new patch set (#2). Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. IMPALA-6440: Backwards compatibility for HBase metadata Upstream Hive has changed the property name used to specify HBase table names. It is relatively straitforward to make this backwards compatible, no matter what version of Hive Impala is linked with. Testing: With an Impala built with a local override to use latest Hive-2.1.1, perform a full data load, then: create external table test_compute_stats like functional_hbase.alltypes; select * from test_compute_stats; Before, this failed with TableNotFound, now this succeeds. Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b --- M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java 1 file changed, 15 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/9124/2 -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9124 ) Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@117 PS1, Line 117: for backwards compatibility with the original spec. > Actually, I would mention here which was the latest version to support this Done http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@449 PS1, Line 449: try : // the original table name, and if that fails, try looking for the : // original table name property in SERDEPROPERTIES. > I am a bit confused with the terminology. Is HBaseSerDe.HBASE_TABLE_NAME th The original table name, and the name that used to be stored in SERDEPROPERTIES is "hbase.table.name". HBaseSerDe.HBASE_TABLE_NAME is the new table name, which varies depending on which version of HBase we link with (and varies even with versions due to late breaking changes). http://gerrit.cloudera.org:8080/#/c/9124/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@452 PS1, Line 452: look for > nit: we're not looking for it, we simply construct it, no? Done -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 24 Jan 2018 23:29:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6440: Backwards compatibility for HBase metadata
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9124 Change subject: IMPALA-6440: Backwards compatibility for HBase metadata .. IMPALA-6440: Backwards compatibility for HBase metadata Upstream Hive has changed the property name used to specify HBase table names. It is relatively straitforward to make this backwards compatible, no matter what version of Hive Impala is linked with. Testing: With an Impala built with a local override to use latest Hive-2.1.1, perform a full data load, then: create external table test_compute_stats like functional_hbase.alltypes; select * from test_compute_stats; Before, this failed with TableNotFound, now this succeeds. Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b --- M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java 1 file changed, 14 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/9124/1 -- To view, visit http://gerrit.cloudera.org:8080/9124 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id7c5b62f38be275762771ecf39cf898f06f8138b Gerrit-Change-Number: 9124 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6429: Fix decimal division
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9114 ) Change subject: IMPALA-6429: Fix decimal division .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/exprs/expr-test.cc@2400 PS1, Line 2400: { true, false, 0, 38, 6 }}}, Suggest also adding another test with full precision decimal(38,3) / decimal(1,0), which will overflow in V2 since the result can't fit in decimal(38,6). Even the value 1.0 will work as a divisor for this. http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/runtime/decimal-value.inline.h@235 PS1, Line 235: return DecimalUtil::SafeMultiply(left, mult, *overflow) + right; This is going to check *overflow with a branch anyway. Why not hoist the check of *overflow to this function? Then if (UNLIKELY(*overflow == true,)) return garbage, and always call DecimalUtil::SafeMultiply with false. http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/runtime/decimal-value.inline.h@288 PS1, Line 288: return DecimalUtil::SafeMultiply(left, mult, *overflow) + right; Same comment applies. http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/runtime/decimal-value.inline.h@457 PS1, Line 457: bool may_overflow = scale_by > 38; You could add a leading zero check as well, but that requires knowing the maximum value of scale_by. This adds one more corner case that relies on the exact formula for result_scale, and that makes me uncomfortable. In the future, we may want to improve decimal results by allowing the query to specify the output scale (as a cast, which gets pushed down into result_scale), and doing that is already a bit dangerous as I think there are other places that rely on the result scale formulas being predefined. http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/util/decimal-util.h File be/src/util/decimal-util.h: http://gerrit.cloudera.org:8080/#/c/9114/1/be/src/util/decimal-util.h@67 PS1, Line 67: if (UNLIKELY(result / multiplier != v)) *overflow = true; You can infer !may_overflow from overflow == nullptr, and this should inline just as well in cases where constant nullptr is passed. (Edit - I see the way you are using it, this isn't quite true - I think if you adopt my suggestions for Add / SubtractLarge, then this may follow). Also, this is likely expensive for int256_t. I don't think there is a practical way to make that faster without direct access to the underlying type (to get the number of leading zeroes). -- To view, visit http://gerrit.cloudera.org:8080/9114 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibd1075d9c78986cd975dd29c1125d71ba6560c23 Gerrit-Change-Number: 9114 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 24 Jan 2018 18:16:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6420: Fix TestCharFormats for local filesystem tests
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9074 ) Change subject: IMPALA-6420: Fix TestCharFormats for local filesystem tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9074/2/tests/query_test/test_chars.py File tests/query_test/test_chars.py: http://gerrit.cloudera.org:8080/#/c/9074/2/tests/query_test/test_chars.py@93 PS2, Line 93: def __create_char_tables(self): > Can we rework this to use unique_database instead? These tables should not I think since this relies on an external file populated in an earlier data loading phase, that for now, we need an absolute path instead of a unique one. Very open to suggestions. If we could move the data load into this test, we should be able to fix that and just use a unique database, but this is very awkward with the existing testdata/workloads, testdata/datasets, tests separation. The dataset involved here is quite small, so it is doable, but if we are going to attempt that, we should probably approach it in terms of a more unified strategy instead of just throwing datasets directly into the tests. -- To view, visit http://gerrit.cloudera.org:8080/9074 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I916b5566a3c5a83185e502b814ad43b5af14e87f Gerrit-Change-Number: 9074 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 19 Jan 2018 20:02:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4924: Enable Decimal V2 by default
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9062 ) Change subject: IMPALA-4924: Enable Decimal V2 by default .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/9062/5/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9062/5/be/src/exprs/expr-test.cc@7688 PS5, Line 7688: ColumnType::CreateDecimalType(15,2)); > my understanding is we need to executor_->PopExecOption(); at the end of th Yes, same with below. We should clean this up and make an RAII class WithExecOption so this is automatic when leaving scope. (Not in this diff) http://gerrit.cloudera.org:8080/#/c/9062/5/testdata/workloads/functional-query/queries/QueryTest/uda.test File testdata/workloads/functional-query/queries/QueryTest/uda.test: http://gerrit.cloudera.org:8080/#/c/9062/5/testdata/workloads/functional-query/queries/QueryTest/uda.test@85 PS5, Line 85: select agg_decimal_intermediate(cast(c3 as decimal(2,1)), 2), count(*) going on blind faith here http://gerrit.cloudera.org:8080/#/c/9062/5/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/9062/5/tests/hs2/test_hs2.py@443 PS5, Line 443: assert "Invalid base64 string; input length is 3, which is not a multiple of 4" in log ok, we're just testing get_log here http://gerrit.cloudera.org:8080/#/c/9062/5/tests/query_test/test_decimal_casting.py File tests/query_test/test_decimal_casting.py: http://gerrit.cloudera.org:8080/#/c/9062/5/tests/query_test/test_decimal_casting.py@126 PS5, Line 126: val, precision, vector.get_value('cast_from')).format(val, precision, scale) Unnecessary change. Yes it is nicer, but it could cause more merge conflicts going forward. -- To view, visit http://gerrit.cloudera.org:8080/9062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibbdd05bf986b7947f106b396017faa3a0bd87fd7 Gerrit-Change-Number: 9062 Gerrit-PatchSet: 5 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 19 Jan 2018 18:17:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4924: Enable Decimal V2 by default
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9062 ) Change subject: IMPALA-4924: Enable Decimal V2 by default .. Patch Set 4: Code-Review+1 Despite trying my best to poke holes, this looks good to me. -- To view, visit http://gerrit.cloudera.org:8080/9062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibbdd05bf986b7947f106b396017faa3a0bd87fd7 Gerrit-Change-Number: 9062 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 19 Jan 2018 07:12:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4924: Enable Decimal V2 by default
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9062 ) Change subject: IMPALA-4924: Enable Decimal V2 by default .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/9062/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9062/4/be/src/exprs/expr-test.cc@5017 PS4, Line 5017: TestValue("cast(-12345.345 as double) % cast(7 as double)", nit: could be on same line http://gerrit.cloudera.org:8080/#/c/9062/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java: http://gerrit.cloudera.org:8080/#/c/9062/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@2276 PS4, Line 2276: ScalarType.createDecimalType(15, 5), ScalarType.createDecimalType(16, 5)); Although we have agreed upon these rules, I must still register my objection to this useless precision upgrade. http://gerrit.cloudera.org:8080/#/c/9062/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@2345 PS4, Line 2345: ScalarType.createDecimalType(15, 5), ScalarType.createDecimalType(16, 5)); Similarly. http://gerrit.cloudera.org:8080/#/c/9062/4/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/9062/4/testdata/workloads/functional-query/queries/QueryTest/exprs.test@a75 PS4, Line 75: ??!! did this even run before? http://gerrit.cloudera.org:8080/#/c/9062/4/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/9062/4/tests/hs2/test_hs2.py@442 PS4, Line 442: log = self.get_log("select base64decode('foo')") what? This looks like a completely different test. Merge conflict or something? http://gerrit.cloudera.org:8080/#/c/9062/4/tests/query_test/test_decimal_casting.py File tests/query_test/test_decimal_casting.py: http://gerrit.cloudera.org:8080/#/c/9062/4/tests/query_test/test_decimal_casting.py@84 PS4, Line 84: if cast_from == 'string': Should we have a comment that this is deprecating decimal_v1 behavior testing? -- To view, visit http://gerrit.cloudera.org:8080/9062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibbdd05bf986b7947f106b396017faa3a0bd87fd7 Gerrit-Change-Number: 9062 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 19 Jan 2018 07:02:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6420: Fix TestCharFormats for local filesystem tests
Hello Michael Brown, Philip Zeyliger, David Knupp, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9074 to look at the new patch set (#2). Change subject: IMPALA-6420: Fix TestCharFormats for local filesystem tests .. IMPALA-6420: Fix TestCharFormats for local filesystem tests Tl;dr - running tests was covering up a bug in the tests, or alternatively, not running tests exposes a bug. Local tests were failing because we failed to prepend the local filesystem prefix. This was covered up because the snapshot creation job runs tests, which actually creates these tables in the metastore. When running local filesystem tests with a snapshot, the table creation never runs because the snapshot import converts hdfs:// locations into local file locations. The problematic CREATE TABLE x IF NOT EXISTS then never get run, hiding the problem. Testing: With a local filesystem install, in Impala shell: drop table functional_parquet.chars_formats and then: tests/run-tests.py query_test/test_chars.py -k char_format Change-Id: I916b5566a3c5a83185e502b814ad43b5af14e87f --- M tests/query_test/test_chars.py 1 file changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/9074/2 -- To view, visit http://gerrit.cloudera.org:8080/9074 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I916b5566a3c5a83185e502b814ad43b5af14e87f Gerrit-Change-Number: 9074 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6420: Fix TestCharFormats for local filesystem tests
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9074 Change subject: IMPALA-6420: Fix TestCharFormats for local filesystem tests .. IMPALA-6420: Fix TestCharFormats for local filesystem tests Tl;dr - running tests was covering up a bug in the tests, or alternatively, not running tests exposes a bug. Local tests were failing because we failed to prepend the local filesystem prefix. This was covered up because the snapshot creation job runs tests, which actually creates these tables in the metastore. When running local filesystem tests with a snapshot, the table creation never runs because the snapshot import converts hdfs:// locations into local file locations. The problematic CREATE TABLE x IF NOT EXISTS then never get run, hiding the problem. Change-Id: I916b5566a3c5a83185e502b814ad43b5af14e87f --- M tests/query_test/test_chars.py 1 file changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/9074/1 -- To view, visit http://gerrit.cloudera.org:8080/9074 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I916b5566a3c5a83185e502b814ad43b5af14e87f Gerrit-Change-Number: 9074 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-4924: Enable Decimal V2 by default
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9062 ) Change subject: IMPALA-4924: Enable Decimal V2 by default .. Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test@1989 PS1, Line 1989: QUERY With CATCH and QUERY clauses moving around, this is very hard to review. http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test File testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test@99 PS1, Line 99: redundant space http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test File testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test@4 PS1, Line 4: extra line http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test@270 PS1, Line 270: TYPES please move back to line 263 http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test@673 PS1, Line 673: This last line seems changed in every file, can we eliminate that? http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test File testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test@194 PS1, Line 194: RESULTS move back below TYPES http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/uda.test File testdata/workloads/functional-query/queries/QueryTest/uda.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/uda.test@98 PS1, Line 98:agg_decimal_intermediate(cast(1.1 as decimal(2,1)), 2), why was this query modified? http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/uda.test@119 PS1, Line 119:agg_decimal_intermediate(cast(1.1 as decimal(2,1)), 2), also modified for some reason http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test File testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@14 PS1, Line 14: TYPES please move back to l12 http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/tpcds/queries/tpcds-q53.test File testdata/workloads/tpcds/queries/tpcds-q53.test: http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/tpcds/queries/tpcds-q53.test@50 PS1, Line 50: 827,320.05,356.59 Long line http://gerrit.cloudera.org:8080/#/c/9062/1/testdata/workloads/tpcds/queries/tpcds-q53.test@112 PS1, Line 112: 99,393.58,438.667500 Long line -- To view, visit http://gerrit.cloudera.org:8080/9062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibbdd05bf986b7947f106b396017faa3a0bd87fd7 Gerrit-Change-Number: 9062 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 19 Jan 2018 01:29:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5478: Run TPCDS queries with decimal v2 enabled
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8985 ) Change subject: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled .. Patch Set 2: Is there any easy way to get a diff between this and the non-v2 query output? -- To view, visit http://gerrit.cloudera.org:8080/8985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib867c51a521ec4a087bc127d99aee4b95ba97733 Gerrit-Change-Number: 8985 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 12 Jan 2018 21:34:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components
Zach Amsden has abandoned this change. ( http://gerrit.cloudera.org:8080/7581 ) Change subject: IMPALA-5764: Allow overriding packaged components .. Abandoned Abandoning ancient thing. Now that we're almost settled with the build / toolchain / branch structure, we'll do something better. -- To view, visit http://gerrit.cloudera.org:8080/7581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545 Gerrit-Change-Number: 7581 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6386: Invalidate metadata at table level for dataload
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9009 ) Change subject: IMPALA-6386: Invalidate metadata at table level for dataload .. Patch Set 2: Code-Review+2 (2 comments) Looks good, I have only minor comments. http://gerrit.cloudera.org:8080/#/c/9009/2/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/9009/2/testdata/bin/generate-schema-statements.py@492 PS2, Line 492: impala_output = Statements() Maybe s/impala_output/impala_create/g http://gerrit.cloudera.org:8080/#/c/9009/2/testdata/bin/generate-schema-statements.py@702 PS2, Line 702: hive_output.write_to_file('load-' + output_name + '-hive-generated.sql') Ideally these names should match as well (hive_load, hbase_load), but it is easy to dig a hole messing around with a 200+ line function. Up to you. -- To view, visit http://gerrit.cloudera.org:8080/9009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibc3a6d8a674a0bf6b02069bfe8a5e12034335b1f Gerrit-Change-Number: 9009 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 12 Jan 2018 21:29:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6386: Invalidate metadata at table level for dataload
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9009 ) Change subject: IMPALA-6386: Invalidate metadata at table level for dataload .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9009/1/bin/load-data.py File bin/load-data.py: http://gerrit.cloudera.org:8080/#/c/9009/1/bin/load-data.py@318 PS1, Line 318: create_filename = 'load-%s-impala-generated' % load_file_substr > The '%s-impala-generated' currently matches files like 'load-functional-que This isn't your fault, but it seems particularly fragile that we have to generate identical filenames in multiple pieces of python code. -- To view, visit http://gerrit.cloudera.org:8080/9009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibc3a6d8a674a0bf6b02069bfe8a5e12034335b1f Gerrit-Change-Number: 9009 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 12 Jan 2018 00:24:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6386: Invalidate metadata at table level for dataload
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/9009 ) Change subject: IMPALA-6386: Invalidate metadata at table level for dataload .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9009/1/bin/load-data.py File bin/load-data.py: http://gerrit.cloudera.org:8080/#/c/9009/1/bin/load-data.py@318 PS1, Line 318: create_filename = 'load-%s-impala-generated' % load_file_substr create_filename = 'load' followed by load_filename = '... Seems strange. Also, I don't see a corresponding change to this anywhere else, seems like there should be one? -- To view, visit http://gerrit.cloudera.org:8080/9009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibc3a6d8a674a0bf6b02069bfe8a5e12034335b1f Gerrit-Change-Number: 9009 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 12 Jan 2018 00:03:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8969 ) Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp .. Patch Set 1: Code-Review+2 Took a look at the tests, everything looks good. Feel free to wait if you want another pair of eyes but I can't find anything that needs changes or think of more useful tests. -- To view, visit http://gerrit.cloudera.org:8080/8969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da Gerrit-Change-Number: 8969 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 09 Jan 2018 23:49:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8969 ) Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp .. Patch Set 1: Code-Review+1 I'm still going through the test cases. For a timestamp related diff, this looks remarkably clean and simple. -- To view, visit http://gerrit.cloudera.org:8080/8969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da Gerrit-Change-Number: 8969 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 09 Jan 2018 23:42:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6231: Implement decimal v2 fuzz test
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8898 ) Change subject: IMPALA-6231: Implement decimal_v2 fuzz test .. Patch Set 3: Code-Review+2 Looks great! -- To view, visit http://gerrit.cloudera.org:8080/8898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85 Gerrit-Change-Number: 8898 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 09 Jan 2018 23:11:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 3: Code-Review-1 Looks like there's a Python test we need to update... 03:52:25 ] TestDecimalCasting.test_underflow[cast_from: string | decimal_type: (6, 3) | exec_option: {'decimal_v2': 'true'}] 03:52:25 ] [gw0] linux2 -- Python 2.7.12 /home/ubuntu/Impala/bin/../infra/python/env/bin/python 03:52:25 ] query_test/test_decimal_casting.py:166: in test_underflow 03:52:25 ] self._assert_decimal_result(cast, res, expected_val) 03:52:25 ] query_test/test_decimal_casting.py:80: in _assert_decimal_result 03:52:25 ] assert expected == actual, "Cast: {0}, Expected: {1}, Actual: {2}".format(cast,\ 03:52:25 ] E AssertionError: Cast: select cast('-85.0817' as Decimal(6,3)), Expected: -85.081, Actual: -85.082 03:52:25 ] E assert Decimal('-85.081') == Decimal('-85.082') -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 22 Dec 2017 05:54:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 4: Code-Review+2 (2 comments) I have only minor comments http://gerrit.cloudera.org:8080/#/c/8833/4/be/src/benchmarks/overflow-benchmark.cc File be/src/benchmarks/overflow-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/8833/4/be/src/benchmarks/overflow-benchmark.cc@137 PS4, Line 137: return false; This now always returns false. http://gerrit.cloudera.org:8080/#/c/8833/4/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8833/4/be/src/runtime/decimal-value.inline.h@186 PS4, Line 186: return num_lz - floor_log2[scale_diff] - 1; num_lz - floor_log2[scale_diff] - (scale_diff > 0) is slightly nicer, but may not matter since I don't think this is ever called for scale_diff == 0 -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 22 Dec 2017 01:28:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6231: Implement decimal v2 fuzz test
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8898 ) Change subject: IMPALA-6231: Implement decimal_v2 fuzz test .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8898/1/tests/query_test/test_decimal_fuzz.py File tests/query_test/test_decimal_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8898/1/tests/query_test/test_decimal_fuzz.py@33 PS1, Line 33: # Set the Python decimal context to a large precision initially, so that the : # mathematical operations are performed at a high precision. : decimal.getcontext().prec = 80 : decimal.getcontext().rounding = decimal.ROUND_HALF_UP > Will this create a side-effect for the entire lifetime of the python proces Use def setup_method and teardown_method for this? http://gerrit.cloudera.org:8080/#/c/8898/1/tests/query_test/test_decimal_fuzz.py@70 PS1, Line 70: 38 > Would it be better to have 37 here instead, or do you want 38 to be part of I think we want 38 http://gerrit.cloudera.org:8080/#/c/8898/1/tests/query_test/test_decimal_fuzz.py@204 PS1, Line 204: truncated_expected = expected.quantize( Nice. I didn't know you could do this. I tried a few of the more bizarre repeated fraction results and they checked out - e.g., (Decimal('99.998') / Decimal('999')).quantize(Decimal('0.' + '0' * 8 + '1')) -- To view, visit http://gerrit.cloudera.org:8080/8898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85 Gerrit-Change-Number: 8898 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 22 Dec 2017 00:35:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 22 Dec 2017 00:10:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 3: (2 comments) Let me give the math one last check.. http://gerrit.cloudera.org:8080/#/c/8774/3/be/src/exec/hdfs-scanner-ir.cc File be/src/exec/hdfs-scanner-ir.cc: http://gerrit.cloudera.org:8080/#/c/8774/3/be/src/exec/hdfs-scanner-ir.cc@145 PS3, Line 145: auto ret = StringToDecimal4(s, len, type_precision, type_scale, false, result); Will we ever want to implement rounding in the scanner? http://gerrit.cloudera.org:8080/#/c/8774/3/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/8774/3/be/src/util/string-parser.h@243 PS3, Line 243: // There are a maximum number of digits to the left of the dot. We round by I finally understand this comment. -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 22 Dec 2017 00:07:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@191 PS2, Line 191: } else if (UNLIKELY(round && total_digits_count == type_precision)) { > Saving the truncated digit may be costly. The first_truncated_digit, along I didn't think of that. Now I see your point. -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 21 Dec 2017 21:10:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@191 PS2, Line 191: } else if (UNLIKELY(round && total_digits_count == type_precision)) { > We are saving the first truncated digit here. We save the truncated digit i I just thought it would be simpler to just always save the first truncated digit - then the DCHECK on 248 is unnecessary. -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 21 Dec 2017 20:54:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove unused deps, centralize some pom versions, upgrade SLF4J and commons-io.
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8853 ) Change subject: Remove unused deps, centralize some pom versions, upgrade SLF4J and commons-io. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8853 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I717e0625dfe0fdbf7e9161312e9e80f405a359c5 Gerrit-Change-Number: 8853 Gerrit-PatchSet: 3 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 19 Dec 2017 23:28:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 2: (2 comments) Other than deprecating GetScaleQuotient, this looks good. http://gerrit.cloudera.org:8080/#/c/8833/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8833/2/be/src/runtime/decimal-value.inline.h@182 PS2, Line 182: 73, 76, 79, 83, 86, 89, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122, 126, 129 }; Why up to 10^40? Can we really scale that high? I would have thought the largest would be floor(log(10**38, 2)) http://gerrit.cloudera.org:8080/#/c/8833/2/be/src/runtime/decimal-value.inline.h@613 PS2, Line 613: *y_scaled = detail::SafeMultiply(y.value(), scale_factor, false); I checked and this is was the only usage of GetScaleQuotient outside of tests, and that appears to be a benchmark for this function (which is now changed). Maybe we should get rid of that function? I find it really confusing to think about, especially since the division by powers of MAX_DECIMAL by powers of 10 is not exact. -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 15 Dec 2017 22:24:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 2: Now that I see your new diff, you can ignore my CheckMultiply comment. This looks fine. Let me give it one last look over. -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 15 Dec 2017 21:55:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/exprs/expr-test.cc@1574 PS1, Line 1574: {{ false, false, 8892800, 12, 2 }}}, Not sure where this came from, but okay. Did it leak in from another diff? Either way, I don't object to it (and I know how much fun merging changes in here can be). http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/exprs/expr-test.cc@2479 PS1, Line 2479: { "cast(1 as decimal(6,1)) % cast(2 as decimal(37,35))", Did this one overflow before your change? http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@145 PS1, Line 145: inline T SafeMultiply(T a, T b) { How about making this void `CheckMultiply(T a, T b)` and then just doing the DCHECK? http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@261 PS1, Line 261: #ifdef DEBUG > Is there some way we can avoid having different control flow on release and Then you can omit the #ifdef here and just call CheckMultiply(left, mult) if (!*overflow) -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 15 Dec 2017 21:52:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@190 PS1, Line 190: int y_lz = BitUtil::CountLeadingZeros(abs(y)); > ? This was counting in 64 bit all the time before, right? Ouch because it wasn't immediately clear where the overflow was coming from. -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 15 Dec 2017 21:44:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8833 ) Change subject: IMPALA-6300: Fix decimal modulo overflow .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@190 PS1, Line 190: int y_lz = BitUtil::CountLeadingZeros(abs(y)); Ouch http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@361 PS1, Line 361: RESULT_T x = 0; Unnecessary change - can you revert? http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@571 PS1, Line 571: result = x % y; I think it would probably be faster and simpler to just use 64-bit arithmetic all the time (plus, it's less code emitted per mod). http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@641 PS1, Line 641: return true; Since this can never be true if callers respect the comment above, maybe this should be converted to a DCHECK and the return value eliminated. -- To view, visit http://gerrit.cloudera.org:8080/8833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686 Gerrit-Change-Number: 8833 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 15 Dec 2017 00:08:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6270: remove redundant version properties
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8827 ) Change subject: IMPALA-6270: remove redundant version properties .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8827/1/tests/test-hive-udfs/pom.xml File tests/test-hive-udfs/pom.xml: http://gerrit.cloudera.org:8080/#/c/8827/1/tests/test-hive-udfs/pom.xml@a40 PS1, Line 40: I didn't even realize we had a POM here. -- To view, visit http://gerrit.cloudera.org:8080/8827 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6812e11bb41716450ef29bb523773479e9f76eec Gerrit-Change-Number: 8827 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 13 Dec 2017 19:10:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@239 PS2, Line 239: value = DecimalUtil::ScaleDownAndRound(value, shift, round); > ScaleDownAndRound implements signed rounding to do rounding away from zero, After thinking on this some more, rounding away from zero is symmetric with respect to positive and negative numbers, so it doesn't matter if the number value is negated before or after this call. Either way works correctly. -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 13 Dec 2017 16:24:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5014: Part 1: Round when casting string to decimal
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8774 ) Change subject: IMPALA-5014: Part 1: Round when casting string to decimal .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/runtime/decimal-test.cc File be/src/runtime/decimal-test.cc: http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/runtime/decimal-test.cc@302 PS2, Line 302: StringToAllDecimals("0.5", 5, 0, More interesting test cases: 0.0 as 6,5 1.0 as 6.5 0.00 as 6,5 1.00 as 6,5 http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@191 PS2, Line 191: } else if (UNLIKELY(round && total_digits_count == type_precision)) { This doesn't need to be conditional on round, in fact I think leaving that condition out makes reading the DCHECK below simpler. http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@238 PS2, Line 238: // There are less than maximum number of digits to the left of the dot. Don't you mean "There are more than the maximum number of digits to the right of the dot." ? http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@239 PS2, Line 239: value = DecimalUtil::ScaleDownAndRound(value, shift, round); ScaleDownAndRound implements signed rounding to do rounding away from zero, so value should be negated if is_negative is true. http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@243 PS2, Line 243: // There are a maximum number of digits to the left of the dot. We attempt to the right of the dot? http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@248 PS2, Line 248: DCHECK(first_truncated_digit == 0 || round); I found this DCHECK confusing until I realized saving the truncated digit was conditional on round, which I don't think it needs to be. http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@250 PS2, Line 250: value += (first_truncated_digit >= 5); This rounds towards positive infinity which isn't consistent with our rounding behavior in ScaleDownAndRound, which rounds away from zero. Edit: Actually, since the value isn't negated yet, this is correct. Where to actually do the negation appears to be a bit of a conundrum. http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@262 PS2, Line 262: DCHECK_EQ(truncated_digit_count, 0); It would be nice to have a DCHECK that this can't overflow. While that is true because of the structure of the if conditions, there are enough of them to invite uncertainty in the readers mind by the time we get here. http://gerrit.cloudera.org:8080/#/c/8774/2/be/src/util/string-parser.h@391 PS2, Line 391: int digits_after_dot_count, int type_precision, int8_t exponent, T* value, type_precision is unused in this function -- To view, visit http://gerrit.cloudera.org:8080/8774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd8b92727fb384e6ff2d145e4aab7ae5d27db26d Gerrit-Change-Number: 8774 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 13 Dec 2017 00:35:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/8719/7/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/8719/7/tests/metadata/test_ddl.py@362 PS7, Line 362: 00:SCAN HDFS [functional.alltypestiny a]""" in '\n'.join(plan.data) I think you meant this? -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 7 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 11 Dec 2017 19:00:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Hello Philip Zeyliger, David Knupp, Joe McDonnell, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8719 to look at the new patch set (#7). Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/metadata/test_ddl.py 2 files changed, 42 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/7 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 7 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8719/6/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/8719/6/tests/metadata/test_ddl.py@353 PS6, Line 353: assert '\n'.join(plan.data[4:]) == \ > Why does python IN not work? This will break if we change the explain heade IN works fine on the flattened plan. For some reason I thought you were asking for IN on the list. -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 6 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 11 Dec 2017 18:59:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/8719/5/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/8719/5/tests/metadata/test_ddl.py@353 PS5, Line 353: assert '\n'.join(plan.data) == \ > Let's do contains and not include the first 4 expected lines since those mi Contains is harder than expected in Python and also a weaker condition. Slices to the rescue!: '\n'.join(plan.data[4:]) -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 11 Dec 2017 18:12:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Hello Philip Zeyliger, David Knupp, Joe McDonnell, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8719 to look at the new patch set (#6). Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/metadata/test_ddl.py 2 files changed, 43 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/6 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 6 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/8719/4/testdata/workloads/functional-query/queries/QueryTest/views-ddl.test File testdata/workloads/functional-query/queries/QueryTest/views-ddl.test: http://gerrit.cloudera.org:8080/#/c/8719/4/testdata/workloads/functional-query/queries/QueryTest/views-ddl.test@281 PS4, Line 281: QUERY > Move this and the query validation into Python as well. We don't want to ha Done http://gerrit.cloudera.org:8080/#/c/8719/4/testdata/workloads/functional-query/queries/QueryTest/views-ddl.test@289 PS4, Line 289: # Test that the plan respects the plan hints for shuffle and broadcast > I suggest moving this into Python as well. Might want to use explain_level= Done http://gerrit.cloudera.org:8080/#/c/8719/4/testdata/workloads/functional-query/queries/QueryTest/views-ddl.test@327 PS4, Line 327: # Test querying the hinted view. > Move into Python as well. Done http://gerrit.cloudera.org:8080/#/c/8719/1/tests/common/test_result_verifier.py File tests/common/test_result_verifier.py: http://gerrit.cloudera.org:8080/#/c/8719/1/tests/common/test_result_verifier.py@100 PS1, Line 100: 'Number of columns returned > the number of column types: %s' % column_types Garbage, will post another diff -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 08 Dec 2017 23:47:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Hello Philip Zeyliger, David Knupp, Joe McDonnell, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8719 to look at the new patch set (#5). Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/metadata/test_ddl.py 2 files changed, 47 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/5 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 5 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 08 Dec 2017 22:10:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8294/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@a3000 PS4, Line 3000: > For the record: I ended up removing it after running load.test both for S3 This seems reasonable. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 08 Dec 2017 22:10:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable S3 access via IAM roles for EC2 VMs .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8294/4/bin/check-s3-access.sh File bin/check-s3-access.sh: http://gerrit.cloudera.org:8080/#/c/8294/4/bin/check-s3-access.sh@82 PS4, Line 82: WGET_ARGS+=(http://169.254.169.254/latest/meta-data/iam/security-credentials/) Please make this IP address a shell variable and set it above right after you reference the docs.aws.amazon.com link. http://gerrit.cloudera.org:8080/#/c/8294/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8294/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@a3000 PS4, Line 3000: Can we restore this? LOAD DATA INPATH doesn't get coverage in the other test, and needs to be able to take S3 URIs. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 05 Dec 2017 01:09:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. IMPALA-6068: Scale back fixing functional-types I re-created the original patch for IMPALA-6068, but only performed what I believe to be the limited legal transformation of data load: DEPENDENT_LOAD -> DEPENDENT_LOAD_HIVE. Any place that directly uploads via hadoop or hdfs commands was left alone as changing it can't be proven to be correct. Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Testing: Doing a full data load. I mistakenly changed a variable name causing the first two dry-runs to fail. Reviewed-on: http://gerrit.cloudera.org:8080/8690 Reviewed-by: Zach Amsden Tested-by: Zach Amsden --- M testdata/bin/generate-schema-statements.py M testdata/common/widetable.py M testdata/datasets/functional/functional_schema_template.sql 3 files changed, 61 insertions(+), 38 deletions(-) Approvals: Zach Amsden: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 6 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 5: Code-Review+2 Rebased, carry the +2 -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 04 Dec 2017 23:46:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 04 Dec 2017 23:46:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 4: Finally! Anyone care to review? -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 04 Dec 2017 23:09:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 4: I still can't see anything that would actually change based on this patch, so either I'm being dense, or GVO is exposing a pre-existing problem here. Let's try again... -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 04 Dec 2017 19:26:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/metadata/test_ddl.py 2 files changed, 48 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/3 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8719 ) Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/metadata/test_ddl.py 2 files changed, 50 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/2 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6245: Tolerate column indenting from Hive
Zach Amsden has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8719 Change subject: IMPALA-6245: Tolerate column indenting from Hive .. IMPALA-6245: Tolerate column indenting from Hive The fix for HIVE-3140 started indenting multi-line comments, which breaks Impala testing when run against Hive 2.1.1. To test this using the pure test runner proved difficult since it would require extensive changes to support both row_regexes (since the columns changed order) and subset support (since the number of rows changed). Instead, we manually verify the hints are present in the output in the python test. The fact that the hints have been reformatted leaves us in an uncertain state as to whether they actually get applied, so a new test case has been added to run EXPLAIN SELECT on the view and verify the joins happen exactly as we expect. Testing: Ran the views-ddl test against Impala mini-cluster setups using both Hive 2.1.1 and Hive 1.1.0 Change-Id: I49e53b1230520ca6e850af28078526e6627d69de --- M testdata/workloads/functional-query/queries/QueryTest/views-ddl.test M tests/common/test_result_verifier.py M tests/metadata/test_ddl.py 3 files changed, 51 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/8719/1 -- To view, visit http://gerrit.cloudera.org:8080/8719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I49e53b1230520ca6e850af28078526e6627d69de Gerrit-Change-Number: 8719 Gerrit-PatchSet: 1 Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/8690/4/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/8690/4/testdata/datasets/functional/functional_schema_template.sql@a2115 PS4, Line 2115: What the heck is this actually doing? If I am reading this correctly, it is loading the data twice, both times with OVERWRITE, so the net effect is simply more time spent doing the data load, with no validation that the results are the same. -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 01 Dec 2017 17:48:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8690 ) Change subject: IMPALA-6068: Scale back fixing functional-types .. Patch Set 4: (1 comment) 22:54:51 FAILED (Took: 11 min 53 sec) 22:54:51 'load-data functional-query exhaustive' failed. Tail of log: 22:54:51 c1 int, 22:54:51 c2 double 22:54:51 ) 22:54:51 ROW FORMAT delimited fields terminated by ',' escaped by '\\' 22:54:51 STORED AS TEXTFILE 22:54:51 LOCATION '/test-warehouse/table_with_header' 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 USE functional 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 ALTER TABLE table_with_header SET TBLPROPERTIES('skip.header.line.count'='1') 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 CREATE DATABASE IF NOT EXISTS functional 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 CREATE EXTERNAL TABLE IF NOT EXISTS functional.table_with_header_2 ( 22:54:51 c1 int, 22:54:51 c2 double 22:54:51 ) 22:54:51 ROW FORMAT delimited fields terminated by ',' escaped by '\\' 22:54:51 STORED AS TEXTFILE 22:54:51 LOCATION '/test-warehouse/table_with_header_2' 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 USE functional 22:54:51 22:54:51 (load-functional-query-exhaustive-impala-generated-text-none-none.sql): 22:54:51 ALTER TABLE table_with_header_2 SET TBLPROPERTIES('skip.header.line.count'='2') 22:54:51 22:54:51 Data Loading from Impala failed with error: ImpalaBeeswaxException: 22:54:51 INNER EXCEPTION: 22:54:51 MESSAGE: AnalysisException: Could not resolve table reference: 'table_with_header_2' 22:54:51 22:54:51 Traceback (most recent call last): 22:54:51 File "/home/ubuntu/Impala/bin/load-data.py", line 178, in exec_impala_query_from_file 22:54:51 result = impala_client.execute(query) 22:54:51 File "/home/ubuntu/Impala/tests/beeswax/impala_beeswax.py", line 173, in execute 22:54:51 handle = self.__execute_query(query_string.strip(), user=user) 22:54:51 File "/home/ubuntu/Impala/tests/beeswax/impala_beeswax.py", line 339, in __execute_query 22:54:51 handle = self.execute_query_async(query_string, user=user) 22:54:51 File "/home/ubuntu/Impala/tests/beeswax/impala_beeswax.py", line 335, in execute_query_async 22:54:51 return self.__do_rpc(lambda: self.imp_service.query(query,)) 22:54:51 File "/home/ubuntu/Impala/tests/beeswax/impala_beeswax.py", line 460, in __do_rpc 22:54:51 raise ImpalaBeeswaxException(self.__build_error_message(b), b) 22:54:51 ImpalaBeeswaxException: ImpalaBeeswaxException: 22:54:51 INNER EXCEPTION: 22:54:51 MESSAGE: AnalysisException: Could not resolve table reference: 'table_with_header_2' 22:54:51 22:54:51 Background task Loading functional-query data (pid 69352) failed. http://gerrit.cloudera.org:8080/#/c/8690/3/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/8690/3/testdata/bin/generate-schema-statements.py@531 PS3, Line 531: assert not (insert and insert_hive),\ s/load_hive/insert_hive My bad. Trying again. -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 01 Dec 2017 17:33:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Hello Philip Zeyliger, David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8690 to look at the new patch set (#4). Change subject: IMPALA-6068: Scale back fixing functional-types .. IMPALA-6068: Scale back fixing functional-types I re-created the original patch for IMPALA-6068, but only performed what I believe to be the limited legal transformation of data load: DEPENDENT_LOAD -> DEPENDENT_LOAD_HIVE. Any place that directly uploads via hadoop or hdfs commands was left alone as changing it can't be proven to be correct. Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Testing: Doing a full data load. I mistakenly changed a variable name causing the first two dry-runs to fail. --- M testdata/bin/generate-schema-statements.py M testdata/common/widetable.py M testdata/datasets/functional/functional_schema_template.sql 3 files changed, 61 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/8690/4 -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6068: Scale back fixing functional-types
Hello Philip Zeyliger, David Knupp, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8690 to look at the new patch set (#3). Change subject: IMPALA-6068: Scale back fixing functional-types .. IMPALA-6068: Scale back fixing functional-types I re-created the original patch for IMPALA-6068, but only performed what I believe to be the limited legal transformation of data load: DEPENDENT_LOAD -> DEPENDENT_LOAD_HIVE. Any place that directly uploads via hadoop or hdfs commands was left alone as changing it can't be proven to be correct. Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Testing: Doing a full data load. I mistakenly changed a variable name causing the first two dry-runs to fail. --- M testdata/bin/generate-schema-statements.py M testdata/common/widetable.py M testdata/datasets/functional/functional_schema_template.sql 3 files changed, 61 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/8690/3 -- To view, visit http://gerrit.cloudera.org:8080/8690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c242cca209a7138b10ad517076707709b5cd204 Gerrit-Change-Number: 8690 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6244: Fix test failures with Hadoop 3.0
Hello Philip Zeyliger, David Knupp, Tim Wood, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8656 to look at the new patch set (#3). Change subject: IMPALA-6244: Fix test failures with Hadoop 3.0 .. IMPALA-6244: Fix test failures with Hadoop 3.0 The metadata query test fails when run against Hadoop 3.0 due to some defaults changing for sequence files. Testing: Compared the output of hadoop fs -text /test-warehouse/alltypesmixedformat/year=2009/month=2/23_0 and verified it is the same after a data load on Hadoop 2.6 and Hadoop 3.0; ran the metadata query test and verified it now passes in both cases. Change-Id: I1ccffdb0f712da1feb55f839e8d87a30f15e4fb6 --- M testdata/workloads/functional-query/queries/QueryTest/show-stats.test 1 file changed, 5 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8656/3 -- To view, visit http://gerrit.cloudera.org:8080/8656 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1ccffdb0f712da1feb55f839e8d87a30f15e4fb6 Gerrit-Change-Number: 8656 Gerrit-PatchSet: 3 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Wood Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6244: Fix test failures with Hadoop 3.0
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8656 ) Change subject: IMPALA-6244: Fix test failures with Hadoop 3.0 .. Patch Set 2: > (2 comments) Sigh, bad link - here is the permalink: https://github.com/cloudera/Impala/blob/9bc3d19e331a871b18dd4cb91734b841473f58fa/tests/common/test_result_verifier.py#L385 -- To view, visit http://gerrit.cloudera.org:8080/8656 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ccffdb0f712da1feb55f839e8d87a30f15e4fb6 Gerrit-Change-Number: 8656 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Wood Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 30 Nov 2017 03:47:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6244: Fix test failures with Hadoop 3.0
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8656 ) Change subject: IMPALA-6244: Fix test failures with Hadoop 3.0 .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8656/2/testdata/workloads/functional-query/queries/QueryTest/show-stats.test File testdata/workloads/functional-query/queries/QueryTest/show-stats.test: http://gerrit.cloudera.org:8080/#/c/8656/2/testdata/workloads/functional-query/queries/QueryTest/show-stats.test@85 PS2, Line 85: RESULTS: VERIFY_IS_EQUAL > VERIFY_IS_EQUAL should not be needed False, if using row_regex. Somewhat true if using column regex and this is a late enough column to be distinct. The problem is that rows will be sorted by default if VERIFY_IS_EQUAL is not provided, and the sort does not take any accounting for the presence of regular expressions in either rows or columns. https://github.com/cloudera/Impala/blob/cdh5-trunk/tests/common/test_result_verifier.py#L385 http://gerrit.cloudera.org:8080/#/c/8656/2/testdata/workloads/functional-query/queries/QueryTest/show-stats.test@86 PS2, Line 86: '2009','1',-1,1,'19.59KB','NOT CACHED','NOT CACHED','TEXT','false','$NAMENODE/test-warehouse/alltypesmixedformat/year=2009/month=1' > just change them all Done -- To view, visit http://gerrit.cloudera.org:8080/8656 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ccffdb0f712da1feb55f839e8d87a30f15e4fb6 Gerrit-Change-Number: 8656 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Wood Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 30 Nov 2017 03:45:16 + Gerrit-HasComments: Yes