[Impala-ASF-CR] IMPALA-13020: Change thrift rpc max message size to int64 t
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21367 ) Change subject: IMPALA-13020: Change thrift_rpc_max_message_size to int64_t .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/21367/4/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: http://gerrit.cloudera.org:8080/#/c/21367/4/be/src/rpc/thrift-util.cc@61 PS4, Line 61: DEFINE_int64(thrift_rpc_max_message_size, 4L * 1024 * 1024 * 1024, Can we set it to unlimited or a larger value? We have customers using impala-3.4 and their catalog-topic size reaches 4GB. Once they upgrade to Impala-4.x, the catalog-topic size will grow larger since impala-3.4 has better compression rate on catalog objects. The reason is in Impala-3.4, partition objects are sent inside the table object so more duplicated bytes can be saved. In Impala-4.x, partition objects are compressed and sent individually for incremental updates (IMPALA-3127). -- To view, visit http://gerrit.cloudera.org:8080/21367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I681b1849cc565dcb25de8c070c18776ce69cbb87 Gerrit-Change-Number: 21367 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 06 May 2024 06:06:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16095/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Mon, 06 May 2024 04:54:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 18: (2 comments) > Uploaded patch set 18. http://gerrit.cloudera.org:8080/#/c/21131/15/be/src/util/coding-util.cc File be/src/util/coding-util.cc: http://gerrit.cloudera.org:8080/#/c/21131/15/be/src/util/coding-util.cc@66 PS15, Line 66: !isalnum(static_cast(ch)) && !ShouldNotEscape(ch))) { > I played with it a bit and it doesn't work with for example '\t' and '\n' ( Done. Note: I tested those cases manually but to include those tests we'll have to bring a change in the test framework, which in its current state prevents us to write such tests. For example, '\t ' is considered a separator in our current test framework, so to incorporate such tests we might have to change the test framework which might lead to other errors, That's why I haven't included such cases in the current tests. http://gerrit.cloudera.org:8080/#/c/21131/15/be/src/util/coding-util.cc@66 PS15, Line 66: c > I think it would be good if we first cast it to unsigned char because of wh Done -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Mon, 06 May 2024 04:37:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
pranav.lo...@cloudera.com has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. IMPALA-11499: Refactor UrlEncode function to handle special characters An error came from an issue with URL encoding, where certain Unicode characters were being incorrectly encoded due to their UTF-8 representation matching characters in the set of characters to escape. For example, the string '运', which consists of three bytes 0xe8 0xbf 0x90 was wrongly getting encoded into '\E8%FFBF\90', because the middle byte matched one of the two bytes that represented the "\u00FF" literal. Inclusion of "\u00FF" was likely a mistake from the beginning and it should have been '\x7F'. The patch makes three key changes: 1. Before the change, the set of characters that need to be escaped was stored as a string. The current patch uses an unordered_set instead. 2. '\xFF', which is an invalid UTF-8 byte and whose inclusion was erroneous from the beginning, is replaced with '\x7F', which is a control character for DELETE, ensuring consistency and correctness in URL encoding. 3. The list of characters to be escaped is extended to match the current list in Hive. Testing: Tests on both traditional Hive tables and Iceberg tables are included in unicode-column-name.test, insert.test and coding-util-test.cc. Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb --- M be/src/util/coding-util-test.cc M be/src/util/coding-util.cc M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test 4 files changed, 177 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/21131/18 -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye