Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 )
Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters ...................................................................... Patch Set 20: (8 comments) http://gerrit.cloudera.org:8080/#/c/21131/20/be/src/util/coding-util-test.cc File be/src/util/coding-util-test.cc: http://gerrit.cloudera.org:8080/#/c/21131/20/be/src/util/coding-util-test.cc@98 PS20, Line 98: "*/:=?\\{[]"; Please add '^' http://gerrit.cloudera.org:8080/#/c/21131/20/be/src/util/coding-util-test.cc@104 PS20, Line 104: nit: remove blank line http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py File tests/query_test/test_insert.py: http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@297 PS20, Line 297: """Test for special characters in partition values.""" nit: move this inside the method, i.e. def test_escaped_partition_values(self, unique_database): """Test for special characters in partition values.""" http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@299 PS20, Line 299: nit: use 2 spaces indention http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@300 PS20, Line 300: e > flake8: E501 line too long (106 > 90 characters) wrap this to be self.execute_query( "create table {}(i int) partitioned by (p string) stored as parquet".format(tbl)) http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@302 PS20, Line 302: 7 > flake8: E501 line too long (108 > 90 characters) Single quote is missing here. The two single quotes in the middle doesn't represent a single quote. They become the end and start of two strings. We can double quote the string and wrap it as special_characters_ = "SpecialCharacters\x01\x02\x03\x04\x05\x06\x07\b\t\n\v\f\r" \ "\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B" \ "\x1C\x1D\x1E\x1F\"\x7F'%*/:=?\\{[]#^" Also add '^' and remove "()" http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@303 PS20, Line 303: _ > flake8: E501 line too long (104 > 90 characters) wrap this to be self.execute_query( "insert into {} partition(p='{}') values (0)".format(tbl, special_characters_)) http://gerrit.cloudera.org:8080/#/c/21131/20/tests/query_test/test_insert.py@305 PS20, Line 305: assert a.data[0] == special_characters_ This just verifies the partition value. We still need to verify the partition dir created by Impala is correct. This seems better: def test_escaped_partition_values(self, unique_database): """Test for special characters in partition values.""" tbl = unique_database + ".tbl" self.execute_query( "create table {}(i int) partitioned by (p string) stored as parquet".format(tbl)) # Use "\\'" for single quote since the insert statement use single quote on the # partition value. Use "\\\\" for back slash since it will be escaped again in # parsing the insert statement special_characters = "SpecialCharacters\x01\x02\x03\x04\x05\x06\x07\b\t\n\v\f\r" \ "\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B" \ "\x1C\x1D\x1E\x1F\"\x7F\\'%*/:=?\\\\{[]#^" part_value = "SpecialCharacters\x01\x02\x03\x04\x05\x06\x07\b\t\n\v\f\r" \ "\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B" \ "\x1C\x1D\x1E\x1F\"\x7F'%*/:=?\{[]#^" part_dir = "p=SpecialCharacters%01%02%03%04%05%06%07%08%09%0A%0B%0C%0D%0E%0F%10%11" \ "%12%13%14%15%16%17%18%19%1A%1B%1C%1D%1E%1F%22%7F%27%25%2A%2F%3A%3D%3F" \ "%5C%7B%5B%5D%23%5E" res = self.execute_query( "insert into {} partition(p='{}') values (0)".format(tbl, special_characters)) assert res.data[0] == part_dir + ": 1" res = self.client.execute("select p from {}".format(tbl)) assert res.data[0] == part_value res = self.execute_query("show partitions " + tbl) # There are "\t" in the partition value so we can't easily split the result line. # Just verify the values are inside it. assert part_value in res.data[0] assert part_dir in res.data[0] -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 20 Gerrit-Owner: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Zihao Ye <eyiz...@163.com> Gerrit-Comment-Date: Tue, 07 May 2024 03:26:22 +0000 Gerrit-HasComments: Yes