Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18990 )

Change subject: IMPALA-10660: Impala shell prints DOUBLEs with less precision 
in HS2 than beeswax
......................................................................


Patch Set 3:

(6 comments)

> (3 comments)
 >
 > Are there any security implications of accepting user input as
 > format strings? Can we say it is completely safe?

I changed the format parsing to a more strict approach: now the user input is 
interpreted with a ":" prefix, which forces it to be a format specification 
(https://docs.python.org/3/library/string.html#formatspec). Now, the remaining 
specification has a limited set of options:
- fill: fills the remaining width with the given character, can make 
numerically invalid string, eg.: aaaa1000.0 but they are printable, unicode 
characters are not allowed
- align: 4 options, cannot enter invalid option
- sign: 3 options, same as align
- width: arbitrary number of digits, can make the printing slow with huge 
values, for example width of 100000000 is printing for 1 minute
- grouping_option 2 options, same as align and sign
- precision: arbitrary number of digits, same as width
- type: 8 options, same as the previous non-terminals with closed option set

The only concern that I can see is the arbitrary number of digits for width and 
precision, it can slow down the shell to never give back result. There's no 
option that can accept named fields and identifiers, these are restricted, 
because these elements must precede the ":" token.

http://gerrit.cloudera.org:8080/#/c/18990/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/18990/2/shell/impala_client.py@63
PS2, Line 63:     TTypeId.SMALLINT_TYPE: operator.attrgetter('i16Val'),
> Update the comment section above, as this has changed.
Done


http://gerrit.cloudera.org:8080/#/c/18990/2/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/18990/2/shell/option_parser.py@324
PS2, Line 324: hs2_fp_format", type="s
> It would be nice to contain the info in the name that this only applies to
Done


http://gerrit.cloudera.org:8080/#/c/18990/2/shell/option_parser.py@326
PS2, Line 326: help
> Can you add info also about the default Python behavior and the value neede
Done


http://gerrit.cloudera.org:8080/#/c/18990/2/shell/value_converter.py
File shell/value_converter.py:

http://gerrit.cloudera.org:8080/#/c/18990/2/shell/value_converter.py@32
PS2, Line 32:     def __init__(self):
> Is it intentional that FLOAT and DOUBLE are not in this map? Don't we need
FLOAT and DOUBLE added back
Other, non-numeric types are extracted as string so no conversion is needed.


http://gerrit.cloudera.org:8080/#/c/18990/2/shell/value_converter.py@32
PS2, Line 32: def __init__(self):
> Is it intentional that it is a class (static) variable? I think it would be
Done


http://gerrit.cloudera.org:8080/#/c/18990/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/18990/2/tests/shell/test_shell_commandline.py@1312
PS2, Line 1312: mat
> The value was chosen to match with beeswax, right? I think that we could sk
Done



--
To view, visit http://gerrit.cloudera.org:8080/18990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I424339266be66437941be8bafaa83fa0f2dfbd4e
Gerrit-Change-Number: 18990
Gerrit-PatchSet: 3
Gerrit-Owner: Peter Rozsa <pro...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pro...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Sep 2022 14:04:33 +0000
Gerrit-HasComments: Yes

Reply via email to