Zihao Ye has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22289 )

Change subject: IMPALA-12927: Support specifying format for reading JSON BINARY 
columns
......................................................................


Patch Set 1:

(4 comments)

Thanks for the code review! These suggestions are very helpful.

http://gerrit.cloudera.org:8080/#/c/22289/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22289/1//COMMIT_MSG@22
PS1, Line 22: perty unset or
            : incorrectly set, and will provide an error message.
> It could be useful to add a flag that acts as a default for this property i
Good idea. I added a query option to serve as this flag.
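
A minimal sketch of how such a default could be resolved. This is an illustration, not the code in the patch. The property name 'json.binary.format' and the values base64 and rawstring come from the error message quoted further down. The class name, the method names, the case-insensitive matching, and the precedence (a valid table property wins, otherwise the query option is used) are all assumptions.

```java
import java.util.Locale;

// Hypothetical sketch; names and precedence are assumptions, not Impala code.
class JsonBinaryFormatResolveSketch {
  static final String PROP = "json.binary.format";

  // Returns "base64" or "rawstring", preferring a valid table property and
  // falling back to the query option. Throws if neither is valid.
  static String resolve(String tableProperty, String queryOptionDefault) {
    String fromTable = normalize(tableProperty);
    if (fromTable != null) return fromTable;
    String fromOption = normalize(queryOptionDefault);
    if (fromOption != null) return fromOption;
    throw new IllegalStateException(
        "No valid '" + PROP + "' (base64 or rawstring) provided");
  }

  // Maps unset or unrecognized values to null (assumed case-insensitive).
  static String normalize(String value) {
    if (value == null) return null;
    String v = value.trim().toLowerCase(Locale.ROOT);
    return (v.equals("base64") || v.equals("rawstring")) ? v : null;
  }

  public static void main(String[] args) {
    System.out.println(resolve("base64", "rawstring")); // table property wins
    System.out.println(resolve(null, "rawstring"));     // falls back to option
  }
}
```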


http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@572
PS1, Line 572:           slotDesc.setType(Type.STRING);
> This doesn't look correct to me because even if it works for JSON, theoreti
Done. You are right: we need to handle the case where a single table has
different file formats. The format is now passed to the backend as a query
option and only takes effect in HdfsJsonScanner. The decision whether to
base64-decode a value is ultimately made in TextConverter.
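
To make the decode decision concrete, here is a rough sketch of the per-value branch. The real logic lives in the C++ backend (TextConverter), so this Java version is only an illustration. The class, enum, and method names are hypothetical, and treating rawstring as "keep the UTF-8 bytes unchanged" is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch; not Impala's actual TextConverter code.
class JsonBinaryDecodeSketch {
  // The two formats named in the error message under review.
  enum BinaryFormat { BASE64, RAWSTRING }

  // Converts the text of a JSON string value into bytes for a BINARY slot.
  static byte[] toBinary(String jsonValue, BinaryFormat format) {
    if (format == BinaryFormat.BASE64) {
      // Throws IllegalArgumentException on malformed base64 input.
      return Base64.getDecoder().decode(jsonValue);
    }
    // RAWSTRING: keep the characters as they are (assumed UTF-8).
    return jsonValue.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] decoded = toBinary("aGVsbG8=", BinaryFormat.BASE64);
    System.out.println(new String(decoded, StandardCharsets.UTF_8)); // hello
    byte[] raw = toBinary("aGVsbG8=", BinaryFormat.RAWSTRING);
    System.out.println(new String(raw, StandardCharsets.UTF_8)); // aGVsbG8=
  }
}
```

Because only the JSON scanner would take this branch, other file formats in the same table would be unaffected, which is the concern raised in the comment above.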


http://gerrit.cloudera.org:8080/#/c/22289/1/testdata/workloads/functional-query/queries/QueryTest/json-binary-format.test
File 
testdata/workloads/functional-query/queries/QueryTest/json-binary-format.test:

http://gerrit.cloudera.org:8080/#/c/22289/1/testdata/workloads/functional-query/queries/QueryTest/json-binary-format.test@3
PS1, Line 3: lter table binary_tbl unset tblproperties
> This modifies a shared table and can leave it in a dirty state if the test
Done. The test now clones a temporary table instead of modifying the shared
one.


http://gerrit.cloudera.org:8080/#/c/22289/1/testdata/workloads/functional-query/queries/QueryTest/json-binary-format.test@12
PS1, Line 12: No valid table properties 'json.binary.format' (base64 or 
rawstring) provided for scanning binary column of json table 
'$DATABASE.binary_tbl'
> Can you add a test for the case when there is no valid json.binary.format,
Done



--
To view, visit http://gerrit.cloudera.org:8080/22289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf61fa3afc0f33caa63fbc05393e975733165e82
Gerrit-Change-Number: 22289
Gerrit-PatchSet: 1
Gerrit-Owner: Zihao Ye <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Thu, 16 Jan 2025 08:12:15 +0000
Gerrit-HasComments: Yes
