hubgeter opened a new pull request, #44848:
URL: https://github.com/apache/doris/pull/44848
bp #43469
Problem Summary:
Support reading Hive tables stored in JSON format, for example:
```mysql
mysql> show create table basic_json_table;
CREATE TABLE `basic_json_table`(
`id` int,
`name` string,
`age` tinyint,
`salary` float,
`is_active` boolean,
`join_date` date,
`last_login` timestamp,
`height` double,
`profile` binary,
`rating` decimal(10,2))
ROW FORMAT SERDE
'org.apache.hive.hcatalog.data.JsonSerDe'
```
Behavior changed:
To implement this feature, this PR modifies `new_json_reader`. Previously,
`new_json_reader` could only insert data into string columns (`ColumnString`).
To support inserting data into columns of other types, `DataTypeSerDe` is
introduced to perform the insertion. To maintain compatibility with previous
versions, the changes in this PR take effect only when reading Hive JSON tables.
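As a rough illustration of the idea (not the Doris implementation, which is C++), the sketch below dispatches each JSON value to a per-type deserializer instead of storing everything as a string. The names `DESERIALIZERS` and `append_row` and the sample row are hypothetical and exist only for this example; only a few of the column types from the table above are covered.

```python
import json
from datetime import date
from decimal import Decimal

# Hypothetical registry: one deserializer per column type, standing in for
# the role the PR description gives to DataTypeSerDe.
DESERIALIZERS = {
    "int": int,
    "string": str,
    "boolean": bool,
    "double": float,
    "date": date.fromisoformat,
    "decimal(10,2)": lambda v: Decimal(str(v)).quantize(Decimal("0.01")),
}

def append_row(columns, schema, line):
    """Parse one JSON line and append a typed value to every column."""
    obj = json.loads(line)
    for name, type_name in schema:
        raw = obj.get(name)
        # A missing field becomes NULL (None) in this sketch.
        value = None if raw is None else DESERIALIZERS[type_name](raw)
        columns[name].append(value)

schema = [("id", "int"), ("name", "string"),
          ("join_date", "date"), ("rating", "decimal(10,2)")]
columns = {name: [] for name, _ in schema}
# Made-up sample row, for illustration only.
append_row(columns, schema,
           '{"id":1,"name":"a","join_date":"2024-01-15","rating":4.5}')
print(columns)
```

Keeping the type-specific logic in a registry means the reader loop itself stays type-agnostic, which is the property the description attributes to the new code path.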
Limitations:
1. Currently, only queries are supported; writing is not supported.
2. Currently, only the `ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';`
scenario is supported. Some properties specified in `WITH SERDEPROPERTIES`
have no effect in Doris.
3. Because Hive does not allow column names that differ only by case when
creating a JSON-format table (including inside a Struct), Doris converts the
field names in the JSON data to lowercase when reading a JSON data file and
then matches columns by the lowercase names. If several field names in a
record become identical after lowercasing, the value of the last field is used
(consistent with Hive behavior).
Example:
```
create table json_table(
column int
)ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
a.json:
{"column":1,"COLumn":2,"COLUMN":3}
{"column":10,"COLumn":20}
{"column":100}
in Hive : load a.json to table json_table
in Doris query:
---
3
20
100
---
```
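The matching rule from limitation 3 can be sketched in a few lines of Python. This is only an illustration of the rule, assuming well-formed JSON lines; it is not the code in `new_json_reader`, and the helper names `lowercase_last_wins` and `read_rows` are made up for this example.

```python
import json

def lowercase_last_wins(pairs):
    """Fold object keys to lowercase; a later duplicate overwrites an earlier one."""
    row = {}
    for key, value in pairs:
        row[key.lower()] = value
    return row

def read_rows(lines, columns):
    """Return one tuple per JSON line, matching fields by lowercase name."""
    rows = []
    for line in lines:
        # object_pairs_hook sees every key/value pair, including duplicates,
        # and is also applied to nested objects (structs).
        obj = json.loads(line, object_pairs_hook=lowercase_last_wins)
        rows.append(tuple(obj.get(col.lower()) for col in columns))
    return rows

lines = [
    '{"column":1,"COLumn":2,"COLUMN":3}',
    '{"column":10,"COLumn":20}',
    '{"column":100}',
]
print(read_rows(lines, ["column"]))  # [(3,), (20,), (100,)]
```

Because the hook runs for every nested object as well, field names inside a Struct are folded the same way as top-level ones.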
Todo (in next PR):
Merge `serde` and `json_reader`, because their logic conflicts.
Support reading JSON-format tables in the Hive catalog.
### Release note
None
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]