[ 
https://issues.apache.org/jira/browse/IMPALA-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825470#comment-17825470
 ] 

Quanlong Huang commented on IMPALA-12889:
-----------------------------------------

This is where catalogd processes the request to change the file format:
https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1193-L1204

reloadTableSchema is not changed in that branch, so it stays false and the
Avro schema is not reloaded.
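
Until this is fixed, the issue description below points to two ways of getting
the Avro schema loaded. This is only a sketch that reuses the table name and
schema URL from the description; it has not been checked beyond what is
reported there, and option 2 assumes a table that does not have
'avro.schema.url' set yet:
{code:sql}
-- Option 1: keep the failing order, then force a reload with REFRESH.
alter table my_part_tbl set fileformat avro;
refresh my_part_tbl;

-- Option 2: change the file format first, then set the tblproperty.
alter table my_part_tbl set fileformat avro;
alter table my_part_tbl set tblproperties(
  'avro.schema.url'='hdfs:////test-warehouse/avro_schemas/functional/alltypes.json');
{code}
In both cases a following DESCRIBE should list the Avro columns rather than
the original column i.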

> Changing file format to AVRO doesn't update schema using 'avro.schema.url'
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-12889
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12889
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>              Labels: ramp-up
>         Attachments: alltypes.json
>
>
> When changing the file format of a table to AVRO, the schema is not updated 
> if there is a tblproperty of 'avro.schema.url'. However, after a REFRESH, the 
> schema is updated:
> {code:sql}
> create table my_part_tbl(i int) partitioned by (p int) stored as parquet;
> alter table my_part_tbl set tblproperties(
>   'avro.schema.url'='hdfs:////test-warehouse/avro_schemas/functional/alltypes.json');
> alter table my_part_tbl set fileformat avro;
> describe my_part_tbl;
> +------+------+---------+
> | name | type | comment |
> +------+------+---------+
> | i    | int  |         |
> | p    | int  |         |
> +------+------+---------+
> refresh my_part_tbl;
> describe my_part_tbl;
> +-----------------+---------+-------------------+
> | name            | type    | comment           |
> +-----------------+---------+-------------------+
> | id              | int     | from deserializer |
> | bool_col        | boolean | from deserializer |
> | tinyint_col     | int     | from deserializer |
> | smallint_col    | int     | from deserializer |
> | int_col         | int     | from deserializer |
> | bigint_col      | bigint  | from deserializer |
> | float_col       | float   | from deserializer |
> | double_col      | double  | from deserializer |
> | date_string_col | string  | from deserializer |
> | string_col      | string  | from deserializer |
> | timestamp_col   | string  | from deserializer |
> | p               | int     |                   |
> +-----------------+---------+-------------------+
> {code}
> Note that explicitly setting the tblproperty after changing the file format 
> to AVRO does refresh the schema. That is, changing the file format before 
> setting 'avro.schema.url' works, but setting 'avro.schema.url' before 
> changing the file format does not.



