[ https://issues.apache.org/jira/browse/HUDI-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-4626:
----------------------------
    Description:
Currently, creating a table partitioned by "_hoodie_partition_path" fails with the following exception:
{code:java}
AnalysisException: Found duplicate column(s) in the data schema and the partition schema: _hoodie_partition_path
{code}
Using the following DDL:
{code:java}
CREATE EXTERNAL TABLE `active_storage_attachments`(
  `_hoodie_commit_time` string COMMENT '',
  `_hoodie_commit_seqno` string COMMENT '',
  `_hoodie_record_key` string COMMENT '',
  `_hoodie_file_name` string COMMENT '',
  `_change_operation_type` string COMMENT '',
  `_upstream_event_processed_ts_ms` bigint COMMENT '',
  `db_shard_source_partition` string COMMENT '',
  `_event_origin_ts_ms` bigint COMMENT '',
  `_event_tx_id` bigint COMMENT '',
  `_event_lsn` bigint COMMENT '',
  `_event_xmin` bigint COMMENT '',
  `id` bigint COMMENT '',
  `name` string COMMENT '',
  `record_type` string COMMENT '',
  `record_id` bigint COMMENT '',
  `blob_id` bigint COMMENT '',
  `created_at` timestamp COMMENT '')
PARTITIONED BY (
  `_hoodie_partition_path` string COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES (
  'hoodie.query.as.ro.table'='false',
  'path'='...')
STORED AS INPUTFORMAT
  'org.apache.hudi.hadoop.HoodieParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  '...'
TBLPROPERTIES (
  'spark.sql.sources.provider'='hudi'
)
{code}

  was:
Currently, creating a table partitioned by "_hoodie_partition_path" fails with the following exception:
{code:java}
// TBA
{code}
Using the following DDL:
{code:java}
CREATE EXTERNAL TABLE `active_storage_attachments`(
  `_hoodie_commit_time` string COMMENT '',
  `_hoodie_commit_seqno` string COMMENT '',
  `_hoodie_record_key` string COMMENT '',
  `_hoodie_file_name` string COMMENT '',
  `_change_operation_type` string COMMENT '',
  `_upstream_event_processed_ts_ms` bigint COMMENT '',
  `db_shard_source_partition` string COMMENT '',
  `_event_origin_ts_ms` bigint COMMENT '',
  `_event_tx_id` bigint COMMENT '',
  `_event_lsn` bigint COMMENT '',
  `_event_xmin` bigint COMMENT '',
  `id` bigint COMMENT '',
  `name` string COMMENT '',
  `record_type` string COMMENT '',
  `record_id` bigint COMMENT '',
  `blob_id` bigint COMMENT '',
  `created_at` timestamp COMMENT '')
PARTITIONED BY (
  `_hoodie_partition_path` string COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES (
  'hoodie.query.as.ro.table'='false',
  'path'='...')
STORED AS INPUTFORMAT
  'org.apache.hudi.hadoop.HoodieParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  '...'
TBLPROPERTIES (
  'spark.sql.sources.provider'='hudi'
)
{code}


> Partitioning table by `_hoodie_partition_path` fails
> ----------------------------------------------------
>
>                 Key: HUDI-4626
>                 URL: https://issues.apache.org/jira/browse/HUDI-4626
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Alexey Kudinkin
>            Priority: Blocker



--
This message was sent by Atlassian Jira
(v8.20.10#820010)