sivaraman-ai opened a new issue, #1088:
URL: https://github.com/apache/iceberg-python/issues/1088
### Apache Iceberg version
0.6.1
### Please describe the bug 🐞
While writing a dataframe to Iceberg through `tbl.append(df)`, the table
schema is validated against the dataframe schema.
Inside `append`, the call `_check_schema_compatible(self.schema(),
other_schema=df.schema)` performs this validation.
Both the table schema and the dataframe schema are converted to a pyarrow
schema of struct type and then compared, including column order and data types.
This results in the following error:
`Traceback (most recent call last):
  File "/Users/apple/Projects/bright/brightmoney_collections_system/utils/index.py", line 172, in <module>
    dff = write_to_iceberg(
  File "/Users/apple/Projects/bright/brightmoney_collections_system/utils/index.py", line 163, in write_to_iceberg
    table.append(pyarrow_df)
  File "/Users/apple/Projects/bright/brightmoney_collections_system/venv/lib/python3.9/site-packages/pyiceberg/table/__init__.py", line 1057, in append
    _check_schema_compatible(self.schema(), other_schema=df.schema)
  File "/Users/apple/Projects/bright/brightmoney_collections_system/venv/lib/python3.9/site-packages/pyiceberg/table/__init__.py", line 175, in _check_schema_compatible
    raise ValueError(f"Mismatch in fields:\n{console.export_text()}")
ValueError: Mismatch in fields:
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Table field                ┃ Dataframe field            ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│    │ 1: a: optional timestamptz │ 1: a: optional timestamptz │
│    │ 2: b: optional timestamptz │ 2: b: optional timestamptz │
│    │ 3: x: optional string      │ 3: x: optional string      │
│    │ 4: y: optional string      │ 4: y: optional string      │
└────┴────────────────────────────┴────────────────────────────┘`
Yet there is no mismatch between the fields of the table and the dataframe.
Ideally, shouldn't the schema compatibility check ignore the order in which
the dataframe columns are sent?