vrecan opened a new issue, #701:
URL: https://github.com/apache/iceberg-go/issues/701

   ### Apache Iceberg version
   
   main (development)
   
   ### Please describe the bug 🐞
   
   When using Glue with a large message schema, we get this error on every 
updateCatalog call:
   
   Glue: UpdateTable, https response error StatusCode: 400, RequestID: 
21d099dd-a972-488f-b3ab-7a6a2a1a4c35, InvalidInputException: Payload size of 
request exceeded limit
   
   
   I believe this is because constructTableInput always includes every schema 
column, via schemasToGlueColumns(staged.Metadata()), on each UpdateTable call. 
We have roughly 3600 fields. This is by design: we use these records to write 
to Elasticsearch in the Elastic Common Schema (ECS) format, which defines a 
large number of fields. A typical message sets only 20-50 of them, but all 
fields exist in the schema.
   
   Would it make sense to allow omitting the schema columns from UpdateTable? I 
believe they are only used as metadata by Athena; the real schema would still 
exist in the metadata and Parquet files, right?
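
A rough sketch of the kind of guard I have in mind, to make the proposal concrete. Everything here is hypothetical: `glueColumn` is a stand-in struct rather than the AWS SDK type, `columnsWithinBudget` is not an existing iceberg-go function, and `maxBytes` is a caller-supplied budget, not a documented Glue limit. The JSON size is only an approximation of what the columns add to the real request payload.

```go
package main

import "encoding/json"

// glueColumn is a hypothetical stand-in for the name/type/comment triple
// that a Glue column carries. It is NOT the AWS SDK type.
type glueColumn struct {
	Name    string `json:"Name"`
	Type    string `json:"Type"`
	Comment string `json:"Comment,omitempty"`
}

// columnsWithinBudget returns cols unchanged when their JSON encoding fits in
// maxBytes, and nil otherwise so the caller can send the update without
// columns. maxBytes is an assumed budget chosen by the caller.
func columnsWithinBudget(cols []glueColumn, maxBytes int) ([]glueColumn, error) {
	encoded, err := json.Marshal(cols)
	if err != nil {
		return nil, err
	}
	if len(encoded) > maxBytes {
		// Too large: drop the column list rather than fail the whole update.
		return nil, nil
	}
	return cols, nil
}
```

A caller would pass the converted columns through this check before building the table input, and leave the column list empty when it returns nil. The trade-off is that engines reading columns from the Glue catalog entry would not see them for such tables.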
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


