jorisvandenbossche commented on issue #43558: URL: https://github.com/apache/arrow/issues/43558#issuecomment-2271193560
@b-phi thanks for the report! Some questions that might help diagnose:

- Do you only see this when reading that dataset from S3, and not reading the same data locally?
- Is the "timestamp" field part of the hive partitioning scheme (i.e. in the directory names), or is it a normal column in the Parquet files?
- Does this happen with a specific dataset, or do you see it with various data? Is that dataset public so you could share it? Or if not, do you also see the issue with some randomly generated data with similar characteristics?

And would you be able to run this code under `gdb` to see a backtrace of the segfault?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
