ByteBaker commented on issue #3373: URL: https://github.com/apache/arrow-rs/issues/3373#issuecomment-1500905097
I ran into the same problem a few days ago and came here to open an issue, then found this one. To confirm, I loaded the original file into pandas and wrote it back out to a new file, this time with pyarrow as the engine, and the problem was gone. The trouble is that our dataset is quite large (hundreds of GB of Parquet), so rewriting everything would be a daunting task. What should I do to handle this?