jorisvandenbossche commented on PR #36846: URL: https://github.com/apache/arrow/pull/36846#issuecomment-1689465208
> I'm not interested in these special cases at least, it is just in here because the original author did this, but I'm also fine with only including the obvious and getting rid of all the flags.

Sorry for the slow reply here (holidays/conferences), but one thing to add: while I don't have a strong opinion on whether all fine-grained options are useful, I do think a promotion like int -> float can certainly be useful to allow, even if that's not a typical schema evolution allowed by Iceberg.

At least in the pandas world, it's quite common to end up with a mixture (e.g. because missing values turn an int column into float), and being able to read multiple parquet files that differ in schema in this way can help. Also, in general, `pd.concat` of DataFrames follows the numpy promotion rules, and there int/float combinations typically promote to float.

> I think having the same safe flag similar to `Table.cast` on the `pa.concat_tables` would do the trick as well.

I _think_ promotion and safe/unsafe casting can in theory be two distinct questions. For example, you might want to allow promoting int to float, but only if it preserves the value (safe casting). Or promote "s" timestamps to "ns", but only if the result is not out of bounds.

(Now, I haven't looked at the PR in detail yet, so I'm not sure that makes sense. But I was planning to take a closer look later today.)
