jorisvandenbossche commented on PR #36846:
URL: https://github.com/apache/arrow/pull/36846#issuecomment-1689465208

   > I'm not interested in these special cases at least, it is just in here 
because the original author did this, but I'm also fine with only including the 
obvious and getting rid of all the flags.
   
   Sorry for the slow reply here (holidays/conferences), but one thing to add: 
while I don't have a strong opinion on whether all fine-grained options are 
useful, I do think a promotion like int -> float can certainly be useful to 
allow, even if that's not a typical schema evolution allowed by Iceberg. At 
least in the pandas world, it's quite common to end up with a mixture of int 
and float columns (e.g. because missing values turn int into float), and being 
able to read multiple parquet files with such a schema difference then helps. 
   Also in general, `pd.concat` of DataFrames follows the numpy promotion 
rules, and there int/float combinations typically promote to float.
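
   To illustrate the pandas behaviour described here, a minimal sketch (the 
column name `a` and the values are made up for illustration): a column that 
contains a missing value is stored as float64, and concatenating it with an 
integer column promotes the result to float64, matching what 
`np.result_type` reports for that pair.

```python
import numpy as np
import pandas as pd

# A column without missing values keeps an integer dtype.
complete = pd.DataFrame({"a": [1, 2, 3]})

# The same logical column with a missing value is stored as float64,
# because NaN is used to mark the gap.
with_missing = pd.DataFrame({"a": [4, np.nan, 6]})

# Concatenating the two follows the numpy promotion rules: int + float -> float.
combined = pd.concat([complete, with_missing], ignore_index=True)

print(complete["a"].dtype)      # integer dtype
print(with_missing["a"].dtype)  # float64
print(combined["a"].dtype)      # float64
print(np.result_type(np.int64, np.float64))  # float64
```

   Writing `complete` and `with_missing` to two separate parquet files would 
give exactly the kind of int/float schema mismatch discussed above.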
   
   > I think having the same safe flag similar to `Table.cast` on the 
`pa.concat_tables` would do the trick as well. 
   
   I _think_ promotion and safe/unsafe casting can in theory be two distinct 
questions. For example, you might want to allow promoting int to float, but 
only if it preserves the value (safe casting). Or promote "s" timestamps to "ns", 
but only if the result is not out of bounds.
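
   A rough pure-Python sketch of that distinction, not Arrow's implementation: 
the function names and the `safe` parameter are hypothetical and only mimic 
the idea of a "safe" cast. Promotion decides which target type is allowed; the 
safety check decides whether a particular value survives the conversion.

```python
# Bounds of a signed 64-bit integer, the storage type assumed here for timestamps.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def promote_int_to_float(value: int, safe: bool = True) -> float:
    """Promote an int to a float; with safe=True, refuse to lose precision."""
    result = float(value)
    if safe and int(result) != value:
        raise ValueError(f"{value} cannot be represented exactly as a float")
    return result


def promote_seconds_to_nanoseconds(seconds: int, safe: bool = True) -> int:
    """Promote a "s" timestamp to "ns"; with safe=True, refuse to overflow int64."""
    nanos = seconds * 1_000_000_000
    if INT64_MIN <= nanos <= INT64_MAX:
        return nanos
    if safe:
        raise OverflowError(f"{seconds} s is out of bounds for int64 nanoseconds")
    # Unsafe path (illustrative only): wrap around like a fixed-width 64-bit integer.
    return (nanos - INT64_MIN) % 2**64 + INT64_MIN


# Small values promote without any loss.
print(promote_int_to_float(3))
print(promote_seconds_to_nanoseconds(1))

# 2**53 + 1 is the smallest positive int that a 64-bit float cannot hold exactly.
try:
    promote_int_to_float(2**53 + 1)
except ValueError as exc:
    print("rejected:", exc)

# 10**10 seconds does not fit into int64 nanoseconds.
try:
    promote_seconds_to_nanoseconds(10**10)
except OverflowError as exc:
    print("rejected:", exc)
```

   In both cases the promotion itself (int to float, "s" to "ns") is allowed; 
only the individual value decides whether the safe variant succeeds.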
   
   (now, I haven't looked at the PR in detail yet, so I'm not sure this makes 
sense. But I was planning to take a closer look later today)
   


