zanmato1984 commented on PR #44184:
URL: https://github.com/apache/arrow/pull/44184#issuecomment-2877634929

   Hi @khwilson, thanks for the update and the extensive research.
   
   > > Ideally, the scale should be "floating" just as in floating-point
   > > arithmetic, depending on the current running sum (the running sum can be
   > > very large if all data is positive, or very small if the data is centered
   > > around zero). It is then normalized to the original scale at the end. But
   > > of course that makes the algorithm more involved.
   > 
   > Yeah, this is quite complicated and essentially means you need to
   > implement the floating-point addition algorithm.
   
   +1 that this could be unrealistic for us to implement given the complexity.
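   
   Just to make the idea concrete, here is a minimal, self-contained sketch of
   what a "floating scale" accumulator might look like. Everything here
   (`FloatingScaleSum`, the use of `__int128` as a stand-in for `Decimal128`)
   is hypothetical and not Arrow API; it truncates instead of rounding, which
   already hints at the complexity a real implementation would face:
   
   ```cpp
   #include <cstdint>
   
   // Hypothetical sketch, not Arrow API. The accumulator drops one decimal
   // digit of scale whenever it would overflow, instead of failing, and
   // reports the final scale so the caller can normalize at the end.
   // (__int128 is a GCC/Clang extension used here in place of Decimal128.)
   class FloatingScaleSum {
    public:
     explicit FloatingScaleSum(int32_t scale)
         : input_scale_(scale), scale_(scale) {}
   
     void Add(__int128 value) {
       // Bring the new value down to the accumulator's current (possibly
       // reduced) scale. This truncates; real code would need rounding.
       for (int32_t s = input_scale_; s > scale_; --s) value /= 10;
       // The "floating" part: on impending overflow, shed a digit.
       while (WouldOverflow(sum_, value)) {
         sum_ /= 10;    // lose one least-significant digit of the sum...
         value /= 10;   // ...and keep the operand at the same scale
         --scale_;      // the scale floats downward
       }
       sum_ += value;
     }
   
     __int128 sum() const { return sum_; }
     int32_t scale() const { return scale_; }  // caller normalizes at the end
   
    private:
     static __int128 Pow10(int n) {
       __int128 p = 1;
       while (n-- > 0) p *= 10;
       return p;
     }
     static bool WouldOverflow(__int128 a, __int128 b) {
       static const __int128 kLimit = Pow10(38) - 1;  // decimal128 digit budget
       return b > 0 ? a > kLimit - b : a < -kLimit - b;
     }
   
     int32_t input_scale_;
     __int128 sum_ = 0;
     int32_t scale_;
   };
   ```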
   
   > > Either is fine to me.
   > 
   > Cool. @zanmato1984 do you have an opinion?
   
   I don't have an obvious preference on this. (And we don't have to jump to
   a conclusion too soon, do we?)
   
   I do have an opinion on the ideal case, though. As I understand it, the
   Arrow compute module should be a building block for comprehensive data
   systems/applications. Therefore it should remain neutral on
   application-specific behaviors, especially in cases where no single
   behavior is obviously superior to the others. That is, we should probably
   supply options for the desired behavior, something like
   `enum PrecisionPolicy { PROMOTE_TO_MAX, DEMOTE_TO_DOUBLE, }`, and do the
   computation accordingly (as long as it doesn't add too much engineering
   complexity). Of course this is an ultimate goal for the future and
   shouldn't be a concern of this PR.
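   
   Roughly along these lines; all names below (`DecimalSumOptions`,
   `PrecisionPolicy`, `ResolveSumOutputType`) are made up for illustration,
   in the spirit of Arrow's existing `FunctionOptions` subclasses rather
   than actual Arrow API:
   
   ```cpp
   #include <cstdint>
   
   // Hypothetical sketch: let the caller choose how decimal sums handle
   // precision, instead of the kernel hard-coding one behavior.
   enum class PrecisionPolicy {
     PROMOTE_TO_MAX,    // widen the output type, e.g. decimal128(38, s)
     DEMOTE_TO_DOUBLE,  // fall back to float64 and accept rounding error
   };
   
   struct DecimalSumOptions {
     PrecisionPolicy precision_policy = PrecisionPolicy::PROMOTE_TO_MAX;
   };
   
   // The kernel's output-type resolution would then branch on the policy.
   enum class OutKind { kDecimalMax, kFloat64 };
   
   OutKind ResolveSumOutputType(const DecimalSumOptions& options) {
     switch (options.precision_policy) {
       case PrecisionPolicy::PROMOTE_TO_MAX:
         return OutKind::kDecimalMax;  // e.g. decimal128 with max precision
       case PrecisionPolicy::DEMOTE_TO_DOUBLE:
         return OutKind::kFloat64;
     }
     return OutKind::kDecimalMax;  // unreachable; silences compiler warnings
   }
   ```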
