viirya commented on PR #6269:
URL: 
https://github.com/apache/arrow-datafusion/pull/6269#issuecomment-1539052195

   > Yeah, I guess I was thinking it would be nice to avoid the unpacking of the 
dictionary result into a primitive array (when possible)
   
   I meant that for arithmetic kernels (e.g. add, subtract), the result of an 
operation between two dictionary arrays is a primitive array. We don't unpack 
a dictionary array into a primitive array. This is why the coercion rule 
specifies the result type of such an op as the primitive type instead of a 
dictionary of it.
   
   But for such an op between a dictionary and a scalar, the result is a 
dictionary array, because the op can simply be applied to the dictionary 
values, which is not possible in the case above (dictionary and dictionary). 
So the inconsistency (primitive for dictionary/dictionary and dictionary for 
dictionary/scalar) leads to the bug we saw.
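   
   To illustrate the difference, here is a minimal sketch. `DictArray`, 
`add_scalar` and `add_dict` are hypothetical toy names for this example, not 
the arrow-rs types or kernels, and the real kernels may be structured 
differently:

```rust
/// Toy stand-in for a dictionary-encoded Int32 array (NOT the arrow-rs type).
#[derive(Debug, PartialEq)]
struct DictArray {
    keys: Vec<usize>, // one index into `values` per logical row
    values: Vec<i32>, // the dictionary values
}

impl DictArray {
    /// Logical value stored at row `i`.
    fn get(&self, i: usize) -> i32 {
        self.values[self.keys[i]]
    }
}

/// dictionary + scalar: only the dictionary values are touched and the keys
/// are reused as-is, so the natural result is again a dictionary.
fn add_scalar(a: &DictArray, s: i32) -> DictArray {
    DictArray {
        keys: a.keys.clone(),
        values: a.values.iter().map(|v| v + s).collect(),
    }
}

/// dictionary + dictionary: the two sides have unrelated keys, so the sum is
/// computed row by row and comes out as a plain (primitive) vector.
fn add_dict(a: &DictArray, b: &DictArray) -> Vec<i32> {
    assert_eq!(a.keys.len(), b.keys.len());
    (0..a.keys.len()).map(|i| a.get(i) + b.get(i)).collect()
}
```

   The two functions return different types for what is logically the same 
operation, which is the kind of mismatch described above.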
   
   We can either change the primitive result of an op on dictionary/dictionary 
to a dictionary, or change the dictionary result of an op on dictionary/scalar 
to a primitive. This PR takes the latter as the fix. One reason is that it is 
simple to apply and fixes the issue now. Another reason is that I'm not sure 
packing the result of a dictionary/dictionary op as a dictionary makes sense. 
It is doable, but performing dictionary encoding during an arithmetic op might 
introduce a performance penalty. I'll find some time to try that.
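   
   As a rough sketch of where that penalty could come from: packing a 
primitive result back into a dictionary needs something like a hash lookup per 
row. `dictionary_encode` below is a hypothetical helper written for this 
example, not an arrow-rs function:

```rust
use std::collections::HashMap;

/// Re-encode a primitive result as (keys, values).
/// Every row costs one hash-map lookup, which the primitive path avoids.
fn dictionary_encode(rows: &[i32]) -> (Vec<usize>, Vec<i32>) {
    let mut index_of: HashMap<i32, usize> = HashMap::new();
    let mut keys = Vec::with_capacity(rows.len());
    let mut values = Vec::new();
    for &v in rows {
        // Index the value would get if it has not been seen before.
        let next = values.len();
        let key = *index_of.entry(v).or_insert_with(|| {
            values.push(v);
            next
        });
        keys.push(key);
    }
    (keys, values)
}
```

   When most sums are distinct, the dictionary ends up about as large as the 
primitive array, so the extra encoding work buys little.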
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
