FauxFaux opened a new issue #1516:
URL: https://github.com/apache/arrow-datafusion/issues/1516


   **Describe the bug**
   An aggregation query against a `FixedSizeBinary` column returns an internal 
error:
   
   `Error: Arrow error: External error: Execution error: Arrow error: External error: Internal error: Unsupported data type in hasher: FixedSizeBinary(16).`
   
   **To Reproduce**
   Steps to reproduce the behavior:
   
   `ctx.sql("select fsb, count(*) from tbl group by fsb")`, where `fsb` is a 
`FixedSizeBinary` column.
   
   **Expected behavior**
   
   I expect this column to be treated as if it were a `Binary`: two values are 
equal when they have the same length and the same bytes.
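
   To illustrate the semantics I have in mind, here is a minimal sketch using only the Rust standard library. It is not DataFusion's hasher or its group-by implementation; `group_count` is a hypothetical helper, and the 16-byte keys are made-up sample data standing in for `FixedSizeBinary(16)` values.

```rust
use std::collections::HashMap;

/// Hypothetical helper: counts occurrences of each key, treating keys as
/// opaque byte strings. Slice hashing and equality in std depend only on
/// length and byte content, which is the behaviour expected of `Binary`.
fn group_count<'a>(keys: &[&'a [u8]]) -> HashMap<&'a [u8], usize> {
    let mut counts = HashMap::new();
    for &key in keys {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Made-up 16-byte keys, standing in for FixedSizeBinary(16) values.
    let a = [0x11u8; 16];
    let b = [0x22u8; 16];
    let keys: Vec<&[u8]> = vec![&a[..], &b[..], &a[..]];

    // Rough equivalent of: select fsb, count(*) from tbl group by fsb
    let counts = group_count(&keys);
    for (key, n) in &counts {
        println!("{:02x?} -> {}", key, n);
    }
}
```

   The point is only that grouping needs nothing beyond a hash and an equality check over the raw bytes, so the fixed width should not require special handling.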
   
   
   **Additional context**
   I realise there's very little support for `FixedSizeBinary` columns anywhere 
else; I have built my own equality as a UDF. Not sure if there's a wider plan 
here, like "always treat them as `Binary`".
   
   I'm pulling these from Parquet; they are legitimately binary opaque keys, 
exactly like a (v4) UUID, and I believe a FixedSizeBinary is the right type for 
them here. They have duplicates in a way that Parquet's compression handles 
well.
   

