alamb opened a new issue, #2874: URL: https://github.com/apache/arrow-datafusion/issues/2874
**Describe the bug**

Various parts of the DataFusion codebase assume that a round trip `ScalarValue` <--> `Array` preserves the datatype. This seems like a reasonable assumption; however, it does not hold for at least `DictionaryArray`s.

For example, take a `ScalarValue` that is converted to an array, `cast` to a `DictionaryArray<_>` due to coercion rules, and then converted back to a `ScalarValue`. When that supposedly cast `ScalarValue` is converted back to an array, it does not maintain its dictionary encoding; instead the result has `DataType::Utf8`.

**To Reproduce**

```rust
use datafusion::arrow::compute::kernels;
use datafusion::arrow::datatypes::DataType;
use datafusion::scalar::ScalarValue;

fn bad_cast() {
    // There is a problem with round-trip casting to/from a dictionary
    // array. It is desired to cast this ScalarValue to a Dictionary
    // (for coercion, for example)
    let scalar = ScalarValue::Utf8(Some("foo".to_string()));

    let desired_type = DataType::Dictionary(
        // key type
        Box::new(DataType::Int32),
        // value type
        Box::new(DataType::Utf8),
    );

    // convert from scalar --> Array to call cast
    let scalar_array = scalar.to_array();
    // cast the actual value
    let cast_array = kernels::cast::cast(&scalar_array, &desired_type).unwrap();

    // turn it back to a scalar
    let cast_scalar = ScalarValue::try_from_array(&cast_array, 0).unwrap();

    // Some time later the "cast" scalar is turned back into an array:
    let array = cast_scalar.to_array_of_size(10);

    // The datatype should be "Dictionary" but is actually Utf8!!!
    assert_eq!(array.data_type(), &desired_type)
}
```

Running this function results in

```
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `Utf8`,
 right: `Dictionary(Int32, Utf8)`', src/main.rs:76:5
```

**Expected behavior**

The test case should pass.

**Additional context**

I am not sure whether it makes sense to add a `ScalarValue::Dictionary` variant, to add an `is_dictionary` flag or something similar, or simply to stop assuming that a `ScalarValue` can be round-tripped while maintaining its data type.

This is the root cause of https://github.com/apache/arrow-datafusion/issues/2873. I added a patch for that particular case, but this problem can occur elsewhere.
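To make the first option above more concrete, here is a minimal, self-contained sketch of how a dictionary-aware scalar variant could carry its type through the round trip. It is a toy model only: `ToyType`, `ToyScalar`, `ToyArray` and the helper `wrap` are hypothetical names invented for this illustration. They are not DataFusion or Arrow APIs, and the real design may well differ.

```rust
/// Toy stand-in for Arrow's `DataType` (only the variants this sketch needs).
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int32,
    Utf8,
    /// (key type, value type)
    Dictionary(Box<ToyType>, Box<ToyType>),
}

/// Toy stand-in for `ScalarValue` with a hypothetical `Dictionary` variant.
#[derive(Debug, Clone, PartialEq)]
enum ToyScalar {
    Utf8(Option<String>),
    /// Hypothetical: the key type plus the wrapped value scalar.
    Dictionary(Box<ToyType>, Box<ToyScalar>),
}

/// Toy stand-in for an array: a logical type plus its values.
#[derive(Debug, Clone, PartialEq)]
struct ToyArray {
    data_type: ToyType,
    values: Vec<Option<String>>,
}

impl ToyScalar {
    /// The logical type, including dictionary encoding if present.
    fn data_type(&self) -> ToyType {
        match self {
            ToyScalar::Utf8(_) => ToyType::Utf8,
            ToyScalar::Dictionary(key, inner) => {
                ToyType::Dictionary(key.clone(), Box::new(inner.data_type()))
            }
        }
    }

    /// The underlying string value, ignoring any dictionary wrapper.
    fn value(&self) -> Option<String> {
        match self {
            ToyScalar::Utf8(v) => v.clone(),
            ToyScalar::Dictionary(_, inner) => inner.value(),
        }
    }

    /// Scalar --> array: the array keeps the scalar's full type.
    fn to_array_of_size(&self, size: usize) -> ToyArray {
        ToyArray {
            data_type: self.data_type(),
            values: vec![self.value(); size],
        }
    }

    /// Array --> scalar: rebuild the scalar from the array's type, so a
    /// dictionary array yields a `Dictionary` scalar.
    fn try_from_array(array: &ToyArray, index: usize) -> Option<ToyScalar> {
        let value = array.values.get(index)?.clone();
        Self::wrap(&array.data_type, value)
    }

    /// Wrap a raw value according to `data_type`; types this sketch does
    /// not model as values (here `Int32`) yield `None`.
    fn wrap(data_type: &ToyType, value: Option<String>) -> Option<ToyScalar> {
        match data_type {
            ToyType::Utf8 => Some(ToyScalar::Utf8(value)),
            ToyType::Dictionary(key, value_type) => Some(ToyScalar::Dictionary(
                key.clone(),
                Box::new(Self::wrap(value_type, value)?),
            )),
            ToyType::Int32 => None,
        }
    }
}
```

In this model, a `ToyScalar::Dictionary` wrapping `Utf8(Some("foo"))` produces an array whose `data_type` is `Dictionary(Int32, Utf8)`, and reading element 0 back gives the same dictionary scalar, so the type survives the round trip. The cost of this approach is that every `match` on the scalar type would need to handle the extra variant.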