Joris Van den Bossche created ARROW-15545:
---------------------------------------------

             Summary: [C++] Cast dictionary of extension type to extension type
                 Key: ARROW-15545
                 URL: https://issues.apache.org/jira/browse/ARROW-15545
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Joris Van den Bossche


We support casting a DictionaryArray to its dictionary values' type. For 
example:

{code}
>>> arr = pa.array([1, 2, 1]).dictionary_encode()
>>> arr
<pyarrow.lib.DictionaryArray object at 0x7f0c1aca46d0>

-- dictionary:
  [
    1,
    2
  ]
-- indices:
  [
    0,
    1,
    0
  ]

>>> arr.type
DictionaryType(dictionary<values=int64, indices=int32, ordered=0>)
>>> arr.cast(arr.type.value_type)
<pyarrow.lib.Int64Array object at 0x7f0c19891dc0>
[
  1,
  2,
  1
]
{code}

However, if the type of the dictionary values is an ExtensionType, this cast is 
not supported:

{code}
>>> from pyarrow.tests.test_extension_type import UuidType
>>> storage = pa.array([b"0123456789abcdef"], type=pa.binary(16))
>>> arr = pa.ExtensionArray.from_storage(UuidType(), storage)
>>> arr
<pyarrow.lib.ExtensionArray object at 0x7f0c1875bc40>
[
  30313233343536373839616263646566
]
>>> dict_arr = pa.DictionaryArray.from_arrays(pa.array([0, 0], pa.int32()), arr)
>>> dict_arr.type
DictionaryType(dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0>)
>>> dict_arr.cast(UuidType())
...
ArrowNotImplementedError: Unsupported cast from dictionary<values=extension<arrow.py_extension_type<UuidType>>, indices=int32, ordered=0> to extension<arrow.py_extension_type<UuidType>> (no available cast function for target type)
../src/arrow/compute/cast.cc:119  GetCastFunctionInternal(cast_options->to_type, args[0].type().get())

{code}




--
This message was sent by Atlassian Jira
(v8.20.1#820001)
