Hi!
Yes - each chunk can have a different mapping. And yes, you can call
unify_dictionaries() or combine_chunks() if you need a single shared mapping.
Alternatively, you can keep a separate dictionary for each chunk.
There is one thing I would like to have - indices with an arbitrary number of
bits (e.g. 4-bit or 12-bit)
and further,
Thanks!
I read somewhere that the string->int mappings are not guaranteed to be the
same across chunks. Is this correct?
If so, is it necessary to call unify_dictionaries() first?
Also, if the operations only work on chunks, is it up to the user to iterate
through all chunks to create the resulting
Hi!
table.column('a').chunk(0).dictionary returns the dictionary values as an
array that you can map, and table.column('a').chunk(0).indices returns the
integer indices.
You can then construct new dictionary-typed columns from the mapped values
and those indices using pa.DictionaryArray.from_arrays.
BR
J
Sun, 28 Apr 2024 at 20:19 Laurent
Hi,
Is there a way to cast an Array of data type DictionaryType (for example,
I have DictionaryType(dictionary)) into integers (the indices) and retrieve
the mapping (string -> integer)?
I cannot find anything about this in the documentation. For the first ask
(cast to integers), trying to cast