Re: Python: going from DictionaryType to array of integers and str->int mapping?

2024-04-29 Thread Jacek Pliszka
Hi! Yes - each chunk can have a different mapping. And yes, you can use unify_dictionaries/combine_chunks if you need a single shared mapping. Or you can keep a separate dictionary for each chunk as well. There is one thing I would like to have - indices of any bit width (e.g. 4-bit or 12-bit) and further,

Re: Python: going from DictionaryType to array of integers and str->int mapping?

2024-04-28 Thread Laurent Gautier
Thanks! I read somewhere that the string->int mappings are not guaranteed to be the same across chunks. Is this correct? If so, is it necessary to call unify_dictionaries() first? Also, if the operations only work on chunks, is it up to the user to iterate through all chunks to create the resulting

Re: Python: going from DictionaryType to array of integers and str->int mapping?

2024-04-28 Thread Jacek Pliszka
Hi! table.column('a').chunk(0).dictionary returns the dictionary values as an array that you can map... Then you can construct new DictionaryType columns from the mapped values and table.column('a').chunk(0).indices using pa.DictionaryArray.from_arrays. BR, J. On Sun, 28 Apr 2024 at 20:19, Laurent

Python: going from DictionaryType to array of integers and str->int mapping?

2024-04-28 Thread Laurent Gautier
Hi, is there a way to cast an Array of data type DictionaryType (for example, I have DictionaryType(dictionary)) into integers (the indices) and retrieve the mapping (string -> integer)? I cannot find anything about this in the documentation. For the first ask (cast to integers), trying to cast