[ https://issues.apache.org/jira/browse/ARROW-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17600947#comment-17600947 ]
Roberto Lobo edited comment on ARROW-17636 at 9/6/22 7:11 PM:
--------------------------------------------------------------

Using a workaround:
{code:java}
conversion_options['types_mapper'] = _TYPE_MAPPINGS.get
try:
    data = table.to_pandas(**conversion_options)
except NotImplementedError:
    problematic_columns = []
    for tcolumn in list(table.columns):
        if isinstance(tcolumn.type, pa.DictionaryType):
            problematic_columns.append((tcolumn._name, tcolumn, pa.int64()))
    for tcolumn_name, tcolumn, tcolumn_type in problematic_columns:
        table = table.drop([tcolumn_name])
        table = table.append_column(
            pa.field(tcolumn._name,tcolumn_type), [tcolumn.to_pylist()]
        )
    data = table.to_pandas(**conversion_options)
{code}

was (Author: JIRAUSER295439):
Using a workaround:
{code:java}
conversion_options['types_mapper'] = _TYPE_MAPPINGS.get
try:
    data = table.to_pandas(**conversion_options)
except NotImplementedError:
    # FIX NotImplemented that happens when partition column is int
    problematic_columns = []
    for tcolumn in list(table.columns):
        if isinstance(tcolumn.type, pa.DictionaryType):
            if pa.types.is_integer(tcolumn.type.value_type) and pa.types.is_integer(tcolumn.type.index_type):
                problematic_columns.append((tcolumn._name, tcolumn, pa.int64()))
    for tcolumn_name, tcolumn, tcolumn_type in problematic_columns:
        table = table.drop([tcolumn_name])
        table = table.append_column(pa.field(tcolumn._name, tcolumn_type), [tcolumn.to_pylist()])
    data = table.to_pandas(**conversion_options)
    ##############################################################
{code}

> Converting Table to pandas raises NotImplementedError (when table previously
> saved as partitioned parquet dataset)
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17636
>                 URL: https://issues.apache.org/jira/browse/ARROW-17636
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 9.0.0
>         Environment: Docker container, based on continuumio/anaconda3
> Python 3.9.12
> PyArrow 9.0.0
>            Reporter: Roberto Lobo
>            Priority: Major
>
> When converting a table in which one of the columns is of
> DictionaryType (values=int32, indices=int32, ordered=0), the conversion to
> a pandas DataFrame fails with:
> NotImplementedError: dictionary<values=int32, indices=int32, ordered=0>
> The conversion is not yet implemented for this dictionary type.
> This DictionaryType is the type assigned when one of the columns (Int64) is
> used as one of the parquet dataset's partition columns.

--
This message was sent by Atlassian Jira (v8.20.10#820010)