[ https://issues.apache.org/jira/browse/ARROW-12099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555329#comment-17555329 ]
Nick Crews commented on ARROW-12099:
------------------------------------

Small tweak to Guido's implementation (thank you for this!): If the table only has the one ListArray or MapArray column, then it crashes. This handles that case:

{code:python}
import pyarrow as pa
import pyarrow.compute as pc

def explode_table(table, column):
    # Replace null lists with [None] so each null row still yields one output row.
    null_filled = pc.fill_null(table[column], [None])
    flattened = pc.list_flatten(null_filled)
    other_columns = list(table.schema.names)
    other_columns.remove(column)
    if len(other_columns) == 0:
        # No other columns to repeat: the flattened values are the whole result.
        return pa.table({column: flattened})
    else:
        # Repeat each remaining row once per element of its list.
        indices = pc.list_parent_indices(null_filled)
        result = table.select(other_columns).take(indices)
        result = result.append_column(
            pa.field(column, table.schema.field(column).type.value_type),
            flattened,
        )
        return result
{code}

> [Python] Explode array column
> -----------------------------
>
>                 Key: ARROW-12099
>                 URL: https://issues.apache.org/jira/browse/ARROW-12099
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Malthe Borch
>            Priority: Major
>
> In Apache Spark,
> [explode|https://spark.apache.org/docs/latest/api/sql/index.html#explode]
> separates the elements of an array column (or expression) into multiple rows.
> Note that each explode works at the top-level only (not recursively).
> This would also work with the existing
> [flatten|https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten]
> method to allow fully unnesting a
> [pyarrow.StructArray|https://arrow.apache.org/docs/python/generated/pyarrow.StructArray.html#pyarrow-structarray].

--
This message was sent by Atlassian Jira (v8.20.7#820007)