[ https://issues.apache.org/jira/browse/SPARK-23380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon reassigned SPARK-23380:
------------------------------------

    Assignee: Hyukjin Kwon

> Adds a conf for Arrow fallback in toPandas/createDataFrame with Pandas DataFrame
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-23380
>                 URL: https://issues.apache.org/jira/browse/SPARK-23380
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 2.3.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 2.4.0
>
> It seems we can check the schema ahead of time and fall back in {{toPandas}}.
> Please see the case below, where the DataFrame contains a map type that
> Arrow does not support:
> {code}
> df = spark.createDataFrame([[{'a': 1}]])
> spark.conf.set("spark.sql.execution.arrow.enabled", "false")
> df.toPandas()  # works: the non-Arrow path handles the map type
> spark.conf.set("spark.sql.execution.arrow.enabled", "true")
> df.toPandas()  # fails hard instead of falling back
> {code}
> {code}
> ...
> py4j.protocol.Py4JJavaError: An error occurred while calling o42.collectAsArrowToPython.
> ...
> java.lang.UnsupportedOperationException: Unsupported data type: map<string,bigint>
> {code}
> In the case of {{createDataFrame}}, we already fall back so that the call at
> least works, even though the Arrow optimisation is disabled:
> {code}
> df = spark.createDataFrame([[{'a': 1}]])
> spark.conf.set("spark.sql.execution.arrow.enabled", "false")
> pdf = df.toPandas()
> spark.createDataFrame(pdf).show()  # works: Arrow is disabled
> spark.conf.set("spark.sql.execution.arrow.enabled", "true")
> spark.createDataFrame(pdf).show()  # warns, then falls back and still works
> {code}
> {code}
> ...
> ... UserWarning: Arrow will not be used in createDataFrame: Error inferring Arrow type ...
> +--------+
> |      _1|
> +--------+
> |[a -> 1]|
> +--------+
> {code}
> We need to match the two behaviours and add a configuration that controls
> whether to fall back.
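> A minimal sketch of what the matched behaviour could look like from the
> user side. The conf name {{spark.sql.execution.arrow.fallback.enabled}}
> used here is an assumption for illustration; this issue only states that
> some conf is needed:
> {code}
> # Sketch only: the fallback conf name below is assumed, not fixed by this issue.
> df = spark.createDataFrame([[{'a': 1}]])  # map type, unsupported by Arrow
> spark.conf.set("spark.sql.execution.arrow.enabled", "true")
>
> # Fallback enabled: toPandas and createDataFrame should both warn and then
> # succeed via the non-Arrow code path.
> spark.conf.set("spark.sql.execution.arrow.fallback.enabled", "true")
> pdf = df.toPandas()
> spark.createDataFrame(pdf).show()
>
> # Fallback disabled: both should fail fast and surface the Arrow error.
> spark.conf.set("spark.sql.execution.arrow.fallback.enabled", "false")
> df.toPandas()  # raises instead of silently falling back
> {code}
> For checking the schema ahead of time, a pre-check could walk the schema
> and flag types the Arrow serializer cannot handle. The helper below is
> hypothetical and intentionally partial (it only knows about map types):
> {code}
> from pyspark.sql.types import MapType, StructType
>
> def _arrow_unsupported(data_type):
>     # Hypothetical helper: MapType is one known-unsupported type; a real
>     # check would cover every type the Arrow path cannot handle.
>     if isinstance(data_type, MapType):
>         return True
>     if isinstance(data_type, StructType):
>         return any(_arrow_unsupported(f.dataType) for f in data_type.fields)
>     return False
>
> # If any column needs fallback, skip Arrow up front instead of failing late.
> needs_fallback = any(_arrow_unsupported(f.dataType) for f in df.schema.fields)
> {code}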