[ https://issues.apache.org/jira/browse/SPARK-26627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743591#comment-16743591 ]
Hyukjin Kwon commented on SPARK-26627:
--------------------------------------

You should do:

{code}
raw = spark.read.parquet("")
converted = DataFrame(raw._jdf, raw.sql_ctx)
converted.toPandas()
{code}

> sql_ctx loses '_conf' attribute for a pyspark dataframe converted to jdf and
> back
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-26627
>                 URL: https://issues.apache.org/jira/browse/SPARK-26627
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.0
>            Reporter: Tomasz Bartczak
>            Priority: Trivial
>
> Having a pyspark code:
> {code:java}
> raw = spark.read.parquet("")
> converted = DataFrame(raw._jdf, spark)
> converted.toPandas()
> {code}
> What I get when running toPandas is:
> {code:java}
> -> 2079     if self.sql_ctx._conf.pandasRespectSessionTimeZone():
>    2080         timezone = self.sql_ctx._conf.sessionLocalTimeZone()
>    2081     else:
> AttributeError: 'SparkSession' object has no attribute '_conf'
> {code}
> So it looks like after converting the df to its Java version and back, sql_ctx lost the '_conf' attribute.
>
> Of course, in real life this raw._jdf is passed to a specific JVM method to do some operations on it.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
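To see why passing `spark` instead of `raw.sql_ctx` fails, here is a minimal stand-in sketch (the classes `FakeSQLContext`, `FakeSparkSession`, and the function `needs_conf` are hypothetical stand-ins, not real PySpark classes): in Spark 2.4.x, `toPandas()` reads `self.sql_ctx._conf`, which exists on a `SQLContext` but not on a `SparkSession`, so constructing the `DataFrame` with the session as its second argument triggers the `AttributeError`.

```python
# Simplified model of the SPARK-26627 failure (assumed stand-in classes,
# not the real pyspark.sql API).

class FakeSQLContext:
    """Stands in for pyspark.sql.SQLContext, which carries `_conf`."""
    def __init__(self):
        self._conf = {"spark.sql.execution.pandas.respectSessionTimeZone": "true"}

class FakeSparkSession:
    """Stands in for pyspark.sql.SparkSession in 2.4.x: no `_conf` attribute."""
    pass

def needs_conf(sql_ctx):
    """Mimics the attribute access inside DataFrame.toPandas()."""
    return sql_ctx._conf  # AttributeError if given a session instead of a context

# Wrapping with the original DataFrame's sql_ctx works:
print(needs_conf(FakeSQLContext()))

# Wrapping with the SparkSession reproduces the reported error:
try:
    needs_conf(FakeSparkSession())
except AttributeError as e:
    print("AttributeError:", e)
```

This is why `DataFrame(raw._jdf, raw.sql_ctx)` round-trips cleanly while `DataFrame(raw._jdf, spark)` does not: the second constructor argument becomes `self.sql_ctx`, and only a `SQLContext` exposes the `_conf` that `toPandas()` consults.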