[jira] [Updated] (SPARK-18621) PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
[ https://issues.apache.org/jira/browse/SPARK-18621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Romi Kuntsman updated SPARK-18621:
----------------------------------
    Description:
When using Python's repr() on an object, the expected result is a string that Python can evaluate to construct the object.
See: https://docs.python.org/2/library/functions.html#func-repr

However, when getting a DataFrame schema in PySpark, the code (in the "__repr__()" overload methods) returns the string representation for Scala, rather than for Python.

Relevant code in PySpark:
https://github.com/apache/spark/blob/5f02d2e5b4d37f554629cbd0e488e856fffd7b6b/python/pyspark/sql/types.py#L442

Python Code:
{code}
# 1. define object
struct1 = StructType([StructField("f1", StringType(), True)])
# 2. print representation, expected to be like above
print(repr(struct1))
# 3. actual result:
# StructType(List(StructField(f1,StringType,true)))
# 4. try to use result in code
struct2 = StructType(List(StructField(f1,StringType,true)))
# 5. get a bunch of errors
# Unresolved reference 'List'
# Unresolved reference 'f1'
# StringType is class, not constructed object
# Unresolved reference 'true'
{code}

  was:
When using Python's repr() on an object, the expected result is a string that Python can evaluate to construct the object.
See: https://docs.python.org/2/library/functions.html#func-repr

However, when getting a DataFrame schema in PySpark, the code (in the "__repr__()" overload methods) returns the string representation for Scala, rather than for Python.

Relevant code in PySpark:
https://github.com/apache/spark/blob/5f02d2e5b4d37f554629cbd0e488e856fffd7b6b/python/pyspark/sql/types.py#L442

Python Code:
# 1. define object
struct1 = StructType([StructField("f1", StringType(), True)])
# 2. print representation, expected to be like above
print(repr(struct1))
# 3. actual result:
# StructType(List(StructField(f1,StringType,true)))
# 4. try to use result in code
struct2 = StructType(List(StructField(f1,StringType,true)))
# 5. get a bunch of errors
# Unresolved reference 'List'
# Unresolved reference 'f1'
# StringType is class, not constructed object
# Unresolved reference 'true'


> PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18621
>                 URL: https://issues.apache.org/jira/browse/SPARK-18621
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.2, 2.0.2
>            Reporter: Romi Kuntsman
>            Priority: Minor
>
> When using Python's repr() on an object, the expected result is a string that
> Python can evaluate to construct the object.
> See: https://docs.python.org/2/library/functions.html#func-repr
> However, when getting a DataFrame schema in PySpark, the code (in the
> "__repr__()" overload methods) returns the string representation for Scala,
> rather than for Python.
> Relevant code in PySpark:
> https://github.com/apache/spark/blob/5f02d2e5b4d37f554629cbd0e488e856fffd7b6b/python/pyspark/sql/types.py#L442
> Python Code:
> {code}
> # 1. define object
> struct1 = StructType([StructField("f1", StringType(), True)])
> # 2. print representation, expected to be like above
> print(repr(struct1))
> # 3. actual result:
> # StructType(List(StructField(f1,StringType,true)))
> # 4. try to use result in code
> struct2 = StructType(List(StructField(f1,StringType,true)))
> # 5. get a bunch of errors
> # Unresolved reference 'List'
> # Unresolved reference 'f1'
> # StringType is class, not constructed object
> # Unresolved reference 'true'
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
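
For illustration, the behaviour the report asks for can be sketched with small stand-in classes. These are hypothetical, simplified classes that only borrow the names StructType, StructField and StringType; they are not the PySpark implementation and do not show how the issue was actually fixed. The sketch assumes each __repr__() should emit Python constructor syntax, so that eval() of the result rebuilds an equal object.

```python
class StringType:
    """Hypothetical stand-in for a parameterless data type."""

    def __repr__(self):
        # Emit a constructor call, not the bare class name.
        return "StringType()"

    def __eq__(self, other):
        return isinstance(other, StringType)


class StructField:
    """Hypothetical stand-in holding a name, a data type and a nullable flag."""

    def __init__(self, name, dataType, nullable=True):
        self.name = name
        self.dataType = dataType
        self.nullable = nullable

    def __repr__(self):
        # %r quotes the field name and renders the flag as True/False.
        return "StructField(%r, %r, %r)" % (self.name, self.dataType, self.nullable)

    def __eq__(self, other):
        return isinstance(other, StructField) and vars(self) == vars(other)


class StructType:
    """Hypothetical stand-in for a list of fields."""

    def __init__(self, fields=None):
        self.fields = list(fields or [])

    def __repr__(self):
        # A Python list literal instead of Scala's List(...).
        return "StructType([%s])" % ", ".join(repr(f) for f in self.fields)

    def __eq__(self, other):
        return isinstance(other, StructType) and self.fields == other.fields


struct1 = StructType([StructField("f1", StringType(), True)])
text = repr(struct1)
struct2 = eval(text)  # rebuilds the object from its own representation
print(text)
```

With representations of this shape, step 4 of the report would succeed instead of producing unresolved references.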
[jira] [Updated] (SPARK-18621) PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
[ https://issues.apache.org/jira/browse/SPARK-18621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-18621:
---------------------------------
    Labels: bulk-closed  (was: )

> PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18621
>                 URL: https://issues.apache.org/jira/browse/SPARK-18621
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.2, 2.0.2
>            Reporter: Romi Kuntsman
>            Priority: Minor
>              Labels: bulk-closed
>
> When using Python's repr() on an object, the expected result is a string that
> Python can evaluate to construct the object.
> See: https://docs.python.org/2/library/functions.html#func-repr
> However, when getting a DataFrame schema in PySpark, the code (in the
> "__repr__()" overload methods) returns the string representation for Scala,
> rather than for Python.
> Relevant code in PySpark:
> https://github.com/apache/spark/blob/5f02d2e5b4d37f554629cbd0e488e856fffd7b6b/python/pyspark/sql/types.py#L442
> Python Code:
> {code}
> # 1. define object
> struct1 = StructType([StructField("f1", StringType(), True)])
> # 2. print representation, expected to be like above
> print(repr(struct1))
> # 3. actual result:
> # StructType(List(StructField(f1,StringType,true)))
> # 4. try to use result in code
> struct2 = StructType(List(StructField(f1,StringType,true)))
> # 5. get a bunch of errors
> # Unresolved reference 'List'
> # Unresolved reference 'f1'
> # StringType is class, not constructed object
> # Unresolved reference 'true'
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (SPARK-18621) PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
[ https://issues.apache.org/jira/browse/SPARK-18621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen updated SPARK-18621:
---------------------------------
    Labels:   (was: bulk-closed)

> PySQL SQL Types (aka Dataframa Schema) have __repr__() with Scala and not Python representation
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18621
>                 URL: https://issues.apache.org/jira/browse/SPARK-18621
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.2, 2.0.2
>            Reporter: Romi Kuntsman
>            Assignee: Romi Kuntsman
>            Priority: Minor
>             Fix For: 3.3.0
>
>
> When using Python's repr() on an object, the expected result is a string that
> Python can evaluate to construct the object.
> See: https://docs.python.org/2/library/functions.html#func-repr
> However, when getting a DataFrame schema in PySpark, the code (in the
> "__repr__()" overload methods) returns the string representation for Scala,
> rather than for Python.
> Relevant code in PySpark:
> https://github.com/apache/spark/blob/5f02d2e5b4d37f554629cbd0e488e856fffd7b6b/python/pyspark/sql/types.py#L442
> Python Code:
> {code}
> # 1. define object
> struct1 = StructType([StructField("f1", StringType(), True)])
> # 2. print representation, expected to be like above
> print(repr(struct1))
> # 3. actual result:
> # StructType(List(StructField(f1,StringType,true)))
> # 4. try to use result in code
> struct2 = StructType(List(StructField(f1,StringType,true)))
> # 5. get a bunch of errors
> # Unresolved reference 'List'
> # Unresolved reference 'f1'
> # StringType is class, not constructed object
> # Unresolved reference 'true'
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
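
One way to check whether a given object's representation meets the expectation described in the issue is a small round-trip helper. The function name repr_round_trips and the ScalaStyleRepr class below are hypothetical and exist only for this sketch; the helper is demonstrated with a standard-library type rather than with PySpark, and it assumes the object under test supports equality comparison.

```python
from fractions import Fraction


def repr_round_trips(obj, namespace):
    """Return True if eval(repr(obj)) rebuilds an object equal to obj.

    `namespace` maps the names that the representation is allowed to use
    (for example class names) to the corresponding Python objects.
    """
    try:
        rebuilt = eval(repr(obj), dict(namespace))
    except Exception:
        # NameError, SyntaxError, TypeError and similar all mean the
        # representation is not evaluable in this namespace.
        return False
    return rebuilt == obj


class ScalaStyleRepr:
    """Hypothetical object whose repr is the Scala-style string from the report."""

    def __repr__(self):
        return "StructType(List(StructField(f1,StringType,true)))"


# Fraction's repr is a constructor call, so it round-trips.
fraction_ok = repr_round_trips(Fraction(1, 3), {"Fraction": Fraction})

# The Scala-style string refers to names Python cannot resolve.
scala_ok = repr_round_trips(ScalaStyleRepr(), {})

print(fraction_ok, scala_ok)
```

With PySpark installed, the same helper could presumably be given a schema object together with a namespace such as vars(pyspark.sql.types); that usage is an assumption and is not exercised here.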