[ https://issues.apache.org/jira/browse/SPARK-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704393#comment-14704393 ]

Yanbo Liang commented on SPARK-9040:
------------------------------------

[~vnayak053] The code works well on Spark 1.4. Have you tried it on Spark 1.4?

> StructField datatype Conversion Error
> -------------------------------------
>
>                 Key: SPARK-9040
>                 URL: https://issues.apache.org/jira/browse/SPARK-9040
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Spark Core, SQL
>    Affects Versions: 1.3.0
>         Environment: Cloudera 5.3 on CDH 6
>            Reporter: Sandeep Pal
>
> The following issue occurs if I specify the StructFields in a specific order in
> the StructType, as follows:
> fields = [StructField("d", IntegerType(), True),
>           StructField("b", IntegerType(), True),
>           StructField("a", StringType(), True),
>           StructField("c", IntegerType(), True)]
> But the following code works fine:
> fields = [StructField("d", IntegerType(), True),
>           StructField("b", IntegerType(), True),
>           StructField("c", IntegerType(), True),
>           StructField("a", StringType(), True)]
> <ipython-input-27-9d675dd6a2c9> in <module>()
>      18 
>      19 schema = StructType(fields)
> ---> 20 schemasimid_simple = sqlContext.createDataFrame(simid_simplereqfields, schema)
>      21 schemasimid_simple.registerTempTable("simid_simple")
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/context.py in createDataFrame(self, data, schema, samplingRatio)
>     302 
>     303         for row in rows:
> --> 304             _verify_type(row, schema)
>     305 
>     306         # convert python objects to sql data
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
>     986                              "length of fields (%d)" % (len(obj), len(dataType.fields)))
>     987         for v, f in zip(obj, dataType.fields):
> --> 988             _verify_type(v, f.dataType)
>     989 
>     990 _cached_cls = weakref.WeakValueDictionary()
> /usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in _verify_type(obj, dataType)
>     970     if type(obj) not in _acceptable_types[_type]:
>     971         raise TypeError("%s can not accept object in type %s"
> --> 972                         % (dataType, type(obj)))
>     973 
>     974     if isinstance(dataType, ArrayType):
> TypeError: StringType can not accept object in type <type 'int'>
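
The traceback points at the likely cause: _verify_type walks zip(obj, dataType.fields), so each row value is checked against the schema field at the same index, by position rather than by name. Below is a minimal sketch of a reproduction; the RDD contents and variable names are hypothetical stand-ins, not the reporter's actual simid_simplereqfields data:

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

sc = SparkContext("local", "spark-9040-repro")
sqlContext = SQLContext(sc)

# Hypothetical rows standing in for the reporter's data; each tuple is
# positionally (d, b, c, a) -> (int, int, int, str).
rows = sc.parallelize([(1, 2, 3, "x"), (4, 5, 6, "y")])

# Field order does not match the tuples: "a" (StringType) sits at index 2,
# where every tuple holds an int, so _verify_type raises the TypeError above.
bad_schema = StructType([StructField("d", IntegerType(), True),
                         StructField("b", IntegerType(), True),
                         StructField("a", StringType(), True),
                         StructField("c", IntegerType(), True)])
# sqlContext.createDataFrame(rows, bad_schema)
# -> TypeError: StringType can not accept object in type <type 'int'>

# Reordering the fields to match the tuple layout succeeds:
good_schema = StructType([StructField("d", IntegerType(), True),
                          StructField("b", IntegerType(), True),
                          StructField("c", IntegerType(), True),
                          StructField("a", StringType(), True)])
df = sqlContext.createDataFrame(rows, good_schema)
df.registerTempTable("simid_simple")

On this reading the error is expected behavior rather than a bug: the schema is matched to row values positionally, so the field order is part of the contract between the data and the StructType.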


