Sandeep Pal created SPARK-9040:
----------------------------------

             Summary: StructField datatype Conversion Error
                 Key: SPARK-9040
                 URL: https://issues.apache.org/jira/browse/SPARK-9040
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.3.0
         Environment: Cloudera 5.3 on CDH 6
            Reporter: Sandeep Pal


A TypeError is raised when I specify the StructFields in the StructType in 
this specific order:
fields = [StructField("d", IntegerType(), True),
          StructField("b", IntegerType(), True),
          StructField("a", StringType(), True),
          StructField("c", IntegerType(), True)]

But the following code works fine:
fields = [StructField("d", IntegerType(), True),
          StructField("b", IntegerType(), True),
          StructField("c", IntegerType(), True),
          StructField("a", StringType(), True)]

Traceback for the first (failing) field order:

<ipython-input-27-9d675dd6a2c9> in <module>()
     18 
     19 schema = StructType(fields)
---> 20 schemasimid_simple = sqlContext.createDataFrame(simid_simplereqfields, 
schema)
     21 schemasimid_simple.registerTempTable("simid_simple")

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/context.py in 
createDataFrame(self, data, schema, samplingRatio)
    302 
    303         for row in rows:
--> 304             _verify_type(row, schema)
    305 
    306         # convert python objects to sql data

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in 
_verify_type(obj, dataType)
    986                              "length of fields (%d)" % (len(obj), 
len(dataType.fields)))
    987         for v, f in zip(obj, dataType.fields):
--> 988             _verify_type(v, f.dataType)
    989 
    990 _cached_cls = weakref.WeakValueDictionary()

/usr/local/bin/spark-1.3.1-bin-hadoop2.6/python/pyspark/sql/types.py in 
_verify_type(obj, dataType)
    970     if type(obj) not in _acceptable_types[_type]:
    971         raise TypeError("%s can not accept object in type %s"
--> 972                         % (dataType, type(obj)))
    973 
    974     if isinstance(dataType, ArrayType):

TypeError: StringType can not accept object in type <type 'int'>
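
The traceback shows _verify_type pairing row values with schema fields by 
position (zip(obj, dataType.fields)), so one possible explanation is that the 
rows are laid out as (d, b, c, a) and the failing schema puts the StringType 
field in the third slot. The sketch below imitates that positional check in 
plain Python. It is not pyspark code: verify_row, ACCEPTABLE, check and the 
sample row are hypothetical names and data made up for illustration, and the 
row layout is an assumption, since the contents of simid_simplereqfields are 
not shown in this report.

```python
# Hypothetical stand-in for the positional check seen in the traceback.
# Field types are plain strings here instead of pyspark DataType objects.
ACCEPTABLE = {"int": (int,), "string": (str,)}


def verify_row(row, fields):
    """Check each value against the field at the same position."""
    if len(row) != len(fields):
        raise ValueError("length of row (%d) does not match "
                         "length of fields (%d)" % (len(row), len(fields)))
    for value, (name, type_name) in zip(row, fields):
        # None is allowed because every field in the report is nullable.
        if value is not None and type(value) not in ACCEPTABLE[type_name]:
            raise TypeError("%s field %r can not accept object in type %s"
                            % (type_name, name, type(value)))


# Assumed row layout: d, b, c, a (the string value comes last).
row = (1, 2, 3, "x")

failing_fields = [("d", "int"), ("b", "int"), ("a", "string"), ("c", "int")]
working_fields = [("d", "int"), ("b", "int"), ("c", "int"), ("a", "string")]


def check(fields):
    """Return 'ok' if the row passes, otherwise the TypeError text."""
    try:
        verify_row(row, fields)
        return "ok"
    except TypeError as exc:
        return "TypeError: %s" % exc


print(check(working_fields))
print(check(failing_fields))
```

If that assumption holds, the schema order simply has to match the order of 
the values in each row, which would explain why the second field list works.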

