Hi All

I would like to specify a schema when reading a JSON file, but when I try
to map a number to a Double it fails; I also tried FloatType and IntType with no
joy!


When the schema is inferred, customerid is set to String, and I would like to
read it in as a Double instead.

In the example below, df1 comes back corrupted while df2 shows the value as expected.


Also, FYI, I need this to be generic, as I would like to apply it to any JSON;
the schema below is just an example of the issue I am facing.

import org.apache.spark.sql.types.{BinaryType, DecimalType, DoubleType, FloatType, LongType, StringType, StructField, StructType}

val testSchema = StructType(Array(StructField("customerid", DoubleType)))
val df1 = spark.read.schema(testSchema).json(sc.parallelize(Array("""{"customerid":"535137"}""")))
val df2 = spark.read.json(sc.parallelize(Array("""{"customerid":"535137"}""")))
df1.show(1)
df2.show(1)
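
For what it's worth, casting after an inferred read does give me a Double; a rough
sketch is below (assuming every customerid value is a numeric string, otherwise the
cast yields null), but I was hoping schema-on-read could do this directly:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Sketch: let Spark infer the schema first, then cast the string column to Double.
// Assumes the values are numeric strings; anything non-numeric becomes null after the cast.
val dfInferred = spark.read.json(sc.parallelize(Array("""{"customerid":"535137"}""")))
val dfCast = dfInferred.withColumn("customerid", col("customerid").cast(DoubleType))
dfCast.printSchema()
dfCast.show(1)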


Any help would be appreciated; I am sure I am missing something obvious, but
for the life of me I can't tell what it is!


Kind Regards
Sam
