I had the same issue. I resolved it in Java, but I am pretty sure it would work with Scala too. It's kind of a gross hack. Say I had a table in MySQL with 1000 columns: what I did was run a JDBC query to extract the schema of the table, store that schema, and write a map function that builds the StructFields using StructType and RowFactory. Then I loaded that table as a DataFrame, which did have a schema, but I converted that DataFrame into an RDD, which is when it lost the schema. I performed my processing on that RDD and then converted it back using the StructFields. If your source is a structured type it is better to load it directly as a DataFrame, that way you can preserve the schema. In your case, however, you would do something like this:

List<StructField> fields = new ArrayList<StructField>();
for (String key : map.keySet()) {
    fields.add(DataTypes.createStructField(key, DataTypes.StringType, true));
}
StructType schemaOfDataFrame = DataTypes.createStructType(fields);
sqlContext.createDataFrame(rdd, schemaOfDataFrame);
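One thing the snippet above glosses over is that createDataFrame needs a JavaRDD<Row>, so each Map element also has to be turned into a Row whose values follow the schema's field order. Here is a rough, untested sketch of how the whole flow could look against the 1.x Java API (SQLContext / DataFrame); it assumes every map has the same keys, it just stores every value as a string, and the class and variable names are only placeholders:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class MapRddToDataFrame {

    public static DataFrame toDataFrame(SQLContext sqlContext, JavaRDD<Map<String, Object>> rdd) {
        // Build the schema from the key set of the first element,
        // treating every column as a nullable string.
        List<StructField> fields = new ArrayList<StructField>();
        for (String key : rdd.first().keySet()) {
            fields.add(DataTypes.createStructField(key, DataTypes.StringType, true));
        }
        final StructType schema = DataTypes.createStructType(fields);

        // Turn each map into a Row whose values line up with the schema's field order.
        JavaRDD<Row> rows = rdd.map(new Function<Map<String, Object>, Row>() {
            @Override
            public Row call(Map<String, Object> m) {
                StructField[] schemaFields = schema.fields();
                Object[] values = new Object[schemaFields.length];
                for (int i = 0; i < schemaFields.length; i++) {
                    Object v = m.get(schemaFields[i].name());
                    values[i] = (v == null) ? null : v.toString();
                }
                return RowFactory.create(values);
            }
        });

        return sqlContext.createDataFrame(rows, schema);
    }
}

If you need the real types instead of strings you would still have to map each value's class to a Spark DataType, which I know is what you were trying to avoid.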
This is how I would do it in Java; I am not sure about the Scala syntax. Please tell me if that helped.

> On Feb 11, 2016, at 7:20 AM, Fabian Böhnlein <fabian.boehnl...@gmail.com> wrote:
>
> Hi all,
>
> is there a way to create a Spark SQL Row schema based on Scala data types
> without creating a manual mapping?
>
> That's the only example I can find which doesn't require
> spark.sql.types.DataType already as input, but it requires to define them as
> Strings.
>
>     val struct = (new StructType)
>       .add("a", "int")
>       .add("b", "long")
>       .add("c", "string")
>
> Specifically I have an RDD where each element is a Map of 100s of variables
> with different data types which I want to transform to a DataFrame
> where the keys should end up as the column names:
>
>     Map("Amean" -> 20.3, "Asize" -> 12, "Bmean" -> ....)
>
> Is there a different possibility than building a mapping from the values'
> .getClass to the Spark SQL DataTypes?
>
> Thanks,
> Fabian