See the comment for the createDataFrame(rowRDD: RDD[Row], schema: StructType)
method:
* Creates a [[DataFrame]] from an [[RDD]] containing [[Row]]s using the given schema.
* It is important to make sure that the structure of every [[Row]] of the provided RDD matches
* the provided schema.
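To make the point in that comment concrete, here is a minimal sketch of turning an RDD of 4-tuples into an RDD[Row] and passing it to createDataFrame. It assumes Spark 2.x or later with a SparkSession; on 1.x the same createDataFrame(rowRDD, schema) call should be available on sqlContext. Only the "cty" field name comes from the thread. The object name, the helper toDataFrame, and the other three column names are placeholders, not the poster's actual code.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object TupleRddToDataFrame {
  // Only "cty" appears in the thread; col2..col4 are placeholder names.
  val schema: StructType = StructType(
    StructField("cty", StringType, nullable = false) ::
    StructField("col2", StringType, nullable = false) ::
    StructField("col3", StringType, nullable = false) ::
    StructField("col4", StringType, nullable = false) :: Nil
  )

  // Hypothetical helper: wrap each tuple in a Row whose field order and
  // types match the schema, then hand the RDD[Row] to createDataFrame.
  def toDataFrame(spark: SparkSession,
                  tuples: RDD[(String, String, String, String)]): DataFrame = {
    val rowRDD: RDD[Row] = tuples.map { case (a, b, c, d) => Row(a, b, c, d) }
    spark.createDataFrame(rowRDD, schema)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session purely for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("tuple-rdd-to-df")
      .getOrCreate()

    // Stand-in for the jsonGzip RDD described in the thread.
    val tuples = spark.sparkContext.parallelize(Seq(
      ("a1", "b1", "c1", "d1"),
      ("a2", "b2", "c2", "d2")
    ))

    val df = toDataFrame(spark, tuples)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

The key step is the map from tuple to Row: createDataFrame with an explicit StructType expects RDD[Row], so an RDD of plain tuples will not match that overload.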
Hi,
I have an RDD:
jsonGzip
res3: org.apache.spark.rdd.RDD[(String, String, String, String)] = MapPartitionsRDD[8] at map at <console>:65
which I want to convert to a DataFrame with a schema,
so I created a schema:
val schema =
StructType(
StructField("cty", StringType, false) ::
I might be missing your point, but I don't get it.
My understanding is that I need an RDD containing Rows, but how do I get it?
I started with a DataFrame, ran a map on it, and got an
RDD[(String, String, String, String)]. Now I want to convert it back to a
DataFrame, and it is failing.
Why?
On Sun, Dec 20,
Got it to work, thanks
On Sun, 20 Dec 2015 at 17:01 Eran Witkon wrote:
> I might be missing you point but I don't get it.
> My understanding is that I need a RDD containing Rows but how do I get it?
>
> I started with a DataFrame
> run a map on it and got the RDD