OK, thanks.
I have another approach that currently works, but it is not efficient if I have
to extract a lot of fields, since it creates a UDF for each extraction:
df = df.withColumn("foo", getfoo.apply(col("jsonCol")))
       .withColumn("bar", getbar.apply(col("jsonCol")));
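For comparison, a minimal single-pass sketch using Spark's built-in from_json
instead of one UDF per field; the foo/bar names and String types are assumptions
taken from the example JSON in this thread:

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Assumed schema for the example jsonCol: {"foo": "val1", "bar": "val2"}
val jsonSchema = StructType(Seq(
  StructField("foo", StringType, nullable = true),
  StructField("bar", StringType, nullable = true)
))

// Parse jsonCol once, then pull each field out of the resulting struct.
val parsed = df
  .withColumn("parsed", from_json(col("jsonCol"), jsonSchema))
  .withColumn("foo", col("parsed.foo"))
  .withColumn("bar", col("parsed.bar"))
  .drop("parsed")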
On Fri, Jul 19, 2019 at 8:54 PM
You can try to split the {"foo": "val1", "bar": "val2"} string as below.
/*
This is an example of output!
(c1003d93-5157-4092-86cf-0607157291d8,{"rowkey":"c1003d93-5157-4092-86cf-0607157291d8","ticker":"TSCO","timeissued":"2019-07-01T09:10:55","price":395.25})

example of jsonCol (String):
{"foo": "val1", "bar": "val2"}
*/
Thanks,
On Fri, Jul 19, 2019 at 3:57 PM Mich Talebzadeh wrote:
Sure.
Do you have an example of a record from Cassandra read into df by any
chance? Only columns that need to go into Oracle.
df.select('col1, 'col2, 'jsonCol).take(1).foreach(println)
HTH
Dr Mich Talebzadeh
Thanks for the reply,
my situation is a little different from your sample.
Following is the schema from the source (df.printSchema()):
root
|-- id: string (nullable = true)
|-- col1: string (nullable = true)
|-- col2: string (nullable = true)
|-- jsonCol: string (nullable = true)
I want to extract the JSON fields (foo, bar) from jsonCol into separate columns.
Hi Richard,
You can use the following to read JSON data into a DF. The example reads
JSON from a Kafka topic:
val sc = spark.sparkContext
import spark.implicits._
// Use map to create the new RDD using the value portion of the pair.
val jsonRDD =
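The snippet is truncated here in the archive; a minimal sketch of the pattern it
describes, assuming a pair RDD named kafkaRDD (a hypothetical name) whose values
are JSON strings:

// kafkaRDD is assumed: an RDD[(String, String)] of (key, value) pairs from the topic.
// Keep only the value portion, i.e. the JSON string.
val jsonRDD = kafkaRDD.map(_._2)

// Build a DataFrame by letting Spark infer the schema from the JSON strings.
val jsonDF = spark.read.json(spark.createDataset(jsonRDD))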
Let's say I use Spark to migrate some data from a Cassandra table to an Oracle
table.
Cassandra Table:
CREATE TABLE SOURCE(
id UUID PRIMARY KEY,
col1 text,
col2 text,
jsonCol text
);
example jsonCol value: {"foo": "val1", "bar": "val2"}
I am trying to extract fields from the JSON column while importing.