My DataFrame has the following schema:
root
 |-- data: struct (nullable = true)
 |    |-- zoneId: string (nullable = true)
 |    |-- deviceId: string (nullable = true)
 |    |-- timeSinceLast: long (nullable = true)
 |-- date: date (nullable = true)

 
How can I use writeStream with the Parquet format to write the data
(zoneId, deviceId, and timeSinceLast, but not date) partitioned by date?
I tried the following code, but the partitionBy clause did not work:

val query1 = df1
      .writeStream
      .format("parquet")
      .option("path", "/Users/abc/hb_parquet/data")
      .option("checkpointLocation", "/Users/abc/hb_parquet/checkpoint")
      .partitionBy("data.zoneId")
      .start()
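
One direction that might be worth trying, offered as a sketch rather than a
confirmed fix: as far as I understand, partitionBy expects top-level column
names, so a nested field such as "data.zoneId" is not resolved. The sketch
below assumes df1 is the streaming DataFrame with the schema above; it
promotes the fields of data to top-level columns and partitions by date.
The helper name writePartitionedByDate is hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper; df1 is assumed to be the streaming DataFrame
// with the schema shown above (data: struct, date: date).
def writePartitionedByDate(df1: DataFrame): StreamingQuery = {
  // Expand the struct so zoneId, deviceId and timeSinceLast become
  // top-level columns, and keep date so it can be used for partitioning.
  val flattened = df1.select(col("data.*"), col("date"))

  flattened.writeStream
    .format("parquet")
    .option("path", "/Users/abc/hb_parquet/data")
    .option("checkpointLocation", "/Users/abc/hb_parquet/checkpoint")
    .partitionBy("date") // a top-level column, unlike "data.zoneId"
    .start()
}
```

With the file sink, the partition column is normally encoded in the
directory names (date=.../) rather than stored inside the Parquet files, so
the files themselves should hold only zoneId, deviceId, and timeSinceLast.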


