Qian,

It looks like you are using the Spark Structured Streaming DataStreamWriter
(https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/streaming/DataStreamWriter.html)
and not the Spark DataSource writer. To use the Spark DataSource, see the
example at https://hudi.apache.org/writing_data.html#datasource-writer.

DataStreamWriter is a different API (data.writeStream ... start()) which,
IIUC, doesn't work interchangeably with the DataSource writer
(data.write ... save()).
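
For reference, here is a minimal sketch of what the batch DataSource write
path might look like, following the pattern in the Hudi docs linked above.
Treat it as an illustration rather than a tested job: the Kafka broker
address, the topic name, the HDFS path, and the column names (record_key,
payload, partition_path, ts) are all assumptions I made up for the example,
and you would replace them with fields from your own data. It also assumes
the Hudi Spark bundle and the spark-sql-kafka package are on the classpath.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.col

object HudiBatchWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hudi-datasource-write-sketch")
      // The Hudi docs recommend Kryo serialization for Spark jobs.
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()

    // Hypothetical target location; substitute your own base path.
    val fileLocation = "hdfs:///tmp/hudi_ro_table"

    // Batch (non-streaming) read from Kafka: spark.read, not spark.readStream.
    // Broker address and topic name are placeholders.
    val data = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()
      // Hudi needs a non-null record key, so drop messages without one.
      .where(col("key").isNotNull)
      .select(
        col("key").cast("string").as("record_key"),      // assumed record key
        col("value").cast("string").as("payload"),       // raw message body
        col("topic").as("partition_path"),               // assumed partition field
        col("timestamp").cast("long").as("ts"))          // assumed precombine field

    // Batch DataSource write: write + mode + save, instead of
    // writeStream + outputMode + start.
    data.write
      .format("org.apache.hudi")
      .option("hoodie.table.name", "hudi_ro_table")
      .option("hoodie.datasource.write.recordkey.field", "record_key")
      .option("hoodie.datasource.write.partitionpath.field", "partition_path")
      .option("hoodie.datasource.write.precombine.field", "ts")
      .mode(SaveMode.Append)
      .save(fileLocation)

    spark.stop()
  }
}
```

Note that a batch Kafka read without explicit offsets reads the topic from
the earliest to the latest offset on every run, so in practice you would
probably want to bound the offsets yourself or schedule the job accordingly.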

Thanks,
Nishith

On Mon, Oct 28, 2019 at 3:24 PM Qian Wang <qwang1...@gmail.com> wrote:

> Hi All,
>
> I tried to use the DataSource writer to read streaming data from a Kafka topic
> and write it to a Hudi dataset on HDFS. I used the following code:
>
> val output = data
>    .writeStream
>    .trigger(Trigger.ProcessingTime("300 seconds"))
>    .format("org.apache.hudi")
>    .option("hoodie.table.name", "hudi_ro_table")
>    .outputMode("append")
>    .option("path", fileLocation)
>    .option("checkpointLocation", s"${fileLocation}_chpk")
>    .start()
>
> However, when I run this Spark job, it does not write anything to HDFS. Can
> anyone tell me how to do that? Thanks.
>
> Best,
> Eric
>
