Hi Nishith,

Thanks for the reply.
I did use the Datasource Writer to write, rather than the DataStreamWriter. I think the Datasource Writer can also write streaming data, correct?

Best,
Qian

On Oct 28, 2019, 9:31 PM -0700, nishith agarwal <[email protected]>, wrote:
> Qian,
>
> It seems like you are using the
> https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/streaming/DataStreamWriter.html
> and not the spark DataSource. To use the spark datasource, look at an example
> here https://hudi.apache.org/writing_data.html#datasource-writer.
>
> DataStreamWriters are a different set of APIs which IIUC don't work
> interchangeably with DataSource.
>
> Thanks,
> Nishith
>
> On Mon, Oct 28, 2019 at 3:24 PM Qian Wang <[email protected]> wrote:
>
> > Hi All,
> >
> > I tried to use Datasource Writer to read streaming data from a Kafka topic
> > and write to a Hudi dataset on HDFS. I used the following code:
> >
> > val output = data
> >   .writeStream
> >   .trigger(Trigger.ProcessingTime("300 seconds"))
> >   .format("org.apache.hudi")
> >   .option("hoodie.table.name", "hudi_ro_table")
> >   .outputMode("append")
> >   .option("path", fileLocation)
> >   .option("checkpointLocation", s"${fileLocation}_chpk")
> >   .start()
> >
> > However, when I run this Spark job it does not write anything to HDFS. Can
> > anyone tell me how to do that? Thanks.
> >
> > Best,
> > Eric
