Thanks for the reply!
Could you help explain the two questions I asked above?

On Tue, Dec 1, 2020 at 5:17 PM, Trevor <wowtua...@gmail.com> wrote:

> Hi songj,
>
> DeltaStreamer can be understood as a packaged Spark DataSource. You only
> need to set the required parameters, which makes data ingestion more
> convenient.
>
> Best,
>
> Trevor
>
>
> wowtua...@gmail.com
>
> From: songj songj
> Date: 2020-12-01 16:48
> To: dev
> Subject: Re: why not use spark datasource in DeltaStreamer
> If Spark Structured Streaming consumes Kafka through the Kafka data source
> and uses foreachBatch to do insert/upsert/... into Hudi,
> is that similar to DeltaStreamer?
>
> On Tue, Dec 1, 2020 at 4:28 PM, songj songj <songjun...@gmail.com> wrote:
>
> > Hi, I have some questions:
> >
> > 1. DeltaStreamer has its own Source<JavaRDD<String>> to consume source
> > data such as Kafka. Why not use a Spark data source directly?
> >
> > 2. Hudi has a lot of logic that uses RDDs. Why not use Spark DataFrames?
> >
> > I just want to know the background of the above implementation, thanks!
> >
>
