Thanks for the reply! Could you help explain my two earlier questions?

Trevor <wowtua...@gmail.com> wrote on Tue, Dec 1, 2020 at 5:17 PM:
> Hi songj,
>
> DeltaStreamer can be understood as a packaged Spark DataSource. You only
> need to set the required parameters, which makes it more convenient for
> data ingestion.
>
> Best,
>
> Trevor
>
> wowtua...@gmail.com
>
> From: songj songj
> Date: 2020-12-01 16:48
> To: dev
> Subject: Re: why not use spark datasource in DeltaStreamer
>
> Spark Structured Streaming can consume Kafka using the Kafka data source
> and use foreachBatch to insert/upsert/... into Hudi.
> Is that similar to DeltaStreamer?
>
> songj songj <songjun...@gmail.com> wrote on Tue, Dec 1, 2020 at 4:28 PM:
>
> > Hi, I have some questions:
> >
> > 1. DeltaStreamer has its own Source<JavaRDD<String>> to consume source
> > data, such as Kafka. Why not use a Spark DataSource directly?
> >
> > 2. Hudi has a lot of logic that uses RDDs. Why not use Spark DataFrames?
> >
> > I just want to know the background of the above implementation, thanks!