[DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-18 Thread linshan
Hi team, I need help. After a few days of thinking and trial and error, I am stuck. I wrote up the relevant information on this page; please follow this link (https://issues.apache.org/jira/browse/HUDI-1126). Best, linshan-ma

Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Balaji Varadarajan
Hi linshan, Sorry for the delay in responding. It is better to discuss code changes over a draft PR. Can you open one and tag us there? At a high level, it looks like you are using the Spark DataSource V2 APIs, while the structured streaming write is currently implemented using the V1 API. Let's discuss t

Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Vinoth Chandar
I would like all these new things to be revamped on top of Spark 3's newer APIs (it's kind of frustrating that the datasource APIs don't stabilize easily in Spark). I am thinking we can implement a "hudi3" format using Spark 3, with support for SQL Merges, existing functionality and a redone Spark S