Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…changed data - my suggestion to use dbtable with an inline view
2. parallelism - use numPartitions, lowerBound, upperBound to generate the number of partitions

HTH

On Wed, Jan 4, 2017 at 3:46 AM, Yuanzhe Yang <yyz1...@gmail.com> wrote:
> Hi Ayan, …
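Ayan's two suggestions can be sketched as a set of JDBC read options; the connection URL, table, and column names below are hypothetical placeholders, not from the thread:

```python
# 1. the inline view passed as dbtable selects only the changed data;
# 2. partitionColumn/lowerBound/upperBound/numPartitions control parallelism.
# All names here (URL, table, columns) are hypothetical.
inline_view = "(SELECT id, payload, inserted_on FROM events) AS t"

jdbc_options = {
    "url": "jdbc:postgresql://dbhost:5432/mydb",  # placeholder connection string
    "dbtable": inline_view,       # inline view instead of a bare table name
    "partitionColumn": "id",      # numeric column Spark splits ranges on
    "lowerBound": "1",
    "upperBound": "1000000",
    "numPartitions": "8",         # 8 parallel JDBC connections
}

# With a live SparkSession this would become:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
```

The inline view must be parenthesized and aliased so the database accepts it as a derived table in the generated `SELECT ... FROM (...) AS t` query.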

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…lowerBound option.

Essentially, you want to create a query like

select * from table where INSERTED_ON > lowerBound and INSERTED_ON … every time you run the job

On Wed, Jan 4, 2017 at 2:13 AM, Yuanzhe Yang <yyz1...@gmail.com> wrote: …
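The per-run query Ayan describes can be sketched as a small helper that filters on the timestamp column and carries a watermark forward between runs (table and column names are hypothetical):

```python
# Build the incremental query for one job run: select rows newer than the
# watermark recorded by the previous run. Names are hypothetical.
def incremental_query(table, ts_column, watermark):
    # Parenthesized and aliased so it can be passed as the dbtable option.
    return f"(SELECT * FROM {table} WHERE {ts_column} > '{watermark}') AS incr"

watermark = "2017-01-03 00:00:00"
query = incremental_query("events", "INSERTED_ON", watermark)
# Each run would execute `query` via the JDBC source, then persist the new
# maximum INSERTED_ON value as the watermark for the next run.
```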

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…map task to grab data from the DB.

In Spark, you can use the sqlContext.load function for JDBC and use partitionColumn and numPartitions to define the parallelism of the connection.

Best
Ayan

On Tue, Jan 3, 2017 at 10:49 PM, Yuanzhe Yang <yyz1...@gmail.com> wrote: …
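Under the hood, the partition options turn into one range predicate per partition, each fetched over its own connection by a separate task. A simplified sketch of that stride logic (the real Spark planner differs in detail, e.g. in how it handles the boundary partitions):

```python
# Simplified version of how a JDBC source can derive one WHERE clause per
# partition from partitionColumn / lowerBound / upperBound / numPartitions.
def partition_predicates(column, lower, upper, num_partitions):
    stride = (upper - lower) // num_partitions
    preds = []
    current = lower
    for i in range(num_partitions):
        if i == num_partitions - 1:
            # Last slice is open-ended so no rows above upperBound are lost.
            preds.append(f"{column} >= {current}")
        else:
            preds.append(f"{column} >= {current} AND {column} < {current + stride}")
            current += stride
    return preds

preds = partition_predicates("id", 0, 100, 4)
# → ["id >= 0 AND id < 25", "id >= 25 AND id < 50",
#    "id >= 50 AND id < 75", "id >= 75"]
```

Note that lowerBound/upperBound only shape the partitioning; they do not filter rows out of the result.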

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…Amrit Jangid <amrit.jan...@goibibo.com> wrote:
> You can try out *debezium*: https://github.com/debezium. It reads data from binlogs, provides structure, and streams into Kafka.
>
> Now Kafka can be your new source for streaming.
>
> O…
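For context, a Debezium MySQL connector is registered with Kafka Connect via a JSON config along these lines; the host, credentials, and table names below are placeholders, and property names should be checked against the Debezium documentation for the version in use:

```json
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "dbhost",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "secret",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "table.whitelist": "mydb.events",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.mydb"
  }
}
```

Once running, each captured table's change events land in their own Kafka topic, which Spark can then consume.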

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…streams into Kafka.

Now Kafka can be your new source for streaming.

On Tue, Jan 3, 2017 at 4:36 PM, Yuanzhe Yang <yyz1...@gmail.com> wrote:
> Hi Hongdi,
>
> Thanks a lot for your suggestion. The data is truly immutable and the table is append-only. …

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…Kafka and then consume it by Spark streaming?

On Fri, Dec 30, 2016 at 9:01 AM, Michael Armbrust <mich...@databricks.com> wrote:
> We don't support this yet, but I've opened this JIRA as it sounds generally useful: https://issues.apache.org/jira/browse/SPARK-19031 …

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
We don't support this yet, but I've opened this JIRA as it sounds generally useful: https://issues.apache.org/jira/browse/SPARK-19031

In the meantime you could try implementing your own Source, but that is pretty low level and is not yet a stable API. …

Re: [Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2017-01-03 Thread Yuanzhe Yang
…https://issues.apache.org/jira/browse/SPARK-19031

In the meantime you could try implementing your own Source, but that is pretty low level and is not yet a stable API.

On Thu, Dec 29, 2016 at 4:05 AM, "Yuanzhe Yang (杨远哲)" <yyz1...@gmail.com> wrote:
> Hi all, …
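Until SPARK-19031 lands, the idea behind a hand-rolled incremental source is essentially a polling loop that remembers the last timestamp it saw. A self-contained sketch of that loop, with SQLite standing in for the real JDBC database and all names hypothetical:

```python
import sqlite3

# Stand-in for the real database: an append-only table with an insertion time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, inserted_on TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "2016-12-29 10:00"), (2, "2016-12-29 11:00")])

def poll_new_rows(conn, watermark):
    """Fetch rows newer than the watermark; return them plus the new watermark."""
    rows = conn.execute(
        "SELECT id, inserted_on FROM events WHERE inserted_on > ? "
        "ORDER BY inserted_on", (watermark,)).fetchall()
    new_watermark = rows[-1][1] if rows else watermark
    return rows, new_watermark

# First poll picks up only rows past the initial watermark.
rows, wm = poll_new_rows(conn, "2016-12-29 10:30")

# Rows appended after a poll are picked up by the next one.
conn.execute("INSERT INTO events VALUES (3, '2016-12-29 12:00')")
rows, wm = poll_new_rows(conn, wm)
```

A real Source built on this pattern would additionally need to persist the watermark for fault tolerance, which is exactly the kind of offset bookkeeping the Source API formalizes.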

[Spark Structured Streaming]: Is it possible to ingest data from a jdbc data source incrementally?

2016-12-29 Thread Yuanzhe Yang (杨远哲)
Hi all,

Thanks a lot for your contributions to bring us new technologies. I don't want to waste your time, so before I write to you, I googled and checked Stack Overflow and the mailing list archive with the keywords "streaming" and "jdbc". But I was not able to find any solution to my use case. I hope I …