Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-24 Thread Karthik Reddy Vadde
2:38 AM To: Tathagata Das, "ymaha...@snappydata.io", "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries Hi, Did any of you think about writing a custom foreach sink

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-24 Thread kant kodali
Tathagata Das, "ymaha...@snappydata.io", "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries @Silvio Thought about duplicating rows but dropped the idea for increa

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-24 Thread Silvio Fiorito
"ymaha...@snappydata.io", "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries @Silvio Thought about duplicating rows but dropped the idea because it increases memory use. forEachBatch sounds interesting! O

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-24 Thread kant kodali
...@snappydata.io" <ymaha...@snappydata.io>, "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries understand each row has a topic column but can we write

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-23 Thread Silvio Fiorito
@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries understand each row has a topic column but can we write one row to multiple topics? On Thu, Jul 12, 2018 at 11:00 AM, Arun Mahadevan <ar...@apache.org>

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-23 Thread kant kodali
sinks like Kafka and you need to write the custom logic yourself and you cannot scale the partitions for the sinks independently. [1] https://spark.apache.org/docs/2.1.2/api/java/org/apache/spark/sql/ForeachWriter.html From: chandan prakash

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread Arun Mahadevan
n Iyer Cc: Tathagata Das, "ymaha...@snappydata.io", "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries Thanks a lot Arun for your response. I got your point that existing sink plugins like Kafka, etc

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread chandan prakash
Date: Thursday, July 12, 2018 at 2:38 AM To: Tathagata Das, "ymaha...@snappydata.io", "priy...@asperasoft.com", "user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries Hi, Did anyone of you though

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread Arun Mahadevan
"user @spark" Subject: Re: [Structured Streaming] Avoiding multiple streaming queries Hi, Did any of you think about writing a custom foreach sink writer that can decide which record should go to which sink (based on some marker in the record, which we can possibly

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread chandan prakash
Hi, Did any of you think about writing a custom foreach sink writer that can decide which record should go to which sink (based on some marker in the record, which we can possibly annotate during transformation) and then write to the specific sink accordingly? This will mean that: 1. every custom
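The routing idea in this message can be modelled in plain Python. This is only a sketch, not Spark code: the class name `RoutingWriter`, the `"sink"` marker field, the `"payload"` field, and the list-backed sinks are all hypothetical. The open/process/close method shape is borrowed from ForeachWriter purely for illustration.

```python
class RoutingWriter:
    """Dispatch each record to a sink chosen by a marker field (hypothetical)."""

    def __init__(self, sinks):
        # sinks maps a marker value to a callable that accepts the payload.
        self.sinks = sinks

    def open(self, partition_id, epoch_id):
        # A real writer would open connections here; this model has none.
        return True

    def process(self, record):
        # Look up the sink named by the record's marker and hand over the payload.
        self.sinks[record["sink"]](record["payload"])

    def close(self, error):
        # Re-raise any failure reported by the caller so it is not swallowed.
        if error is not None:
            raise error


# In-memory lists stand in for real sinks (assumption for the sketch).
kafka_out, mysql_out = [], []
writer = RoutingWriter({"kafka": kafka_out.append, "mysql": mysql_out.append})

writer.open(partition_id=0, epoch_id=0)
for rec in [
    {"sink": "kafka", "payload": "event-a"},
    {"sink": "mysql", "payload": "event-b"},
    {"sink": "kafka", "payload": "event-c"},
]:
    writer.process(rec)
writer.close(None)
```

One writer instance owns every sink connection in this model, so the sinks cannot be scaled independently of each other.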

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-02-14 Thread Tathagata Das
Of course, you can write to multiple Kafka topics from a single query. If your dataframe that you want to write has a column named "topic" (along with "key", and "value" columns), it will write the contents of a row to the topic in that row. This automatically works. So the only thing you need to
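The per-row routing described in this message can be modelled in plain Python. This is a minimal sketch of the idea, not Spark code: the function name `route_by_topic` and the sample rows are hypothetical, and a dict of lists stands in for the Kafka topics.

```python
from collections import defaultdict


def route_by_topic(rows):
    """Group (key, value) pairs by the topic named in each row."""
    per_topic = defaultdict(list)
    for row in rows:
        # The destination comes from the row itself, not from a global setting.
        per_topic[row["topic"]].append((row["key"], row["value"]))
    return dict(per_topic)


# Hypothetical rows shaped like the "topic", "key", "value" columns.
rows = [
    {"topic": "orders", "key": "o-1", "value": "created"},
    {"topic": "payments", "key": "p-7", "value": "settled"},
    {"topic": "orders", "key": "o-2", "value": "shipped"},
]
routed = route_by_topic(rows)
```

Each row lands in exactly one topic in this model. Sending the same row to several topics requires duplicating it first.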

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-02-13 Thread Yogesh Mahajan
I had a similar issue, and I think that's where the Structured Streaming design falls short. It seems like Question #2 in your email is a viable workaround for you. In my case, I have a custom Sink backed by an efficient in-memory column store suited for fast ingestion. I have a Kafka stream coming from

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-02-13 Thread dcam
Hi Priyank, I have a similar structure, although I am reading from Kafka and sinking to multiple MySQL tables. My input stream has multiple message types and each is headed for a different MySQL table. I've looked for a solution for a few months, and have only come up with two alternatives: 1.

[Structured Streaming] Avoiding multiple streaming queries

2018-02-12 Thread Priyank Shrivastava
I have a Structured Streaming query which sinks to Kafka. This query has complex aggregation logic. I would like to sink the output DF of this query to multiple Kafka topics, each partitioned on a different 'key' column. I don't want to have multiple Kafka sinks for each of the different
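One approach raised later in the thread is to duplicate each output row once per target topic, picking a different key column each time. The plain-Python sketch below models that fan-out. It is not Spark code and rests on assumptions: the function name `fan_out`, the `TARGETS` list, the column names, and the JSON-encoded value are all hypothetical.

```python
import json


def fan_out(row, targets):
    """Yield one Kafka-style record per (topic, key_column) target."""
    # Every copy carries the same serialized row as its value.
    value = json.dumps(row, sort_keys=True)
    for topic, key_column in targets:
        yield {"topic": topic, "key": str(row[key_column]), "value": value}


# Hypothetical targets: topic name paired with the column to key on.
TARGETS = [("by_user", "user_id"), ("by_region", "region")]

row = {"user_id": 42, "region": "emea", "total": 9.5}
records = list(fan_out(row, TARGETS))
```

The cost is the one noted in the replies: the record count, and so the memory held per batch, grows with the number of targets.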