Re: Multiple queries on same stream

2017-08-09 Thread Tathagata Das
It's important to note that running multiple streaming queries, as of today, reads the input data once per query, i.e. that many times. So there is a trade-off between the two approaches: even though scenario 1 won't get great Catalyst optimization, it may be more efficient overall in terms of resource
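
A toy model of that trade-off, in plain Python rather than Spark. It assumes scenario 1 evaluates all rules in a single pass over the input and scenario 2 runs one query per rule, which is how the message reads but is not spelled out here. `CountingSource`, the rule names, and both helper functions are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Rule = Callable[[int], bool]


class CountingSource:
    """Hypothetical stand-in for a stream source; counts records handed out."""

    def __init__(self, records: Iterable[int]) -> None:
        self.records: List[int] = list(records)
        self.records_read = 0

    def read(self) -> Iterator[int]:
        for record in self.records:
            self.records_read += 1
            yield record


def one_query_per_rule(source: CountingSource, rules: Dict[str, Rule]) -> Dict[str, List[int]]:
    # Assumed scenario 2: every rule is its own query and scans the input itself.
    return {name: [r for r in source.read() if rule(r)] for name, rule in rules.items()}


def all_rules_in_one_pass(source: CountingSource, rules: Dict[str, Rule]) -> Dict[str, List[int]]:
    # Assumed scenario 1: a single scan, with every rule applied to each record.
    matches: Dict[str, List[int]] = {name: [] for name in rules}
    for record in source.read():
        for name, rule in rules.items():
            if rule(record):
                matches[name].append(record)
    return matches


rules: Dict[str, Rule] = {
    "even": lambda x: x % 2 == 0,
    "large": lambda x: x > 7,
    "small": lambda x: x < 3,
}

multi = CountingSource(range(10))
single = CountingSource(range(10))

multi_result = one_query_per_rule(multi, rules)
single_result = all_rules_in_one_pass(single, rules)

print(multi.records_read, single.records_read)
```

Both approaches produce the same matches; only the number of reads from the source differs, growing with the number of rules in the one-query-per-rule case. The model says nothing about what each read costs or what the optimizer saves per query, so it only shows one side of the trade-off.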

Re: Multiple queries on same stream

2017-08-09 Thread Jörn Franke
This is not easy to say without testing. It depends on the type of computation, among other things, and also on the Spark version. Generally, vectorization / SIMD could be much faster if Spark / the JVM applies it in scenario 2.

> On 9. Aug 2017, at 07:05, Raghavendra Pandey