This is hard to say without testing. It depends on the type of computation, 
among other things, and also on the Spark version. In general, option 2 (the 
Spark SQL DSL queries) could be much faster if Spark / the JVM applies 
vectorization / SIMD to it.
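
To make the comparison concrete, here is a rough, Spark-free sketch of the two shapes from the question. The rule names and sample rows are invented for illustration, and the plain Python loops only stand in for the streaming query; it shows the structural difference, not the performance.

```python
# Hypothetical rules and rows, invented for illustration only.
rules = {
    "high_amount": lambda row: row["amount"] > 100,
    "foreign": lambda row: row["country"] != "US",
}

rows = [
    {"id": 1, "amount": 250, "country": "US"},
    {"id": 2, "amount": 40, "country": "DE"},
    {"id": 3, "amount": 500, "country": "FR"},
]


def option1_single_pass(rows, rules):
    """Option 1: one pass over the data, all rules applied to each row."""
    matches = {name: [] for name in rules}
    for row in rows:
        for name, predicate in rules.items():
            if predicate(row):
                matches[name].append(row["id"])
    return matches


def option2_query_per_rule(rows, rules):
    """Option 2: one independent pass per rule, like one query per rule."""
    return {
        name: [row["id"] for row in rows if predicate(row)]
        for name, predicate in rules.items()
    }


result1 = option1_single_pass(rows, rules)
result2 = option2_query_per_rule(rows, rules)
```

Both functions return the same matches; what differs is who sees the rule logic. In option 1 the rules live in an arbitrary per-row function that Catalyst cannot look inside, while in option 2 each rule is an expression the optimizer can analyze, at the cost of running several queries. Which one wins still has to be measured.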

> On 9. Aug 2017, at 07:05, Raghavendra Pandey <raghavendra.pan...@gmail.com> 
> wrote:
> 
> I am using structured streaming to evaluate multiple rules on the same 
> running stream. 
> I have two options to do that. One is to use forEach and evaluate all the 
> rules on each row. 
> The other option is to express the rules in the Spark SQL DSL and run 
> multiple queries. 
> I was wondering if option 1 will result in better performance even though I 
> can get Catalyst optimization in option 2.
> 
> Thanks 
> Raghav 
