RE: How to do dispatching in Streaming?

2015-04-17 Thread Evo Eftimov
, Saisai; Huang Jie Subject: Re: How to do dispatching in Streaming? Evo, In Spark there's a fixed scheduling cost for each task, so more tasks mean a higher fixed overhead for the same amount of work. The number of tasks per batch interval should relate to the CPU resources

Re: How to do dispatching in Streaming?

2015-04-17 Thread Jianshi Huang
...*Sent:* Thursday, April 16, 2015 10:41 AM *To:* Evo Eftimov *Cc:* Tathagata Das; Jianshi Huang; user; Shao, Saisai; Huang Jie *Subject:* Re: How to do dispatching in Streaming? From experience, I'd recommend using the dstream.foreachRDD method and doing the filtering within that context. Extending

RE: How to do dispatching in Streaming?

2015-04-16 Thread Evo Eftimov
...@databricks.com] Sent: Thursday, April 16, 2015 12:52 AM To: Jianshi Huang Cc: user; Shao, Saisai; Huang Jie Subject: Re: How to do dispatching in Streaming? It may be worthwhile to architect the computation in a different way. dstream.foreachRDD { rdd => rdd.foreach { record

Re: How to do dispatching in Streaming?

2015-04-16 Thread Gerard Maas
...@databricks.com] *Sent:* Thursday, April 16, 2015 12:52 AM *To:* Jianshi Huang *Cc:* user; Shao, Saisai; Huang Jie *Subject:* Re: How to do dispatching in Streaming? It may be worthwhile to architect the computation in a different way. dstream.foreachRDD { rdd => rdd.foreach { record

RE: How to do dispatching in Streaming?

2015-04-16 Thread Evo Eftimov
...@gmail.com] Sent: Thursday, April 16, 2015 10:41 AM To: Evo Eftimov Cc: Tathagata Das; Jianshi Huang; user; Shao, Saisai; Huang Jie Subject: Re: How to do dispatching in Streaming? From experience, I'd recommend using the dstream.foreachRDD method and doing the filtering within that context

RE: How to do dispatching in Streaming?

2015-04-16 Thread Evo Eftimov
DAG pipeline instance for every message type. Moreover, each such DAG pipeline instance will run in parallel with the others. From: Tathagata Das [mailto:t...@databricks.com] Sent: Thursday, April 16, 2015 12:52 AM To: Jianshi Huang Cc: user; Shao, Saisai; Huang Jie Subject: Re: How to do

Re: How to do dispatching in Streaming?

2015-04-15 Thread Tathagata Das
It may be worthwhile to architect the computation in a different way.

    dstream.foreachRDD { rdd =>
      rdd.foreach { record =>
        // do different things for each record based on filters
      }
    }

TD

On Sun, Apr 12, 2015 at 7:52 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi, I have a
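
A minimal sketch of the per-record dispatch TD describes, in case it helps to see the shape of it. The record types (`Event`, `Click`, `Purchase`) and the `Dispatcher` object are hypothetical and do not come from the thread. The dispatch logic is kept as a plain function so it can be tried without a cluster, and the comment at the end shows where it would plug into `foreachRDD`.

```scala
// Hypothetical message types; the thread does not say what the records look like.
sealed trait Event
final case class Click(user: String) extends Event
final case class Purchase(user: String, amountCents: Long) extends Event

object Dispatcher {
  // Route a single record to type-specific handling.
  // Here each branch only returns a label; a real handler would write to a sink.
  def dispatch(record: Event): String = record match {
    case Click(user)                 => s"click:$user"
    case Purchase(user, amountCents) => s"purchase:$user:$amountCents"
  }
}

object DispatchDemo {
  def main(args: Array[String]): Unit = {
    // A local Seq stands in for the records of one batch.
    val batch: Seq[Event] = Seq(Click("ann"), Purchase("bob", 950L))
    batch.foreach(record => println(Dispatcher.dispatch(record)))
  }
}

// Assuming a DStream[Event] named dstream, the same function would be applied as:
//   dstream.foreachRDD { rdd => rdd.foreach { record => Dispatcher.dispatch(record) } }
```

With this layout every record is handled in one pass over each batch, rather than running one filtered pipeline per message type.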