Re: Support for Second level of concurrency

Sandeep Mahendru Tue, 25 Sep 2018 10:52:36 -0700

Hey Jorn,

  Appreciate the prompt reply.

Yes, that would surely work; we have tried a similar approach. The only
concern is latency: to keep the solution low-latency, we want to avoid
routing through a message broker.
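
For illustration only, here is a minimal sketch of one broker-free shape for
this problem. It is an assumption, not something tried in this thread: rather
than launching a second job from inside a task, the two levels are flattened
into a single pool of (left row, right chunk) tasks. The names `chunked`,
`correlate` and `flat_correlation` are hypothetical, the equality count is only
a stand-in for the real correlation logic, and a local thread pool stands in
for the cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def chunked(rows, size):
    """Split rows into consecutive chunks of at most `size` items."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def correlate(left_row, right_chunk):
    """Hypothetical stand-in for the correlation logic: count equal values."""
    return sum(1 for r in right_chunk if r == left_row)


def flat_correlation(left, right, chunk_size, workers=4):
    """Run one flat pool of (left row, right chunk) tasks, then sum per left row.

    Assumes left rows are hashable and unique, since they are used as keys.
    """
    # Every left row is paired with every chunk of the right vector, so both
    # "levels" of parallelism become one list of independent tasks.
    tasks = list(product(left, chunked(right, chunk_size)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda task: correlate(*task), tasks))
    # Combine the partial results that belong to the same left row.
    totals = {}
    for (left_row, _), partial in zip(tasks, partials):
        totals[left_row] = totals.get(left_row, 0) + partial
    return totals


if __name__ == "__main__":
    print(flat_correlation([1, 2, 3], [1, 1, 2, 5, 3, 3, 3], chunk_size=2))
```

In Spark, the same pairing could presumably be built from the driver, for
example with `RDD.cartesian` over the left rows and an RDD of right chunks,
followed by a per-key reduction, so that no Spark context is needed inside
the task.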

Regards,
Sandeep.

On Tue, Sep 25, 2018 at 12:53 PM Jörn Franke <jornfra...@gmail.com> wrote:

> What is the ultimate goal of this algorithm? There may already be
> algorithms in Spark that can do this. You could also put a message on
> Kafka (or another broker) and have Spark applications listen for it to
> trigger further computation. This would also be more controlled, and it
> can already be done today.
>
> > On 25. Sep 2018, at 17:31, sandeep mehandru <mahendru.sand...@gmail.com> wrote:
> >
> > Hi Folks,
> >
> >   There is a use case where we are doing a large computation on two large
> > vectors. We run a flatMap operation on the left vector and apply
> > correlation logic by comparing each of its rows with all the rows of the
> > right vector. When this flatMap operation runs on an executor, it compares
> > row 1 of the left vector with all rows of the right vector. The goal is
> > that, from within this flatMap operation, we start another remote map
> > operation that compares a portion of the right vector's rows. This enables
> > a second level of concurrency, thereby increasing throughput and utilizing
> > other nodes. But to achieve this we need access to the Spark context from
> > within the flatMap operation.
> >
> > I have attached a snapshot describing the limitation.
> >
> > <http://apache-spark-developers-list.1001551.n3.nabble.com/file/t3134/Concurrency_Snapshot.jpg>
> >
> > In simple words, this boils down to having access to a Spark context from
> > within an executor, so that the next level of map or concurrent operations
> > can be spun up on partitions on other machines. I have some experience
> > with other in-memory compute grid technologies such as Coherence and
> > Hazelcast. These frameworks do allow triggering the next level of
> > concurrent operations from within a task being executed on one node.
> >
> >
> > Regards,
> > Sandeep.
> >
> >
> >
> > --
> > Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >
>
