Hi,

Alluxio enables sharing dataframes across different applications. This blog
post <https://www.alluxio.com/blog/effective-spark-dataframes-with-alluxio>
talks about dataframes and Alluxio, and this Spark Summit presentation
<https://spark-summit.org/2017/events/best-practices-for-using-alluxio-with-apache-spark/>
has additional information.
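
As a rough sketch of the pattern (not a tested setup): one app writes the
dataset as Parquet to an alluxio:// path and the other apps read that path
and join against it. The helper names, the host "alluxio-master", the path
"/shared/ds1" and the key column "id" are all made up for illustration, and
this assumes the Alluxio client jar is on the Spark classpath. How each
streaming app picks up a refreshed dataset is left out.

```python
def alluxio_uri(master_host, path, port=19998):
    """Build an alluxio:// URI (19998 is the default Alluxio master port)."""
    return "alluxio://{}:{}/{}".format(master_host, port, path.lstrip("/"))


def publish_dataset(df, uri, mode="overwrite"):
    """Writer side (e.g. App-1): save a Spark DataFrame as Parquet in Alluxio."""
    df.write.mode(mode).parquet(uri)


def enrich_with_lookup(spark, incoming_df, uri, key):
    """Reader side (e.g. App-3): load the shared dataset and left-join it in."""
    lookup_df = spark.read.parquet(uri)
    return incoming_df.join(lookup_df, on=key, how="left")


# Hypothetical usage inside an app that already has a SparkSession `spark`:
#   uri = alluxio_uri("alluxio-master", "/shared/ds1")
#   publish_dataset(ds1_df, uri)                               # in App-1
#   enriched = enrich_with_lookup(spark, batch_df, uri, "id")  # in App-3
```

The same read/join would be repeated for DS-2 under its own path.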

Thanks,
Gene

On Tue, Oct 31, 2017 at 6:04 PM, Revin Chalil <rcha...@expedia.com> wrote:

> Any info on the below will be really appreciated.
>
>
>
> I read about Alluxio and Ignite. Has anybody used either of them? Do they
> work well with multiple apps doing lookups simultaneously? Are there better
> options? Thank you.
>
>
>
> *From: *roshan joe <impdocs2...@gmail.com>
> *Date: *Monday, October 30, 2017 at 7:53 PM
> *To: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *share datasets across multiple spark-streaming applications
> for lookup
>
>
>
> Hi,
>
>
>
> What is the recommended way to share datasets across multiple
> spark-streaming applications, so that the incoming data can be looked up
> against this shared dataset?
>
>
>
> The shared dataset is also incrementally refreshed and stored on S3. Below
> is the scenario.
>
>
>
> Streaming App-1 consumes data from Source-1 and writes to DS-1 in S3.
>
> Streaming App-2 consumes data from Source-2 and writes to DS-2 in S3.
>
>
>
>
> Streaming App-3 consumes data from Source-3, *needs to look up against
> DS-1 and DS-2* and write to DS-3 in S3.
>
> Streaming App-4 consumes data from Source-4, *needs to look up against
> DS-1 and DS-2* and write to DS-3 in S3.
>
> Streaming App-n consumes data from Source-n, *needs to look up against
> DS-1 and DS-2* and write to DS-n in S3.
>
>
>
> So DS-1 and DS-2 ideally should be shared for lookup across multiple
> streaming apps. Any input is appreciated. Thank you!
>
