Hi,

Alluxio will allow you to share or cache data in memory between different
Spark contexts by storing RDDs or DataFrames as files in the Alluxio
system. The files can then be accessed by any Spark job like files in any
other distributed storage system.

These two blog posts do a good job of summarizing the end-to-end workflow
of using Alluxio to share RDDs
<https://alluxio.com/blog/effective-spark-rdds-with-alluxio> or DataFrames
<https://alluxio.com/blog/effective-spark-dataframes-with-alluxio> between
Spark jobs.
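As a rough sketch of the workflow those posts describe: one Spark job
writes a DataFrame to an Alluxio path, and a second, independent Spark
context reads it back. The host/port, paths, and input file below are
placeholders, not values from the posts; adjust them to your deployment
(the default Alluxio master port is 19998). `spark` here is the
SparkSession that the spark-shell provides.

```scala
// Job 1: persist a DataFrame into Alluxio so other contexts can read it.
// "hdfs://namenode:9000/events.json" is a hypothetical input path.
val df = spark.read.json("hdfs://namenode:9000/events.json")
df.write.parquet("alluxio://alluxio-master:19998/shared/events.parquet")

// Job 2 (a separate Spark context, possibly a different application):
// read the same file back from Alluxio like any other storage system.
val shared =
  spark.read.parquet("alluxio://alluxio-master:19998/shared/events.parquet")
shared.count()
```

Because the data lives in Alluxio rather than inside a single SparkContext,
it survives the first job exiting and avoids recomputing or re-reading from
the slower underlying store.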

Hope this helps,
Calvin

On Tue, Dec 13, 2016 at 3:42 AM, Chetan Khatri <ckhatriman...@gmail.com>
wrote:

> Hello Guys,
>
> What would be the approach to accomplish multiple shared Spark contexts,
> both with and without Alluxio, and what would be the best practice to
> achieve parallelism and concurrency for Spark jobs?
>
> Thanks.
>
> --
> Yours Aye,
> Chetan Khatri.
> M. +91 76666 80574
> Data Science Researcher
> INDIA
>
> Statement of Confidentiality
> ————————————————————————————
> The contents of this e-mail message and any attachments are confidential
> and are intended solely for addressee. The information may also be legally
> privileged. This transmission is sent in trust, for the sole purpose of
> delivery to the intended recipient. If you have received this transmission
> in error, any use, reproduction or dissemination of this transmission is
> strictly prohibited. If you are not the intended recipient, please
> immediately notify the sender by reply e-mail or phone and delete this
> message and its attachments, if any.
>
