Hi,

Will the benefits of Project Tungsten be available to non-JVM programs
that access the off-heap memory directly?  Spark with DataFrames and the
Tungsten improvements will definitely help analytics within the JVM world,
but calling out to third-party C++ libraries is a challenge, especially
when trying to do so with zero copy.

Ideally, the off-heap memory would be accessible to a non-JVM program,
which would be invoked in-process through JNI once per partition.  The
alternatives involve the additional cost of starting another process, if
using pipes, as well as an additional copy of all the data.
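
To make the in-process idea concrete, here is a rough sketch of what I have
in mind. It does not use any Spark or Tungsten API; the class and method
names (OffHeapPartitionSketch, fillPartition, sumInPlace, processPartition)
are made up for illustration. It only relies on the standard fact that a
direct ByteBuffer lives outside the Java heap and that native code can get
its address through the JNI function GetDirectBufferAddress without copying.
The native call itself is left as a comment, since it would need a compiled
C++ library; a plain Java method stands in for it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: none of these names come from Spark or Tungsten.
final class OffHeapPartitionSketch {

    // Hypothetical native entry point. On the C++ side it would call
    // env->GetDirectBufferAddress(buf) and read the bytes in place.
    // static native long processPartition(ByteBuffer buf, int numRows);

    // Pack one partition's values into off-heap memory in native byte order.
    static ByteBuffer fillPartition(long[] values) {
        ByteBuffer buf = ByteBuffer
                .allocateDirect(values.length * Long.BYTES)
                .order(ByteOrder.nativeOrder());
        for (long v : values) {
            buf.putLong(v);
        }
        buf.flip(); // make the written data readable from position 0
        return buf;
    }

    // Stand-in for the native routine: reads the same memory, no copy.
    static long sumInPlace(ByteBuffer buf) {
        // duplicate() shares the memory but resets the byte order,
        // so restore the order of the original buffer.
        ByteBuffer view = buf.duplicate().order(buf.order());
        long sum = 0;
        while (view.remaining() >= Long.BYTES) {
            sum += view.getLong();
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer partition = fillPartition(new long[] {1L, 2L, 3L});
        System.out.println("direct=" + partition.isDirect()
                + " sum=" + sumInPlace(partition));
    }
}
```

With pipes, the same partition would instead have to be serialized and
written to a separate process, which is the extra copy and startup cost I
would like to avoid.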

In addition to read-only, in-process access from non-JVM code, would there
be a way to share an in-memory DataFrame out of process and across Spark
contexts?  That way an expensive, complicated initial build-up of a
DataFrame would not have to be repeated, and we would not have to pay the
startup cost again after a failure.

thanks,

-paul
