Data Security on Spark-on-HDFS

Daniel Schulz Mon, 31 Aug 2015 03:02:40 -0700

Hi guys,

In a nutshell: does Spark check and respect user privileges when 
reading/writing data?


I am curious about data security when Spark runs on top of HDFS, maybe 
through YARN. Does Spark run its long-running JVM processes as a single Spark 
user that makes no distinction when accessing data? In other words, is there a 
shortcoming because the JVM processes are already running, so that Spark 
ignores the launching user when accessing data residing on HDFS? Or does Spark 
only read/write data that the user who launched the job has access to?

What about local storage when running in Standalone mode? And what about 
access calls to HBase or Hive?

Thanks for taking the time.

Best regards, Daniel.