Here's one: https://issues.apache.org/jira/browse/SPARK-16742
On Tue, Aug 30, 2016 at 3:02 AM, Abel Rincón <gange...@gmail.com> wrote:
> Hi again,
>
> Is there any open issue related to this?
>
> We (Stratio) now have an end-to-end setup running, with a Spark
> distribution based on 1.6.2.
>
> Work in progress:
>
> - Create and share our solution documentation.
> - Test suite for all of it.
> - Rebase our code onto the apache master branch.
>
> Regards,
>
>
> 2016-08-25 12:10 GMT+02:00 Abel Rincón <gange...@gmail.com>:
>
>> Hi devs,
>>
>> I'm working (at Stratio) on using Spark over Mesos and standalone, with a
>> kerberized HDFS.
>>
>> We are working to solve these scenarios:
>>
>> - We have a long-running Spark SQL context, called CrossData, that is
>>   used concurrently by many users, like the Thrift server. We need to
>>   access HDFS data with Kerberos authorization using the proxy-user
>>   method. We rely on the HDFS permission system, or on our custom
>>   authorizer.
>>
>> - We need to load/write DataFrames using data sources with an HDFS
>>   backend (built-in or others) such as json, csv, parquet, orc …, so we
>>   want to enable secure access (krb) by configuration only.
>>
>> - We have a scenario where we want to run streaming jobs over
>>   kerberized HDFS, for reads/writes and checkpointing too.
>>
>> - We have to load every single RDD that Spark core over kerberized
>>   HDFS without breaking the Spark API.
>>
>> As you can see, we have a "special" requirement: we need to set the
>> proxy user per job over the same Spark context.
>>
>> Do you have any ideas on how to cover it?
>>

--
Michael Gummelt
Software Engineer
Mesosphere