Here's one: https://issues.apache.org/jira/browse/SPARK-16742
On Tue, Aug 30, 2016 at 3:02 AM, Abel Rincón wrote:
> Hi again,
>
> Is there an open issue related to this?
>
> We (Stratio) now have an end-to-end setup running, with a Spark
> distribution based on 1.6.2.
>
> Work in progress:
>
> - Write and share documentation for our solution.
> - Build a test suite covering all of it.
> - Rebase our code onto the Apache master branch.
>
> Regards,
>
>
> 2016-08-25 12:10 GMT+02:00 Abel Rincón :
>
>> Hi devs,
>>
>> I'm working (at Stratio) on running Spark on Mesos and standalone with a
>> kerberized HDFS.
>>
>> We are working to solve these scenarios:
>>
>>
>> - We have a long-running Spark SQL context, called CrossData, that is
>>   used concurrently by many users, much like the Thrift server. We need to
>>   access HDFS data with Kerberos authorization using the proxy-user
>>   method. We rely on the HDFS permission system, or on our custom
>>   authorizer.
>>
>>
>> - We need to load and write DataFrames using data sources with an HDFS
>>   backend (built-in or others) such as JSON, CSV, Parquet, ORC …, so we
>>   want to enable secure (Kerberos) access through configuration alone.
>>
>>
>> - We have a scenario where we want to run streaming jobs over
>>   kerberized HDFS, for writes and reads and for checkpointing too.
>>
>>
>> - We have to be able to load every RDD that Spark core supports over
>>   kerberized HDFS without breaking the Spark API.
>>
>>
>>
>>
>> As you can see, we have a "special" requirement: we need to set the proxy
>> user per job within the same Spark context.
>>
>> Do you have any ideas on how to cover this?
>>
>>
>
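For reference, the "proxy-user method" quoted above is usually built on Hadoop's impersonation API: a service logs in once with its own Kerberos credentials and then acts on behalf of each end user. The following is only a minimal sketch of that mechanism on the driver side, not Stratio's implementation and not a fix for the Mesos/standalone case. The names `ProxyUserSketch` and `runAs`, the principal `service@EXAMPLE.COM`, the keytab path, and the user `alice` are all hypothetical. It also assumes the NameNode allows the service user to impersonate others via the `hadoop.proxyuser.<service>.hosts` and `hadoop.proxyuser.<service>.groups` properties.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object ProxyUserSketch {

  // Hypothetical helper: runs `body` as `endUser`, impersonated by the
  // service principal that is already logged in to Kerberos.
  def runAs[T](endUser: String)(body: => T): T = {
    val realUser = UserGroupInformation.getLoginUser
    val proxy = UserGroupInformation.createProxyUser(endUser, realUser)
    proxy.doAs(new PrivilegedExceptionAction[T] {
      override def run(): T = body
    })
  }

  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Assumed principal and keytab location; replace with real values.
    UserGroupInformation.loginUserFromKeytab(
      "service@EXAMPLE.COM", "/path/to/service.keytab")

    // HDFS checks permissions against "alice", not the service user.
    val names = runAs("alice") {
      val fs = FileSystem.get(conf)
      fs.listStatus(new Path("/user/alice")).map(_.getPath.getName).toSeq
    }
    names.foreach(println)
  }
}
```

Note that this only covers HDFS calls made in the process that holds the Kerberos login. Getting credentials to executors, and switching the proxy user per job inside one Spark context, is the part the thread is asking about.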
--
Michael Gummelt
Software Engineer
Mesosphere