Re: Spark Kerberos proxy user

2016-08-30 Thread Michael Gummelt
Here's one: https://issues.apache.org/jira/browse/SPARK-16742

On Tue, Aug 30, 2016 at 3:02 AM, Abel Rincón  wrote:

> Hi again,
>
> Is there an open issue related to this?
>
> We (Stratio) now have an end-to-end solution running, with a Spark
> distribution based on 1.6.2.
>
> Work in progress:
>
> - Create and share our solution documentation.
> - Write a test suite covering all of it.
> - Rebase our code onto the Apache master branch.
>
> Regards,
>
>
> 2016-08-25 12:10 GMT+02:00 Abel Rincón :
>
>> Hi devs,
>>
>> I'm working (at Stratio) on running Spark over Mesos and standalone,
>> with a kerberized HDFS.
>>
>> We are working to solve these scenarios:
>>
>>
>> - We have a long-running Spark SQL context, used concurrently by many
>>   users in a Thrift-server-like service called CrossData. We need to
>>   access HDFS data with Kerberos authorization using the proxy-user
>>   method. We rely on the HDFS permission system, or on our custom
>>   authorizer.
>>
>> - We need to load and write DataFrames using data sources with an HDFS
>>   backend (built-in or otherwise), such as JSON, CSV, Parquet, ORC …,
>>   so we want to enable secure (Kerberos) access through configuration
>>   only.
>>
>> - We have a scenario where we want to run streaming jobs over
>>   kerberized HDFS, for writes and reads as well as checkpointing.
>>
>> - We have to load every single RDD that Spark core over kerberized
>>   HDFS without breaking the Spark API.
>>
>>
>>
>>
>> As you can see, we have a "special" requirement: we need to set the
>> proxy user per job within the same Spark context.
>>
>> Do you have any ideas on how to cover this?
>>
>>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark Kerberos proxy user

2016-08-30 Thread Abel Rincón
Hi again,

Is there an open issue related to this?

We (Stratio) now have an end-to-end solution running, with a Spark
distribution based on 1.6.2.

Work in progress:

- Create and share our solution documentation.
- Write a test suite covering all of it.
- Rebase our code onto the Apache master branch.

Regards,


2016-08-25 12:10 GMT+02:00 Abel Rincón :

> Hi devs,
>
> I'm working (at Stratio) on running Spark over Mesos and standalone,
> with a kerberized HDFS.
>
> We are working to solve these scenarios:
>
>
> - We have a long-running Spark SQL context, used concurrently by many
>   users in a Thrift-server-like service called CrossData. We need to
>   access HDFS data with Kerberos authorization using the proxy-user
>   method. We rely on the HDFS permission system, or on our custom
>   authorizer.
>
> - We need to load and write DataFrames using data sources with an HDFS
>   backend (built-in or otherwise), such as JSON, CSV, Parquet, ORC …,
>   so we want to enable secure (Kerberos) access through configuration
>   only.
>
> - We have a scenario where we want to run streaming jobs over
>   kerberized HDFS, for writes and reads as well as checkpointing.
>
> - We have to load every single RDD that Spark core over kerberized
>   HDFS without breaking the Spark API.
>
>
>
>
> As you can see, we have a "special" requirement: we need to set the
> proxy user per job within the same Spark context.
>
> Do you have any ideas on how to cover this?
>
>
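
For reference, the HDFS side of the proxy-user method mentioned above is
usually enabled through Hadoop's impersonation settings in core-site.xml
on the NameNode. A minimal sketch, assuming the long-running service
authenticates as a hypothetical superuser named "crossdata"; the host and
group values are placeholders for illustration and are not taken from
this thread:

```xml
<!-- Goes inside <configuration> in core-site.xml on the NameNode.
     "crossdata" is a hypothetical service user; replace it with the real one. -->
<property>
  <!-- Hosts from which the service user may impersonate others (placeholder) -->
  <name>hadoop.proxyuser.crossdata.hosts</name>
  <value>driver-host.example.com</value>
</property>
<property>
  <!-- Groups whose members may be impersonated (placeholder) -->
  <name>hadoop.proxyuser.crossdata.groups</name>
  <value>analysts</value>
</property>
```

With settings of this kind, the service user can act on behalf of members
of the listed group from the listed host, and HDFS then checks permissions
against the impersonated user rather than the service user. This covers
only the HDFS configuration; setting the proxy user per job inside one
Spark context is the part the question above is about.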