Re: Question regarding WebHDFS security

2016-07-06 Thread Larry McCay
Hi Ben -

It doesn’t really work exactly that way but will likely be able to handle your 
usecase.
I suggest that you bring the conversation over to the dev@ list for Knox.

We can delve into the details of your usecase and your options there.

thanks,

—larry

On Jul 5, 2016, at 10:58 PM, Benjamin Ross <br...@lattice-engines.com> wrote:

Thanks Larry.  I'll need to look into the details quite a bit further, but I 
take it that I can define some mapping such that requests for particular file 
paths will trigger particular credentials to be used (until everything's 
upgraded)?  Currently all requests come in using permissive auth with username 
yarn.  Once we enable Kerberos, I'd optimally like for that to translate to use 
some set of Kerberos credentials if the path is /foo and some other set of 
credentials if the path is /bar.  This will only be temporary until things are 
fully upgraded.

Appreciate the help.
Ben


________
From: Larry McCay [lmc...@hortonworks.com]
Sent: Tuesday, July 05, 2016 4:23 PM
To: Benjamin Ross
Cc: David Morel; user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security

For consuming REST APIs like webhdfs, where kerberos is inconvenient or 
impossible, you may want to consider using a trusted proxy like Apache Knox.
It will authenticate as knox to the backend services and act on behalf of your 
custom services.
It will also allow you to authenticate to Knox from the services using a number 
of different mechanisms.

http://knox.apache.org
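As a rough illustration of that model, a service can call webhdfs through a Knox 
topology using HTTP Basic credentials. The following is a minimal Java sketch; the 
gateway host, topology name, credentials, and HDFS path are assumptions, and TLS 
trust setup is omitted:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class KnoxWebHdfsList {
  public static void main(String[] args) throws Exception {
    // Hypothetical gateway host, topology name, and HDFS path
    URL url = new URL("https://knox.example.com:8443/gateway/default/webhdfs/v1/foo?op=LISTSTATUS");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();

    // HTTP Basic credentials are validated by Knox (e.g. against LDAP); Knox then
    // authenticates to the cluster as knox and acts on behalf of the end user
    String creds = Base64.getEncoder().encodeToString("svc-user:svc-password".getBytes("UTF-8"));
    conn.setRequestProperty("Authorization", "Basic " + creds);

    try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);  // JSON FileStatuses response
      }
    }
    conn.disconnect();
  }
}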

On Jul 5, 2016, at 2:43 PM, Benjamin Ross <br...@lattice-engines.com> wrote:

Hey David,
Thanks.  Yep - that's the easy part.  Let me clarify.

Consider that we have:
1. A Hadoop cluster running without Kerberos
2. A number of services contacting that hadoop cluster and retrieving data from 
it using WebHDFS.

Clearly the services don't need to login to WebHDFS using credentials because 
the cluster isn't kerberized just yet.

Now what happens when we enable Kerberos on the cluster?  We still need to 
allow those services to contact the cluster without credentials until we can 
upgrade them.  Otherwise we'll have downtime.  So what can we do?

As a possible solution, is there any way to allow unprotected access from just 
those machines until we can upgrade them?

Thanks,
Ben






From: David Morel [dmo...@amakuru.net]
Sent: Tuesday, July 05, 2016 2:33 PM
To: Benjamin Ross
Cc: user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security


On Jul 5, 2016 at 7:42 PM, "Benjamin Ross" <br...@lattice-engines.com> wrote:
>
> All,
> We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
> that we have several single tenant services that rely on contacting the HDFS 
> cluster over WebHDFS without credentials.  So, the concern is that once we 
> kerberize the cluster, we will no longer be able to access it without 
> credentials from these single-tenant systems, which results in a painful 
> upgrade dependency.
>
> Any suggestions for dealing with this problem in a simple way?
>
> If not, any suggestion for a better forum to ask this question?
>
> Thanks in advance,
> Ben

It's usually not super-hard to wrap your http calls with a module that handles 
Kerberos, depending on what language you use. For instance 
https://metacpan.org/pod/Net::Hadoop::WebHDFS::LWP does this.

David






Re: Question regarding WebHDFS security

2016-07-05 Thread Larry McCay
For consuming REST APIs like webhdfs, where kerberos is inconvenient or 
impossible, you may want to consider using a trusted proxy like Apache Knox.
It will authenticate as knox to the backend services and act on behalf of your 
custom services.
It will also allow you to authenticate to Knox from the services using a number 
of different mechanisms.

http://knox.apache.org

On Jul 5, 2016, at 2:43 PM, Benjamin Ross <br...@lattice-engines.com> wrote:

Hey David,
Thanks.  Yep - that's the easy part.  Let me clarify.

Consider that we have:
1. A Hadoop cluster running without Kerberos
2. A number of services contacting that hadoop cluster and retrieving data from 
it using WebHDFS.

Clearly the services don't need to login to WebHDFS using credentials because 
the cluster isn't kerberized just yet.

Now what happens when we enable Kerberos on the cluster?  We still need to 
allow those services to contact the cluster without credentials until we can 
upgrade them.  Otherwise we'll have downtime.  So what can we do?

As a possible solution, is there any way to allow unprotected access from just 
those machines until we can upgrade them?

Thanks,
Ben






From: David Morel [dmo...@amakuru.net]
Sent: Tuesday, July 05, 2016 2:33 PM
To: Benjamin Ross
Cc: user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security


On Jul 5, 2016 at 7:42 PM, "Benjamin Ross" <br...@lattice-engines.com> wrote:
>
> All,
> We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
> that we have several single tenant services that rely on contacting the HDFS 
> cluster over WebHDFS without credentials.  So, the concern is that once we 
> kerberize the cluster, we will no longer be able to access it without 
> credentials from these single-tenant systems, which results in a painful 
> upgrade dependency.
>
> Any suggestions for dealing with this problem in a simple way?
>
> If not, any suggestion for a better forum to ask this question?
>
> Thanks in advance,
> Ben

It's usually not super-hard to wrap your http calls with a module that handles 
Kerberos, depending on what language you use. For instance 
https://metacpan.org/pod/Net::Hadoop::WebHDFS::LWP does this.

David






Re: Securing secrets for S3 FileSystems in DistCp

2016-05-03 Thread Larry McCay
Hi Elliot -

You may find the following patch interesting: 
https://issues.apache.org/jira/browse/HADOOP-12548

This enables the use of the Credential Provider API to protect secrets for the 
s3a filesystem.
The design document attached to it describes how to use it.

If you are not using s3a, there is similar support for the credential provider 
API in s3 and s3n, but there are slight differences in the processing.
S3a is considered the strategic filesystem for accessing s3 - as far as I can 
tell.
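As a rough illustration, such a keystore can be populated with the hadoop credential 
create CLI or programmatically through the Credential Provider API. A minimal Java 
sketch follows; the jceks location and the key values are assumptions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.alias.CredentialProvider;
import org.apache.hadoop.security.alias.CredentialProviderFactory;

public class StoreS3aSecrets {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical keystore location; the same URI is what DistCp would be pointed at
    // through hadoop.security.credential.provider.path so s3a can resolve the aliases
    conf.set(CredentialProviderFactory.CREDENTIAL_PROVIDER_PATH,
        "jceks://hdfs/user/deploy/s3a.jceks");

    CredentialProvider provider = CredentialProviderFactory.getProviders(conf).get(0);
    provider.createCredentialEntry("fs.s3a.access.key", "ACCESS_KEY_VALUE".toCharArray());
    provider.createCredentialEntry("fs.s3a.secret.key", "SECRET_KEY_VALUE".toCharArray());
    provider.flush();  // persist the new entries to the keystore
  }
}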

Hope this is helpful.

—larry

On May 3, 2016, at 8:41 AM, Elliot West <tea...@gmail.com> wrote:

Hello,

We're currently using DistCp and S3 FileSystems to move data from a vanilla 
Apache Hadoop cluster to S3. We've been concerned about exposing our AWS 
secrets on our shared, on-premise cluster. As a work-around we've patched 
DistCp to load these secrets from a JCEKS keystore. This seems to work quite 
well; however, we're not comfortable relying on a DistCp fork.

What is the usual approach to achieve this with DistCp and is there a feature 
or practice that we've overlooked? If not, might there be value in us raising a 
JIRA ticket and submitting a patch for DistCp to include this secure keystore 
functionality?

Thanks - Elliot.



Re: HDFS Encryption using Java

2016-04-22 Thread Larry McCay
Hi Shashi -

For the Key Provider API, I would personally use the code in KeyShell.java [1] 
and the tests in TestKeyProviderFactory.java [2] as examples for creating the 
key.

Unfortunately, I don’t have pointers for the createZone step for you but hope 
that is helpful.

—larry

[1]. 
https://git-wip-us.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java;h=c69dc82a6032f7cbd989cd5f89dfa7719297f9b7;hb=HEAD

[2]. 
https://git-wip-us.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyProviderFactory.java;h=ef09d94739643a3a7ee642921d394b0d8c45acee;hb=HEAD
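For what it is worth, a minimal Java sketch of both steps might look like the 
following. The NameNode URI is an assumption, and using HdfsAdmin.createEncryptionZone 
for the second step is my suggestion rather than something covered by the pointers above:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

public class CreateEncryptionZoneExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Roughly "hadoop key create myKey": create the key in the first configured provider (e.g. KMS)
    KeyProvider provider = KeyProviderFactory.getProviders(conf).get(0);
    provider.createKey("myKey", KeyProvider.options(conf));
    provider.flush();

    // Roughly "hdfs crypto -createZone -keyName myKey -path /zone"
    HdfsAdmin dfsAdmin = new HdfsAdmin(URI.create("hdfs://namenode.example.com:8020"), conf);
    dfsAdmin.createEncryptionZone(new Path("/zone"), "myKey");
  }
}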

On Apr 22, 2016, at 5:45 AM, Shashi Vishwakarma <shashi.vish...@gmail.com> wrote:

Hi

I want to invoke the below commands using Java.


hadoop key create myKey
hdfs crypto -createZone -keyName myKey -path /zone

Can someone point me to the Java API documentation for this, or any example of 
how to implement this in Java?

Thanks in advance.

Shashi




Re: Trying to build a cluster at home

2016-03-25 Thread Larry McCay
Hi Raj -

You might want to just start with Ambari to install, configure and manage your 
cluster.
You can start from here: 
https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.1+from+Public+Repositories

HTH,

—larry

On Mar 25, 2016, at 1:26 PM, Raj Hadoop <hadoop...@yahoo.com.INVALID> wrote:

Hi,

I have three CentOS laptops at home and I have decided to build a Hadoop 
cluster on these. I have a switch and a router to set up.

I am planning to set up a DNS server for a FQDN. How do I go about it? Can 
anyone share their experiences?

Regards,
Raj



Re: Am I understanding right?

2015-11-27 Thread Larry McCay III
Well, it isn’t necessarily difficult if you can kinit to the same KDC as is 
used inside the cluster OR to one that is explicitly setup to be trusted by the 
cluster KDC.
I don’t know what is involved for cross domain trust with a cluster on azure.

If you are able to do that and you have line of sight to the webhdfs host:port 
then it should work with a kinit before using curl --negotiate.
You need to understand the network security involved: is the cluster firewalled 
off, is there cross domain trust setup or are you sharing the same KDC as the 
cluster, etc.

You haven’t provided what the error is that you receive so it is a bit tough to 
speculate any more.

On Nov 27, 2015, at 4:01 AM, Jingfei Hu <jingfei...@gmail.com> wrote:

Hi Larry,
Thanks for your reply. But I am still confused about
Direct access to webhdfs will be difficult from your desktop.

What do you mean by difficult? Is it impossible? My major question is can I use 
the user name and password which is recognized by the KDC (kerberized on my 
hdfs cluster) with webhdfs protocol to access hdfs files and directories? My 
thought goes like ‘Hey, I’ve got the user name and password, it’s all that the 
KDC needs to verify I am a valid user, why still can’t I access the files?’. 
What else do I need to do to get this working? Does the KDC always require a 
request from a trusted machine instead of just a user name and password?

Thanks,
Jingfei
From: Larry McCay III [mailto:lmc...@hortonworks.com]
Sent: Tuesday, November 24, 2015 1:15 PM
To: user@hadoop.apache.org
Cc: Jingfei Hu 
Subject: Re: Am I understanding right?

Hi Jingfei -

Once you kerberize your cluster, you will generally need to be able to 
authenticate to a KDC that is either shared with the cluster or has some sort 
of cross-domain trust established with the cluster's KDC.
You might consider using Apache Knox to authenticate an external client to 
Knox via LDAP or some other mechanism, and Knox will take care of the strong 
authentication required to access secured Hadoop resources.

You may access a file in HDFS this way with curl using HTTP basic auth against 
LDAP for example:

curl -ivku username:password -X GET 
https://host:port/gateway/sandbox/webhdfs/v1/tmp/filename?op=OPEN

Direct access to webhdfs will be difficult from your desktop.

Hope that helps,

—larry

On Nov 23, 2015, at 8:44 PM, Jingfei Hu <jingfei...@hotmail.com> wrote:


Anyone?

From: Jingfei Hu [mailto:jingfei...@gmail.com]
Sent: Monday, November 23, 2015 6:26 PM
To: user@hadoop.apache.org
Cc: jingfei...@hotmail.com
Subject: Am I understanding right?

Hi team,
I have some trouble to access a HDFS enabled with Kerberos using webhdfs 
protocol. The Hadoop deployment is using HDP sandbox in Windows Azure, (just 
one node). I tried several things.
1. Enable Kerberos according to the wizard
   a. I can access the hdfs file using webhdfs on that node with the correct 
      Kerberos user name and password. (I am using curl --negotiate ...)
   b. But I can't access the hdfs file outside of the hdfs cluster, say from a 
      Windows 10 client on our corp network.
2. Enable Kerberos and connect it with LDAP
   a. I can access the hdfs file using webhdfs on that node with the correct 
      Kerberos user name and password. (I am using curl --negotiate ...)
   b. I can access the hdfs file using webhdfs on a machine within the domain 
      connected to the KDC, using the KDC user name and password
   c. I can access the hdfs file using webhdfs on a machine within the domain 
      connected to the KDC, using the domain account and password
So my question is: will 1.b work under any circumstances, or is it not working 
by design?

Thanks,
Jingfei



Re: Am I understanding right?

2015-11-23 Thread Larry McCay III
Hi Jingfei -

Once you kerberize your cluster, you will generally need to be able to 
authenticate to a KDC that is either shared with the cluster or has some sort 
of cross-domain trust established with the cluster's KDC.
You might consider using Apache Knox to authenticate an external client to 
Knox via LDAP or some other mechanism, and Knox will take care of the strong 
authentication required to access secured Hadoop resources.

You may access a file in HDFS this way with curl using HTTP basic auth against 
LDAP for example:

curl -ivku username:password -X GET 
https://host:port/gateway/sandbox/webhdfs/v1/tmp/filename?op=OPEN

Direct access to webhdfs will be difficult from your desktop.

Hope that helps,

—larry

On Nov 23, 2015, at 8:44 PM, Jingfei Hu <jingfei...@hotmail.com> wrote:

Anyone?

From: Jingfei Hu [mailto:jingfei...@gmail.com]
Sent: Monday, November 23, 2015 6:26 PM
To: user@hadoop.apache.org
Cc: jingfei...@hotmail.com
Subject: Am I understanding right?

Hi team,
I have some trouble to access a HDFS enabled with Kerberos using webhdfs 
protocol. The Hadoop deployment is using HDP sandbox in Windows Azure, (just 
one node). I tried several things.
1. Enable Kerberos according to the wizard
   a. I can access the hdfs file using webhdfs on that node with the correct 
      Kerberos user name and password. (I am using curl --negotiate ...)
   b. But I can't access the hdfs file outside of the hdfs cluster, say from a 
      Windows 10 client on our corp network.
2. Enable Kerberos and connect it with LDAP
   a. I can access the hdfs file using webhdfs on that node with the correct 
      Kerberos user name and password. (I am using curl --negotiate ...)
   b. I can access the hdfs file using webhdfs on a machine within the domain 
      connected to the KDC, using the KDC user name and password
   c. I can access the hdfs file using webhdfs on a machine within the domain 
      connected to the KDC, using the domain account and password
So my question is: will 1.b work under any circumstances, or is it not working 
by design?

Thanks,
Jingfei



Re: HTTPFS without impersonation

2015-06-03 Thread Larry McCay
Sorry.
No, I don’t think that this is possible and I don’t think that you should try 
and manipulate the proxy settings in such a way that team users are configured 
as trusted proxies.
That would introduce risk of exactly the sort of things that you are trying to 
avoid.

On Jun 3, 2015, at 9:06 AM, Nathaniel Braun <n.br...@criteo.com> wrote:

Hi,

Thanks for your answer. I’m not sure I understand it all, though.

Of course, you could send a request to another team's HTTPFS instance. But you 
won't necessarily be granted access to every operation (based on Kerberos 
authentication, for instance).

Anyway, my objective was to use HTTPFS without the impersonation mechanisms. 
Thus, a given HTTPFS instance would only be granted the rights of the user 
under which it runs. Do you think this is possible?

Thanks & regards,
Nathaniel



From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: mercredi 3 juin 2015 14:28
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation

inline...

On Jun 3, 2015, at 8:03 AM, Nathaniel Braun <n.br...@criteo.com> wrote:


Hi,

We want to let users & teams be able to run their HTTPFS in order to isolate 
instances. One team thus cannot crash another team’s HTTPFS instance.


For my own clarity...
How does this keep them from using instances that are running as another team 
user?
If the instances are running locally to the user or on a team user specific 
gateway machine then it should be able to run as http and have the same benefit 
of physical isolation - no?
If they are running on edge nodes of the cluster then can’t a user send a 
request to any HttpFs instance?


Now, I make the following request:

curl 
"localhost:14000/webhdfs/v1/user/team_user?op=LISTSTATUS&user.name=team_user"

And I get the following response:

{"RemoteException":{"message":"User: team_user is not allowed to impersonate 
team_user","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}


We provide the concept of trusted proxies in Hadoop.
The number of these trusted entities should ideally be kept to a minimum.
A proliferation of such trust relationships can lead to unexpected results and 
a management headache.

I wouldn’t want to see teamUser1 be trusted to impersonate teamUser2 or HDFS 
for instance - avoid using ‘*’ for the groups property.

When using a gateway like HttpFs or Knox you want that server to be trusted to 
act on behalf of other users not every user that uses it to be trusted.

My suggestion is to use HttpFs as a proxy server running as a single user that 
can be configured as trusted.
Physical isolation can be used across tenants so that they don’t have access to 
the others instances.


Thanks,
Nathaniel

From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: mercredi 3 juin 2015 13:57
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation

Out of curiosity, what added benefit does having HttpFs run as separate team 
users give you?
If the APIs are invoked with SPNEGO or a user.name of the appropriate user, 
don't you get the same permissions-based protections?

Generally speaking, gateways such as HttpFs provide access on behalf of 
endusers.

On Jun 3, 2015, at 7:44 AM, Nathaniel Braun <n.br...@criteo.com> wrote:



Hi,

Thanks for your answer.

With this setup, only the HTTP user will be able to impersonate other users, so 
HTTPFS has to run with the HTTP user.

Instead, I need users to run HTTPFS with their own user, not with the HTTP user.

Thanks

From: Wellington Chevreuil [mailto:wellington.chevre...@gmail.com]
Sent: mercredi 3 juin 2015 13:41
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation


Hi, do you have the below properties in the core-site.xml file used by your HDFS?


<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>

Hello all,

We need to run several HTTPFS instances on our Hadoop cluster, with different 
users (basically, one HTTPFS per team).

In our setup, each HTTPFS instance runs as a team user and is allowed write 
access to that user’s directory only (so, HTTPFS does not run as the httpfs 
user).

However, this setup does not work, as we get exceptions related to 
impersonation, such as this one:

{"RemoteException":{"message":"User: team_user is not allowed to impersonate 
team_user","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}

So, it seems that HTTPFS unconditionally tries to impersonate a user, even 
though it’s running as that same user. Is there a way to somehow disable 
impersonation?

Thanks for your help.

Regards,
Nathaniel



Re: HTTPFS without impersonation

2015-06-03 Thread Larry McCay
inline...

On Jun 3, 2015, at 8:03 AM, Nathaniel Braun <n.br...@criteo.com> wrote:

Hi,

We want to let users & teams be able to run their HTTPFS in order to isolate 
instances. One team thus cannot crash another team’s HTTPFS instance.


For my own clarity...
How does this keep them from using instances that are running as another team 
user?
If the instances are running locally to the user or on a team user specific 
gateway machine then it should be able to run as http and have the same benefit 
of physical isolation - no?
If they are running on edge nodes of the cluster then can’t a user send a 
request to any HttpFs instance?

Now, I make the following request:

curl 
"localhost:14000/webhdfs/v1/user/team_user?op=LISTSTATUS&user.name=team_user"

And I get the following response:

{"RemoteException":{"message":"User: team_user is not allowed to impersonate 
team_user","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}


We provide the concept of trusted proxies in Hadoop.
The number of these trusted entities should ideally be kept to a minimum.
A proliferation of such trust relationships can lead to unexpected results and 
a management headache.

I wouldn’t want to see teamUser1 be trusted to impersonate teamUser2 or HDFS 
for instance - avoid using ‘*’ for the groups property.

When using a gateway like HttpFs or Knox you want that server to be trusted to 
act on behalf of other users not every user that uses it to be trusted.

My suggestion is to use HttpFs as a proxy server running as a single user that 
can be configured as trusted.
Physical isolation can be used across tenants so that they don’t have access to 
the others instances.

Thanks,
Nathaniel

From: Larry McCay [mailto:lmc...@hortonworks.com]
Sent: mercredi 3 juin 2015 13:57
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation

Out of curiosity, what added benefit does having HttpFs run as separate team 
users give you?
If the APIs are invoked with SPNEGO or a user.name of the appropriate user, 
don't you get the same permissions-based protections?

Generally speaking, gateways such as HttpFs provide access on behalf of 
endusers.

On Jun 3, 2015, at 7:44 AM, Nathaniel Braun <n.br...@criteo.com> wrote:


Hi,

Thanks for your answer.

With this setup, only the HTTP user will be able to impersonate other users, so 
HTTPFS has to run with the HTTP user.

Instead, I need users to run HTTPFS with their own user, not with the HTTP user.

Thanks

From: Wellington Chevreuil [mailto:wellington.chevre...@gmail.com]
Sent: mercredi 3 juin 2015 13:41
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation


Hi, do you have the below properties in the core-site.xml file used by your HDFS?


<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>

Hello all,

We need to run several HTTPFS instances on our Hadoop cluster, with different 
users (basically, one HTTPFS per team).

In our setup, each HTTPFS instance runs as a team user and is allowed write 
access to that user’s directory only (so, HTTPFS does not run as the httpfs 
user).

However, this setup does not work, as we get exceptions related to 
impersonation, such as this one:

{"RemoteException":{"message":"User: team_user is not allowed to impersonate 
team_user","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}

So, it seems that HTTPFS unconditionally tries to impersonate a user, even 
though it’s running as that same user. Is there a way to somehow disable 
impersonation?

Thanks for your help.

Regards,
Nathaniel



Re: HTTPFS without impersonation

2015-06-03 Thread Larry McCay
Out of curiosity, what added benefit does having HttpFs run as separate team 
users give you?
If the APIs are invoked with SPNEGO or a user.name of the appropriate user, 
don't you get the same permissions-based protections?

Generally speaking, gateways such as HttpFs provide access on behalf of 
endusers.

On Jun 3, 2015, at 7:44 AM, Nathaniel Braun <n.br...@criteo.com> wrote:

Hi,

Thanks for your answer.

With this setup, only the HTTP user will be able to impersonate other users, so 
HTTPFS has to run with the HTTP user.

Instead, I need users to run HTTPFS with their own user, not with the HTTP user.

Thanks

From: Wellington Chevreuil [mailto:wellington.chevre...@gmail.com]
Sent: mercredi 3 juin 2015 13:41
To: user@hadoop.apache.org
Subject: Re: HTTPFS without impersonation


Hi, do you have the below properties in the core-site.xml file used by your HDFS?


<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>

Hello all,

We need to run several HTTPFS instances on our Hadoop cluster, with different 
users (basically, one HTTPFS per team).

In our setup, each HTTPFS instance runs as a team user and is allowed write 
access to that user’s directory only (so, HTTPFS does not run as the httpfs 
user).

However, this setup does not work, as we get exceptions related to 
impersonation, such as this one:

{"RemoteException":{"message":"User: team_user is not allowed to impersonate 
team_user","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}

So, it seems that HTTPFS unconditionally tries to impersonate a user, even 
though it’s running as that same user. Is there a way to somehow disable 
impersonation?

Thanks for your help.

Regards,
Nathaniel



Re: WebHdfs API

2015-06-02 Thread Larry McCay
To build on what Manoj has described, there are a few aspects to your usecase and 
cluster interaction from outside the cluster that you need to be aware of…

* If the cluster is secured - you will need to use SPNEGO - HttpClient does 
support this too
* if you need to invoke the REST APIs as each authenticated enduser rather than 
as a single identity representing the application then you will need your 
application to be configured as a trusted proxy to HDFS
* if you have the cluster firewalled off from the rest of your network then you 
will need to punch holes in the firewall for the host:ports that you need 
access to

You can simplify the above by using a reverse proxy like Apache Knox as an API 
Gateway for the REST APIs that you need access to.
See http://knox.apache.org/ for more details.
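For the basic HTTP client approach Manoj mentions below, a minimal Java sketch 
against an unsecured cluster could look like this; the NameNode host, port, file 
path, and user.name value are assumptions:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsOpenExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical NameNode HTTP address and file path; user.name only works on unsecured clusters
    URL url = new URL("http://namenode.example.com:50070/webhdfs/v1/tmp/sample.txt"
        + "?op=OPEN&user.name=hdfs");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setInstanceFollowRedirects(true);  // OPEN redirects the client to a DataNode

    try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);  // file contents
      }
    }
    conn.disconnect();
  }
}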

On Jun 2, 2015, at 9:56 AM, Manoj Babu <manoj...@gmail.com> wrote:

Hi - You can invoke the REST service using an HTTP request with the help of any 
HTTP client. For example, if it is a Java web application you can use the Apache 
Commons HTTP client.

Thanks.

On Tuesday, June 2, 2015, Carmen Manzulli <carmenmanzu...@gmail.com> wrote:
Hi,
I would like to know how to use the WebHDFS RESTful web service methods to open 
and read a file from a web application instead of using curl!

For example, I would like to know how to create a client and submit a GET HTTP 
request to the namenode.

Thanks in advance,

Carmen.


--
Cheers!
Manoj.




Re: Sr.Technical Architect/Technical Architect/Sr. Hadoop /Big Data Developer for CA, GA, NJ, NY, AZ Locations_(FTE)Full Time Employment

2014-11-12 Thread Larry McCay
Everyone should be aware that replying to this mail results in sending your
papers to everyone on the list


On Wed, Nov 12, 2014 at 8:17 PM, mark charts  wrote:

> Hello.
>
>
> I am interested. Attached are my cover letter and my resume.
>
>
> Mark Charts
>
>
>   On Wednesday, November 12, 2014 2:45 PM, Amendra Singh Gangwar <
> amen...@exlarate.com> wrote:
>
>
> Hi,
>
>
> Please let me know if you are available for this FTE position for CA, GA,
> NJ, NY with good travel.
> Please forward latest resume along with Salary Expectations, Work
> Authorization & Minimum joining time required.
>
> Job Descriptions:
>
> Positions: Technical Architect/Sr. Technical Architect/ Sr.
> Hadoop /Big Data Developer
>
> Location  : CA, GA, NJ, NY
>
> Job Type  : Full Time Employment
>
> Domain: BigData
>
>
>
> Requirement 1:
>
> Sr. Technical Architect: 12+ years of experience in the implementation
> role
> of high end software products in telecom/ financials/ healthcare/
> hospitality domain.
>
>
>
> Requirement 2:
>
> Technical Architect: 9+ years of experience in the implementation role of
> high end software products in telecom/ financials/ healthcare/ hospitality
> domain.
>
>
>
> Requirement 3:
>
> Sr. Hadoop /Big Data Developer 7+ years of experience in the
> implementation
> role of high end software products in telecom/ financials/ healthcare/
> hospitality domain.
>
>
>
> Education: Engineering Graduate, .MCA, Masters/Post Graduates (preferably
> IT/ CS)
>
>
>
> Primary Skills:
>
> 1. Expertise on Java/ J2EE and should still be hands on.
>
> 2. Implemented and in-depth knowledge of various java/ J2EE/ EAI patterns
> by
> using Open Source products.
>
> 3. Design/ Architected and implemented complex projects dealing with the
> considerable data size (GB/ PB) and with high complexity.
>
> 4. Sound knowledge of various Architectural concepts (Multi-tenancy, SOA,
> SCA etc) and capable of identifying and incorporating various NFR’s
> (performance, scalability, monitoring etc)
>
> 5. Good in database principles, SQL, and experience working with large
> databases (Oracle/ MySQL/ DB2).
>
> 6. Sound knowledge about the clustered deployment Architecture and should
> be
> capable of providing deployment solutions based on customer needs.
>
> 7. Sound knowledge about the Hardware (CPU, memory, disk, network,
> Firewalls
> etc)
>
> 8. Should have worked on open source products and also contributed towards
> it.
>
> 9. Capable of working as an individual contributor and within team too.
>
> 10. Experience in working in ODC model and capable of presenting the
> Design
> and Architecture to CTO’s, CEO’s at onsite
>
> 11. Should have experience/ knowledge on working with batch processing/
> Real
> time systems using various Open source technologies lik Solr, hadoop,
> NoSQL
> DB’s, Storm, kafka etc.
>
>
>
> Role & Responsibilities (Technical Architect/Sr. Technical Architect)
>
> •Anticipate on technological evolutions.
>
> •Coach the technical team in the development of the technical architecture.
>
> •Ensure the technical directions and choices.
>
> •Design/ Architect/ Implement various solutions arising out of the large
> data processing (GB’s/ PB’s) over various NoSQL, Hadoop and MPP based
> products.
>
> •Driving various Architecture and design calls with bigdata customers.
>
> •Working with offshore team and providing guidance on implementation
> details.
>
> •Conducting sessions/ writing whitepapers/ Case Studies pertaining to
> BigData
>
> •Responsible for Timely and quality deliveries.
>
> •Fulfill organization responsibilities – Sharing knowledge and experience
> within the other groups in the org., conducting various technical sessions
> and trainings.
>
>
>
> Role & Responsibilities (Sr. Hadoop /Big Data Developer)
>
> •Implementation of various solutions arising out of the large data
> processing (GB’s/ PB’s) over various NoSQL, Hadoop and MPP based products
>
> •Active participation in the various Architecture and design calls with
> bigdata customers.
>
> •Working with Sr. Architects and providing implementation details to
> offshore.
>
> •Conducting sessions/ writing whitepapers/ Case Studies pertaining to
> BigData
>
> •Responsible for Timely and quality deliveries.
>
> •Fulfill organization responsibilities – Sharing knowledge and experience
> within the other groups in the org., conducting various technical sessions
> and trainings
>
> --
>
> Thanks & Regards,
>
> Amendra Singh
>
> Exlarate LLC
>
> Cell : 323-250-0583
>
> E-mail :amen...@exlarate.com
>
> www.exlarate.com
>
> 
>
> Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail
> cannot be considered Spam as long as we include contact information and a
> remove link for removal from our mailing list. To be removed from our
> mailing list reply with "Remove" and include your "original email
> address/addresses" in the subject heading. Include complete
> address/a

Re: How to login a user with password to Kerberos Hadoop instead of ticket cache or key tab file ?

2014-10-06 Thread Larry McCay
Well, it seems to be committed to branch-2 - so I assume it will make it
into the next 2.x release.


On Mon, Oct 6, 2014 at 2:51 PM, Xiaohua Chen  wrote:

> Hi Larry,
>
> Thanks! This is the very right approach I am looking for.  Currently
> I am using Hadoop 2.3.0 , seems this API
> UserGroupInformation.getUGIFromSubject(subject) is only available from
> Hadoop 3.0.0 , which seems is not released yet. So when can I expect
> to get the downloadable for Hadoop 3.0.0 ?
>
> Thank you very much and best regards!
>
> Sophia
>
>
>
> On Mon, Oct 6, 2014 at 10:57 AM, Larry McCay 
> wrote:
> > You may find this approach interesting.
> > https://issues.apache.org/jira/browse/HADOOP-10342
> >
> > The idea is that you preauthenticate using JAAS/krb5 or something in your
> > application and then leverage the resulting java Subject to assert the
> > authenticated identity.
> >
> > On Mon, Oct 6, 2014 at 1:51 PM, Xiaohua Chen 
> wrote:
> >>
> >> Hi Experts,
> >>
> >> We have a use case which needs to login user into Kerberos hadoop
> >> using the kerberos user's name and password.
> >>
> >> I have searched around and only found that
> >> 1) one can login a user  from ticket cache ( this is the default one)
> or
> >> 2) login a user from this user's keytab file e.g.
> >>  UserGroupInformation.loginUserFromKeytabAndReturnUGI("sochen",
> >> "/tmp/sochen.keytab"));
> >>
> >> Can you shed some light how I can login a user using his kerberos
> >> password and get a UserGroupInformation object so I can invoke
> >> doAs() to access the HDFS file system ?
> >>
> >> Thanks a lot!
> >>
> >> Sophia
> >
> >
> >
> > CONFIDENTIALITY NOTICE
> > NOTICE: This message is intended for the use of the individual or entity
> to
> > which it is addressed and may contain information that is confidential,
> > privileged and exempt from disclosure under applicable law. If the
> reader of
> > this message is not the intended recipient, you are hereby notified that
> any
> > printing, copying, dissemination, distribution, disclosure or forwarding
> of
> > this communication is strictly prohibited. If you have received this
> > communication in error, please contact the sender immediately and delete
> it
> > from your system. Thank You.
>



Re: How to login a user with password to Kerberos Hadoop instead of ticket cache or key tab file ?

2014-10-06 Thread Larry McCay
You may find this approach interesting.
https://issues.apache.org/jira/browse/HADOOP-10342

The idea is that you preauthenticate using JAAS/krb5 or something in your
application and then leverage the resulting java Subject to assert the
authenticated identity.
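For illustration, a minimal sketch of that flow is below. The JAAS entry name
"KrbLogin" and the way credentials are supplied are assumptions, and
UserGroupInformation.getUGIFromSubject is only present in Hadoop versions that
include HADOOP-10342:

import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.login.LoginContext;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class PasswordKerberosLogin {
  public static void main(String[] args) throws Exception {
    final String user = args[0];
    final char[] password = args[1].toCharArray();

    // "KrbLogin" must be an entry for com.sun.security.auth.module.Krb5LoginModule in the
    // JAAS configuration file supplied via -Djava.security.auth.login.config=...
    LoginContext lc = new LoginContext("KrbLogin", callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName(user);
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword(password);
        }
      }
    });
    lc.login();
    Subject subject = lc.getSubject();

    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);

    // Assert the preauthenticated identity and act as that user
    UserGroupInformation ugi = UserGroupInformation.getUGIFromSubject(subject);
    ugi.doAs((java.security.PrivilegedExceptionAction<Void>) () -> {
      FileSystem fs = FileSystem.get(conf);
      for (org.apache.hadoop.fs.FileStatus s : fs.listStatus(new Path("/tmp"))) {
        System.out.println(s.getPath());
      }
      return null;
    });
  }
}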

On Mon, Oct 6, 2014 at 1:51 PM, Xiaohua Chen  wrote:

> Hi Experts,
>
> We have a use case which needs to login user into Kerberos hadoop
> using the kerberos user's name and password.
>
> I have searched around and only found that
> 1) one can login a user  from ticket cache ( this is the default one)  or
> 2) login a user from this user's keytab file e.g.
>  UserGroupInformation.loginUserFromKeytabAndReturnUGI("sochen",
> "/tmp/sochen.keytab"));
>
> Can you shed some light how I can login a user using his kerberos
> password and get a UserGroupInformation object so I can invoke
> doAs() to access the HDFS file system ?
>
> Thanks a lot!
>
> Sophia
>



Re: Reg HttpFS

2014-05-16 Thread Larry McCay
Hi Manoj -

This is often done by going through a gateway or intermediary that is
configured as a trusted proxy to the cluster. That is, the intermediary can
authenticate to the target services as itself with kerberos and dispatch
the REST request with a doas parameter that indicates the identity of the
user to issue the request on behalf of.
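The same trusted-proxy idea is also exposed programmatically in the Java client
API through proxy users. A minimal sketch, with a hypothetical principal, keytab
path, and user names, and assuming the hadoop.proxyuser.* settings allow it:

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class TrustedProxySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    UserGroupInformation.setConfiguration(conf);

    // The intermediary authenticates as itself (hypothetical principal and keytab path)
    UserGroupInformation gateway = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
        "gateway@EXAMPLE.COM", "/etc/security/keytabs/gateway.service.keytab");

    // ...and then acts on behalf of the end user; the NameNode only allows this if the
    // hadoop.proxyuser.gateway.hosts/groups settings permit it
    UserGroupInformation endUser = UserGroupInformation.createProxyUser("alice", gateway);
    endUser.doAs((PrivilegedExceptionAction<Void>) () -> {
      FileSystem fs = FileSystem.get(conf);
      fs.listStatus(new Path("/user/alice"));  // executed as alice
      return null;
    });
  }
}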

This is precisely what Apache Knox does for such deployments. You may want
to take a look there.
http://knox.apache.org

Currently, out of the box, Knox has an authentication provider to
authenticate HTTP Basic credentials against an LDAP server.
There is an ApacheDS LDAP server as part of the Knox distribution as well -
for quickly testing your deployment.

Feel free to engage the Knox user/dev lists.

HTH,

--larry


On Thu, May 15, 2014 at 4:44 AM, Manoj Babu  wrote:

> Hi Alejandro,
>
> Thanks for your response. Right now I am following this approach from an
> edge node where Kerberos is configured. I am not able to understand the hint
> provided. Can you provide a sample to trigger a request from another external
> machine, to authenticate where Kerberos is not configured on the client from
> which the request is to be triggered?
>
> $ kinit
>
> After entering pwd
>
> $ curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt
> http://localhost:14000/webhdfs/v1/?op=liststatus
>
> Cheers!
> Manoj.
>
>
> On Fri, May 9, 2014 at 12:13 PM, Alejandro Abdelnur wrote:
>
>> Manoj,
>>
>> Please look at
>> http://hadoop.apache.org/docs/r2.4.0/hadoop-hdfs-httpfs/httpfs-default.html and
>> look at the 'httpfs.authentication.*' properties.
>>
>> Thanks.
>>
>>
>> On Sun, May 4, 2014 at 5:27 AM, Manoj Babu  wrote:
>>
>>> Hi,
>>>
>>> How do I access files in HDFS using HttpFS when it is protected by Kerberos?
>>> Kerberos authentication works only where it is configured, e.g. the edge node.
>>> If I am triggering a request from another system then how do I authenticate?
>>>
>>> Kindly advise.
>>>
>>> Cheers!
>>> Manoj.
>>>
>>
>>
>>
>> --
>> Alejandro
>>
>
>



Re: HDP 2.0 REST API

2014-01-29 Thread Larry McCay
As you can see from the response - you are missing the operation from the
parameters.
See the API docs available:
http://hadoop.apache.org/docs/r1.0.4/webhdfs.html

For an example see the following:

curl -i "http://dc-bigdata5.bateswhite.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS"




On Wed, Jan 29, 2014 at 7:53 AM, Clay McDonald <
stuart.mcdon...@bateswhite.com> wrote:

>
>
> Hello, I'm attempting to access HDFS from my browser, but when I go to the
> URL http://dc-bigdata5.bateswhite.com:50070/webhdfs/v1/
>
>
>
> I get the following error;
>
>
>
> {"RemoteException":{"exception":"UnsupportedOperationException","javaClassName":"java.lang.UnsupportedOperationException","message":"op=NULL
> is not supported"}}
>
>
>
> Any ideas???
>
>
>
> Thanks,
>
> *Clay McDonald*
> Database Administrator
>
> Bates White, LLC
> 1300 Eye St, NW, Suite 600 East
> Washington, DC 20005
> Main: 202.408.6110
> Cell: 202.560.4101
> Direct: 202.747.5962
> Email: clay.mcdon...@bateswhite.com
>
> 
> This electronic message transmission contains information from Bates
> White, LLC, which may be confidential or privileged. The information is
> intended to be for the use of the individual or entity named above. If you
> are not the intended recipient, be aware that any disclosure, copying,
> distribution, or use of the contents of this information is prohibited.
>
> If you have received this electronic transmission in error, please notify
> me by telephone at 202.747.5962 or by electronic mail at
> clay.mcdon...@bateswhite.com immediately.
>
> *
>



Re: HDFS authentication

2014-01-10 Thread Larry McCay
Hi Pinak -

If you want to use the REST interface of webhdfs then you can set up Knox as
the Hadoop REST Gateway and authenticate against LDAP or other stores
through the Apache Shiro integration. This opens up your authentication
possibilities.

http://knox.incubator.apache.org/

It would then proxy your access to HDFS and the rest of Hadoop through the
gateway.

If you intend to only use the Hadoop command-line tooling then you are
limited to Kerberos for real authentication.

HTH.

--larry


On Fri, Jan 10, 2014 at 3:31 AM, Juan Carlos  wrote:

> As far as I know, the only authentication method available in hdfs 2.2.0
> is Kerberos, so it's not possible to authenticate with an URL.
> Regards
>
>
> 2014/1/10 Pinak Pani 
>
>> Does HDFS provide any built-in authentication out of the box? I wanted to
>> make explicit access to HDFS from Java. I wanted people to access HDFS
>> using "username:password@hdfs://client.skynet.org:9000/user/data" or
>> something like that.
>>
>> I am new to Hadoop. We are planning to use Hadoop mainly for Archiving
>> and probably processing at a later time. The idea is customers can setup
>> their own HDFS cluster and provide us the HDFS URL to dump the data to.
>>
>> Is it possible to have access to HDFS in a similar way we access
>> databases using credential?
>>
>> Thanks.
>>
>
>



Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Larry McCay
Hi Visioner -

Depending on your actual installation, you may have all of the other APIs
available to the CLI clients as well.
This would potentially be a valid usecase for Apache Knox - in the
incubator still - see: http://knox.incubator.apache.org/

Knox provides you with a Web API Gateway for Hadoop. There is of course
support for webhdfs built into the gateway.

What this would allow you to do is wall off your Hadoop cluster with
appropriate networking techniques - such as firewalls - and only open the
Knox Gateway port to the network that your external users have access to.

You can then authenticate incoming REST requests using BASIC authentication
against LDAP, or you can build a custom authentication provider for your
environment - if needed.

You would want to switch to the webhdfs API for moving files into HDFS
though.
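For example, a minimal sketch of that switch using the Java binding for the
webhdfs REST API; the NameNode HTTP address and paths are assumptions, and an
unsecured cluster is assumed:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsUpload {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical NameNode HTTP address; the webhdfs:// scheme speaks the webhdfs REST API
    FileSystem fs = FileSystem.get(URI.create("webhdfs://namenode.example.com:50070"), conf);
    fs.copyFromLocalFile(new Path("/tmp/local-upload.jpg"), new Path("/user/webapp/uploads/upload.jpg"));
    fs.close();
  }
}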

I would encourage you to subscribe to the user/dev lists for Knox and start
a discussion there. We would be happy to help you with your web app access
there.

thanks,

--larry



On Fri, Aug 30, 2013 at 7:51 AM, Nitin Pawar wrote:

> ssh has nothing to do with hdfs.
>
> there are three ways someone would want to write into hdfs
> 1) HDFS java api
> 2) hadoop command line tools
> 3) Webhdfs (doing post, put etc)
>
> In all above cases, there is no role of ssh. So you can assume that as
> long as no one has access to ssh-keys, no one can get into your hardware
> cluster. This does not mean that you have safe hdfs.
> To setup hdfs security you will need to
> 1) Isolate your cluster from public networks. (Even if your cluster has
> public IPs, your network should only allow traffic from known addresses)
> 2) Default hdfs security is like POSIX systems, so you can check that
> 3) If you really want security then you can go for Kerberos-based
> authentication to do anything on your cluster.
>
>
> Please wait for few experts to give you some ideas.
>
>
> On Fri, Aug 30, 2013 at 4:43 PM, Visioner Sadak 
> wrote:
>
>> Thanks a ton Nitin just wanted to confirm for the point below
>>
>> an external user won't be able to write into our cluster using any API,
>> right, as we didn't include his IP in our cluster using password-less ssh
>> for him? I guess ssh will prompt for a password for writes and reads; correct
>> me if I am wrong :)
>>
>>
>> only admin has ssh access to linux clusters
>> >if no one has ssh access then password less ssh does not do any harm.
>>
>> On Fri, Aug 30, 2013 at 12:35 PM, Nitin Pawar wrote:
>>
>>> well have access to read from hdfs using webhdfs :
>>> ===>you may want to secure it with IP and username based authentications
>>>
>>> as of now we dunt  have any security specific to hdfs user level we have
>>> se permissions=true for a particular user
>>> >if you are managing user level access control then it should be
>>> technically safe that anyone other that hdfs superuser can not create and
>>> change permissions for user directories.
>>>
>>> only admin has ssh access to linux clusters
>>> >if no one has ssh access then password less ssh does not do any
>>> harm.
>>>
>>>
>>> On Fri, Aug 30, 2013 at 12:17 PM, Visioner Sadak <
>>> visioner.sa...@gmail.com> wrote:
>>>
 well have access to read from hdfs using webhdfs

 as of now we dunt  have any security specific to hdfs

 user level we have se permissions=true for a particular user

 only admin has ssh access to linux clusters






 On Fri, Aug 30, 2013 at 12:14 PM, Nitin Pawar 
 wrote:

> Visioner,
> is your cluster accessible on public network?
> What kind of hdfs security you have kept in place?
> what is your cluster security?(user level, intranet level)
> who all have ssh-keys to login to any node on the cluster?
>
>
>
>
> On Fri, Aug 30, 2013 at 12:08 PM, Visioner Sadak <
> visioner.sa...@gmail.com> wrote:
>
>> also we have done a password-less ssh within our clusters only so
>> that  we can access the cluster but i guess this wont be the case for an
>> external user right
>>
>>
>> On Fri, Aug 30, 2013 at 12:05 PM, Visioner Sadak <
>> visioner.sa...@gmail.com> wrote:
>>
>>> Hello friends, we use the FileSystem.copyFromLocal method of the Java API
>>> within a Tomcat container to move data into Hadoop clusters. Will any other
>>> unauthorised user be able to write into our Hadoop cluster using the Java
>>> API, or is any extra authentication needed from our side?
>>>
>>
>>
>
>
> --
> Nitin Pawar
>


>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>
>
> --
> Nitin Pawar
>


Re: How to hide web hdfs URL

2013-08-15 Thread Larry McCay
Depending on your deployment scenario you can look into using Knox Gateway
to front webhdfs.
The host:port of the gateway will then be used in your webapp but the
endpoint would require authentication.
Webhdfs would be completely hidden behind the Gateway.

http://knox.incubator.apache.org/

You can start a discussion on the user list there, if you like.


On Thu, Aug 15, 2013 at 11:06 AM, Raj K Singh  wrote:

> not sure with the java but javascript can really help you in that, you can
> find my examples of url shortening javascript api on the web.
>
> 
> Raj K Singh
> http://www.rajkrrsingh.blogspot.com
> Mobile  Tel: +91 (0)9899821370
>
>
> On Thu, Aug 15, 2013 at 8:25 PM, Visioner Sadak 
> wrote:
>
>> cant i do it using java code instead of using a third party webpage
>>
>>
>> On Thu, Aug 15, 2013 at 8:20 PM, Raj K Singh wrote:
>>
>>> visit tinyURL.com or rarme.com / path.im
>>>
>>> 
>>> Raj K Singh
>>> http://www.rajkrrsingh.blogspot.com
>>> Mobile  Tel: +91 (0)9899821370
>>>
>>>
>>> On Thu, Aug 15, 2013 at 8:10 PM, Visioner Sadak <
>>> visioner.sa...@gmail.com> wrote:
>>>
 could you give a hint on how to use it


 On Thu, Aug 15, 2013 at 8:05 PM, Raj K Singh wrote:

> use tinyURL
>
> 
> Raj K Singh
> http://www.rajkrrsingh.blogspot.com
> Mobile  Tel: +91 (0)9899821370
>
>
> On Thu, Aug 15, 2013 at 8:00 PM, Visioner Sadak <
> visioner.sa...@gmail.com> wrote:
>
>> Hello friends,
>>
>> I have a front end jsp application which uses the below url to show
>> my hadoop file is there a way to hide the url coz it exposes port and
>> server name in my jsp page source
>>
>>
>> http://termin1:50070/webhdfs/v1/NN1Home/new_file_d561yht35-9a1a-4a7b-9n.jpg?op=OPEN
>> "
>>
>
>

>>>
>>
>



Re: Work on research project "Hadoop Security Design"

2013-02-27 Thread Larry McCay
Hi Thomas -

I think that you need to articulate the problems that you want to solve for
your university environment.
The subject that you chose indicates "inter-cloud environment" - so
depending on the inter-cloud problems that currently exist for your
environment there may be interesting work from the Rhino effort or with
Knox.

It seems that you are leaning toward data protection and encryption as a
solution to some problem within your stated problem subject.
I'd be interested in the usecase that you are addressing with it that is
"inter-cloud".
Another family of issues that would be interesting in the inter-cloud space
would be various identity federation issues across clouds.

@Charles - by GatewayFS do you mean HttpFS and are you asking whether Knox
is related to it?
If so, Knox is not directly related to HttpFS though it will leverage
lessons learned and hopefully the experience of those involved.
The Knox gateway is more transparent and committed to serving REST APIs to
numerous Hadoop services rather than just HDFS.
The pluggable providers of Knox gateway will also facilitate easier
integration with customer's identity infrastructure in on-prem and cloud
provider environments.

Hope that helps to draw the distinction between Knox and HttpFS.

thanks,

--larry

On Wed, Feb 27, 2013 at 9:40 AM, Charles Earl wrote:

> Is this in any way related to GatewayFS?
> I am also curious whether any one knows of plans to incorporate
> homomorphic encryption or secure multiparty into the rhino effort.
> C
>
> On Feb 27, 2013, at 9:30 AM, Nitin Pawar wrote:
>
> I am not sure if you guys have heard it or not
>
> HortonWorks is in process to incubate a new apache project called Knox for
> hadoop security.
> More on this you can look at
>
> http://hortonworks.com/blog/introducing-knox-hadoop-security/
>
> http://wiki.apache.org/incubator/knox
>
>
> On Wed, Feb 27, 2013 at 7:51 PM, Thomas Nguy  wrote:
>
>> Thank you very much Panshul, I'll take a look.
>>
>> Thomas.
>>
>>   --
>> *From:* Panshul Whisper 
>> *To:* user@hadoop.apache.org; Thomas Nguy 
>> *Sent:* Wednesday, February 27, 2013 1:53 PM
>> *Subject:* Re: Work on research project "Hadoop Security Design"
>>
>> Hello Thomas,
>>
>> you can look into this project. This is exactly what you are doing, but
>> at a larger scale.
>> https://github.com/intel-hadoop/project-rhino/
>>
>> Hope this helps,
>>
>> Regards,
>> Panshul
>>
>>
>> On Wed, Feb 27, 2013 at 1:49 PM, Thomas Nguy wrote:
>>
>> Hello developers !
>>
>> I'm a student at the French university "Ensimag" and currently doing my
>> master's research on "Software security". Interested in cloud computing, I
>> chose as my subject: "Secure hadoop cluster inter-cloud environment".
>> My idea is to develop a framework in order to improve the security of the
>> Hadoop cluster running on the cloud of my uni. I have started by
>> checking the "Hadoop research projects" proposed  on Hadoop Wiki and the
>> following subject fits with mine:
>>
>> "Hadoop Security Design:
>>  An end-to-end proposal for how to support authentication and client
>> side data encryption/decryption, so that large data sets can be stored in a
>> public HDFS and only jobs launched by authenticated users can map-reduce or
>> browse the data"
>>
>> I would like to know if there are already some developers on it so we can
>> discuss... To be honest, I'm kind of a "beginner" regarding Hadoop and
>> cloud computing, so it would be really great if you had some advice or
>> hints for my research.
>>
>> Best regards.
>> Thomas
>>
>>
>>
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101
>>
>>
>>
>
>
> --
> Nitin Pawar
>
>
>