Thank you for writing back with a detailed clarification.

Regarding encryption at rest, HDFS is adding it as HDFS-6134, so likely
there will be a new core feature option for the ecosystem to consider
shortly.

> I don’t feel one technology or one company or one small group or one
> approach can solve this problem. This has to be addressed by the community
> working together. This would also require a lot of support from each
> dependent project and a lot of co-ordination. And there would be multiple
> security solutions available for the end users to pick from.

Completely agreed. However, the desired community cooperation has both
technical and political components. I think there are some concerns about
how successful an outcome Argus may produce, informed by experience.
Perhaps it would be worthwhile to address those concerns. Argus proposes to
develop a common security infrastructure for the Hadoop ecosystem. In my
opinion (and informed by personal experience) we have new incubating Hadoop
ecosystem security projects like Sentry and Knox and proposals such as
Argus because Hadoop core is locked down. Argus et al. are like the
proverbial blocked river (user demand for features) seeking a new route
around a landslide (obvious poisonous contention and litigation-via-JIRA on
every significant topic). I would be curious to hear your thoughts on how to
avoid the same end state in the Argus project. In my opinion, it would be a
tragedy if a potential solution instead ended up spreading the dysfunction
it seeks to bypass to a greater proportion of Foundation projects. A
Hadoop ecosystem project attempting to remain independent from the
dysfunction of Hadoop core would be well advised to stay away from adoption
of Argus components (security is so critical) if the governance of Argus
perpetuates that dysfunction. By the way, it is also not too late for Knox
and Sentry.



On Wed, Jul 16, 2014 at 11:34 PM, Don Bosco Durai <bdu...@hortonworks.com>
wrote:

> > How do you define the 'Hadoop complex eco-system'? If that definition
> Agreed, complex is a relative term. I used the term complex because more
> than 20 products now use Hadoop and the list is growing. There are 10
> products listed on http://hadoop.apache.org/. Then there are other projects
> like Accumulo, Impala, Storm, Kafka, Falcon, Pig, Flume, Sqoop, Oozie, etc.
> which use HDFS or support/enable other products within the Hadoop ecosystem.
> If we dig deeper, each component might have multiple processes (Name Node,
> Data Node, Job Tracker, Storm Nimbus Server, HBase Master Servers, HBase
> Region Servers, HA, etc). With YARN, users can now run their applications
> in the cluster, which is a great feature, but it is very scary from a
> security point of view, because users can now write their custom
> applications and run them within a secure data center.
>
> I don’t feel one technology or one company or one small group or one
> approach can solve this problem. This has to be addressed by the community
> working together. This would also require a lot of support from each
> dependent project and a lot of co-ordination. And there would be multiple
> security solutions available for the end users to pick from.
>
> > includes projects such as HBase, we have significant security controls, so
> The mature projects have started beefing up their security features. In
> recent releases, HBase added cell based access control and encryption, HDFS
> added advanced ACLs and is now working on file level encryption, and Hive
> added ATZ-NG but no encryption yet. The newer ones like Solr, Storm, Falcon
> have very basic security controls. On the good news side, most components
> have started supporting Kerberos and SSL. But encryption at rest is still a
> challenge. In most cases it is all or none, except probably HBase and
> Accumulo. Access control and auditing are also not that mature among the
> newer projects. The goal here is not to reinvent or impose on each
> project, but to reuse the existing security technologies consistently
> across projects and at the same time extend them where applicable.
>
> > or the combination of Hive+Sentry would agree with that statement either.
> Personally, Hive is my ideal role model for all Hadoop projects to follow.
> Out of the box, it has built-in access control, but also provides APIs to
> plug in your own authorization model. Now security projects like Argus can
> extend it to support attribute based access control, cell based access
> control, tagging, multi-tenancy, auditing, etc. Users, based on their
> security requirements or appetite, might decide to go with the default or
> choose one of the other security providers. Similar requirements might be
> there for HBase, but expecting all Hadoop components to keep up with each
> other is counterproductive, while a dedicated security provider (project)
> might do a more extensive and uniform job. Users might also pick multiple
> security providers within their cluster to address specific security
> concerns.
>
> Since we are on the topic of complexity, one of the reasons Hadoop is
> popular is its openness. Hive might be on top of anything, e.g.
> on HDFS, HBase+HDFS, flat files, etc. While you can run SQL queries via
> Hive, you can also write a Pig or MR job to access the underlying HDFS files
> directly. This is a powerful feature, which gives users the ability to run
> sophisticated analytical jobs or use enterprise grade BI tools. But this
> also allows users to circumvent Hive’s native security. For Hive or any
> native component, cross component security is out of scope (and should be).
> This problem can be solved by security providers like Argus, which can
> enforce adequate security consistently across component or project
> boundaries.
>
> Happy to discuss more on this topic.
>
> Thanks
>
> Bosco
>
>
> On Jul 16, 2014, at 7:38 PM, Andrew Purtell <apurt...@apache.org> wrote:
>
> > This statement might not be quite right:
> >
> >> Even within Hadoop complex eco-system, each components have limited or no
> >> security controls.
> >
> > How do you define the 'Hadoop complex eco-system'? If that definition
> > includes projects such as HBase, we have significant security controls, so
> > that wouldn't be a correct statement. Not sure those working on Accumulo,
> > or the combination of Hive+Sentry would agree with that statement either.
> >
> > It's not necessary to survey the Hadoop ecosystem before incubating of
> > course, or even after, but it sounds like that might be a good idea.
> >
> >
> >
> > On Wed, Jul 16, 2014 at 5:06 PM, Don Bosco Durai <bdu...@hortonworks.com>
> > wrote:
> >
> >> Hi JB
> >>
> >> We will be centralizing the administration and auditing for Knox. And we
> >> will be also standardizing the authentication for web applications for all
> >> components within Hadoop ecosystem, for which we might consider Shiro. I
> >> would like to understand more about Syncope and see how production ready
> >> it is...
> >>
> >> The principle is to leverage existing security solutions where applicable.
> >> Even within Hadoop complex eco-system, each components have limited or no
> >> security controls. Instead of re-inventing everything, we will extend the
> >> core component security capabilities and add where needed. So the security
> >> is uniform, pluggable and scalable.
> >>
> >> Providing a layered security along with central administration and
> >> auditing capabilities will enhance the security, usability, enterprise
> >> integration, compliance, etc. which will lead to more adoption of Apache
> >> Hadoop and projects working within its eco system.
> >>
> >> Regards
> >>
> >> Bosco
> >>
> >>
> >> On Jul 16, 2014, at 12:12 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> it looks interesting.
> >>>
> >>> Do you have an idea about the interactions with other projects (Knox,
> >>> Shiro, Syncope, whatever)?
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> --
> >>> Jean-Baptiste Onofré
> >>> jbono...@apache.org
> >>> http://blog.nanthrax.net
> >>> Talend - http://www.talend.com
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >>> For additional commands, e-mail: general-h...@incubator.apache.org
> >>>
> >>
> >>
> >> --
> >> CONFIDENTIALITY NOTICE
> >> NOTICE: This message is intended for the use of the individual or entity to
> >> which it is addressed and may contain information that is confidential,
> >> privileged and exempt from disclosure under applicable law. If the reader
> >> of this message is not the intended recipient, you are hereby notified that
> >> any printing, copying, dissemination, distribution, disclosure or
> >> forwarding of this communication is strictly prohibited. If you have
> >> received this communication in error, please contact the sender immediately
> >> and delete it from your system. Thank You.
> >>
> >>
> >>
> >
> >
> > --
> > Best regards,
> >
> >   - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
>
>
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
