Cool. The doc will need some refinement as it isn't entirely accurate. In addition, we need to distinguish between Airflow as a client of kerberized services (which is what the astronomer doc talks about) and kerberizing Airflow itself, which the API supports.
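To make the client side a bit more concrete: the ticket renewer (`airflow kerberos`) essentially re-runs kinit against a keytab at a fixed interval. Below is a rough sketch of the kind of command involved, not the actual Airflow code. The function name `build_kinit_command`, the principal and the paths are made up for illustration; the flags are the standard MIT kinit ones.

```python
from typing import List


def build_kinit_command(
    principal: str,
    keytab: str,
    ccache: str,
    renewable_minutes: int,
    kinit_path: str = "kinit",
) -> List[str]:
    """Build (but do not run) a kinit command that gets a ticket from a keytab.

    Hypothetical helper for illustration only.
    """
    if renewable_minutes <= 0:
        raise ValueError("renewable_minutes must be positive")
    return [
        kinit_path,
        "-r", f"{renewable_minutes}m",  # requested renewable lifetime
        "-k",                           # use a keytab instead of prompting for a password
        "-t", keytab,                   # keytab file holding the credentials
        "-c", ccache,                   # credentials cache the ticket is written to
        principal,
    ]


if __name__ == "__main__":
    # All values below are placeholders, not defaults from any real deployment.
    cmd = build_kinit_command(
        principal="airflow/host.example.com@EXAMPLE.COM",
        keytab="/etc/security/airflow.keytab",
        ccache="/tmp/airflow_krb5_ccache",
        renewable_minutes=60,
    )
    print(" ".join(cmd))
```

Running the printed command by hand is a quick way to check that a keytab and principal actually work before starting the renewer.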
In general, to access kerberized services (Airflow as a client) one needs to start the ticket renewer with a valid keytab. The hooks don't always need to be changed to support it. Hadoop CLI tools often just pick up the ticket, as their client config is set to do so. Then there is another class: HTTP-like services which are accessed by urllib under the hood. These typically use SPNEGO, and their hooks often need to be adjusted as SPNEGO requires some urllib config. Finally, there are protocols which use SASL with Kerberos, like HDFS (not webhdfs, that uses SPNEGO). These require per-protocol implementations.

Off the top of my head, we support Kerberos client side now with:

* Spark
* HDFS (snakebite python 2.7, cli and with the upcoming libhdfs implementation)
* Hive (not metastore afaik)

A few things to remember:

* If a job (e.g. a Spark job) will finish later than the maximum ticket lifetime, you probably need to provide a keytab to said application. Otherwise you will get failures after the expiry.
* A keytab (used by the renewer) is a set of credentials (user and pass), so jobs are executed under whichever keytab is in use at that moment.
* Securing keytabs in multi-tenant Airflow is a challenge. This also goes for securing connections. This we need to fix at some point. The solution for now seems to be no multi tenancy.

Kerberos seems harder than it is btw. Still, we are sometimes moving away from it to OAuth2-based authentication. This gets us closer to cloud standards (but we are on prem).

B.

Sent from my iPhone

> On 27 Jul 2018, at 17:41, Hitesh Shah <hit...@apache.org> wrote:
>
> Hi Taylor
>
> +1 on upstreaming this. It would be great if you can submit a pull request
> to enhance the apache airflow docs.
>
> thanks
> Hitesh
>
>
>> On Thu, Jul 26, 2018 at 2:32 PM Taylor Edmiston <tedmis...@gmail.com> wrote:
>>
>> While we're on the topic, I'd love any feedback from Bolke or others who've
>> used Kerberos with Airflow on this quick guide I put together yesterday.
>> It's similar to what's in the Airflow docs but instead all on one page
>> and slightly expanded.
>>
>> https://github.com/astronomerio/airflow-guides/blob/master/guides/kerberos.md
>> (or web version <https://www.astronomer.io/guides/kerberos/>)
>>
>> One thing I'd like to add is a minimal example of how to Kerberize a hook.
>>
>> I'd be happy to upstream this as well if it's useful (maybe a Concepts >
>> Additional Functionality > Kerberos page?)
>>
>> Best,
>> Taylor
>>
>> *Taylor Edmiston*
>> Blog <https://blog.tedmiston.com/> | CV
>> <https://stackoverflow.com/cv/taylor> | LinkedIn
>> <https://www.linkedin.com/in/tedmiston/> | AngelList
>> <https://angel.co/taylor> | Stack Overflow
>> <https://stackoverflow.com/users/149428/taylor-edmiston>
>>
>>
>> On Thu, Jul 26, 2018 at 5:18 PM, Driesprong, Fokko <fo...@driesprong.frl>
>> wrote:
>>
>>> Hi Ry,
>>>
>>> You should ask Bolke de Bruin. He's really experienced with Kerberos and he
>>> did also the implementation for Airflow. Beside that he worked also on
>>> implementing Kerberos in Ambari. Just want to let you know.
>>>
>>> Cheers, Fokko
>>>
>>> On Thu, 26 Jul 2018 at 23:03, Ry Walker <r...@astronomer.io> wrote:
>>>
>>>> Hi everyone -
>>>>
>>>> We have several bigCo's who are considering using Airflow asking into its
>>>> support for Kerberos.
>>>>
>>>> We're going to work on a proof-of-concept next week, will likely record a
>>>> screencast on it.
>>>>
>>>> For now, we're looking for any anecdotal information from organizations who
>>>> are using Kerberos with Airflow, if anyone would be willing to share their
>>>> experiences here, or reply to me personally, it would be greatly
>>>> appreciated!
>>>>
>>>> -Ry
>>>>
>>>> --
>>>>
>>>> *Ry Walker* | CEO, Astronomer <http://www.astronomer.io/> | 513.417.2163 |
>>>> @rywalker <http://twitter.com/rywalker> | LinkedIn
>>>> <http://www.linkedin.com/in/rywalker>
>>