Hello Lars,

I can't say I've personally run HDFS on Kubernetes with Kerberos enabled.
However, some of the issues you raise sound like they overlap with the
HDFS multi-homing features:

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html

Have you seen this? Does anything look helpful there?
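
As a rough sketch of the settings that page describes, the short Python
script below renders them as an hdfs-site.xml fragment. The property names
come from the multi-homing documentation. The values, the helper name
render_hdfs_site and the idea of generating the file from a script are
illustrative assumptions, not a tested Kubernetes setup. As far as I can
tell, these settings affect which addresses and hostnames are used, not
which ports end up in the `DatanodeRegistration`.

```python
import xml.etree.ElementTree as ET

# Property names are taken from the HDFS multi-homing documentation.
# The values are assumptions chosen for illustration only.
MULTIHOMING_PROPERTIES = {
    # Let the NameNode endpoints listen on all interfaces.
    "dfs.namenode.rpc-bind-host": "0.0.0.0",
    "dfs.namenode.servicerpc-bind-host": "0.0.0.0",
    "dfs.namenode.http-bind-host": "0.0.0.0",
    "dfs.namenode.https-bind-host": "0.0.0.0",
    # Connect to DataNodes by hostname instead of the IP address
    # handed out by the NameNode.
    "dfs.client.use.datanode.hostname": "true",
    "dfs.datanode.use.datanode.hostname": "true",
}


def render_hdfs_site(properties):
    """Render a dict of property names and values as hdfs-site.xml text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hdfs_site(MULTIHOMING_PROPERTIES))
```

Whether hostname-based connections are enough behind NAT depends on the
DataNode hostnames being resolvable and reachable from outside the cluster.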

Chris Nauroth


On Fri, Jun 24, 2022 at 4:55 AM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi everyone,
>
> We're trying to get HDFS running in Kubernetes using Kerberos.
> This has some challenges as you might expect.
> We have created an issue for that including a spike:
> https://issues.apache.org/jira/browse/HDFS-16577
>
> Currently (as of 3.2.2, but reading through the release notes this doesn't
> seem to have changed since then) DataNodes use the same properties both to
> decide which port each service binds to and to decide which ports
> are included in the `DatanodeRegistration` sent to the NameNode. Further,
> NameNodes overwrite the DataNode's IP address with the incoming address
> during registration.
>
> Both of these prevent external users from connecting to DataNodes that are
> hosted behind some sort of NAT (such as Kubernetes).
>
> We'd go ahead with a proper implementation/PR but we thought about asking
> for comments/feedback first. Maybe someone else has already done some work
> here that we might have missed etc.
>
> Thank you!
>
> Cheers,
> Lars
>
