Hi everyone,

We're trying to get HDFS running in Kubernetes with Kerberos.
As you might expect, this comes with some challenges.
We have opened an issue for this, which includes a spike:
https://issues.apache.org/jira/browse/HDFS-16577

Currently (as of 3.2.2, and judging by the release notes this has not
changed since), DataNodes use the same properties both to decide which port
each service binds to and to decide which ports are included in the
`DatanodeRegistration` sent to the NameNode. In addition, the NameNode
overwrites the DataNode's IP address with the address the registration
arrives from.

Both behaviors prevent external users from connecting to DataNodes that are
hosted behind some sort of NAT (such as Kubernetes), because the address and
ports the NameNode hands out are not the ones reachable from outside.
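
To make the two issues concrete, here is a toy Python model of the
registration flow described above. It is only a sketch and not Hadoop code:
the class, the function names, and all addresses and ports are made up for
illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Registration:
    """Hypothetical stand-in for the address fields a DataNode registers."""
    ip: str
    port: int


def datanode_register(bind_port: int, pod_ip: str) -> Registration:
    # Issue 1: the port the service binds to is also the port that is
    # reported to the NameNode; there is no separate "advertised" port.
    return Registration(ip=pod_ip, port=bind_port)


def namenode_accept(reg: Registration, incoming_ip: str) -> Registration:
    # Issue 2: the reported IP is replaced by the address the registration
    # arrived from, so the DataNode cannot advertise a different one.
    return replace(reg, ip=incoming_ip)


# Made-up values: a pod-internal address, the address the NameNode sees,
# and the address an external client would actually have to use.
reg = datanode_register(bind_port=9866, pod_ip="10.42.0.17")
stored = namenode_accept(reg, incoming_ip="10.42.0.1")
external = Registration(ip="203.0.113.10", port=31866)

print(stored)
print(stored == external)
```

In this model the NameNode ends up storing a cluster-internal address and
the bind port, neither of which matches the externally reachable endpoint.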

We would go ahead with a proper implementation and PR, but wanted to ask
for comments and feedback first. Perhaps someone has already done some work
in this area that we have missed.

Thank you!

Cheers,
Lars
