Hi All,
We are trying to make major version upgrades of Accumulo and Hadoop:
Accumulo: 1.8.1 --> 2.0.1
Hadoop: 2.8.2 --> 3.0.3
Each runs as a single, non-clustered container in a Docker environment. They talk
to a separate ZooKeeper container running version 3.4.10.
Other client containers authenticate with Kerberos in order to retrieve
information from Accumulo. We have a KDC running on a separate authentication
VM which is reachable by both containers.
Since upgrading, we have a problem with Kerberos authentication. On
initialisation, Accumulo and Hadoop run kinit to obtain ticket-granting tickets
(TGTs) which are valid for 24h, and the full application works as expected.
After 24h, however, our client containers can no longer authenticate to Accumulo
or access its data, despite the clients holding valid service tickets for
Accumulo.
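To make the timing above concrete, here is a minimal sketch of when a long-running service would be expected to refresh its TGT. It assumes the service re-logs in from its keytab once a fixed fraction of the ticket lifetime has elapsed. The 0.80 fraction is an assumption (as far as I know it matches the renew window used by Hadoop's UserGroupInformation), and the function names are hypothetical, not taken from Accumulo or Hadoop.

```python
from datetime import datetime, timedelta

# Assumed fraction of the ticket lifetime after which a re-login is attempted.
RENEW_WINDOW = 0.80


def relogin_due_at(start, end, window=RENEW_WINDOW):
    """Return the time at which a refresh of the TGT would be due."""
    return start + (end - start) * window


def needs_relogin(now, start, end, window=RENEW_WINDOW):
    """True once 'now' has passed the refresh point for this ticket."""
    return now >= relogin_due_at(start, end, window)


# Example: a 24h TGT issued at 09:00 (illustrative timestamps only).
issued = datetime(2020, 1, 1, 9, 0, 0)
expires = issued + timedelta(hours=24)
print("refresh due at:", relogin_due_at(issued, expires))
print("due after 20h?", needs_relogin(issued + timedelta(hours=20), issued, expires))
```

If no refresh happens by that point, the ticket simply runs out at the 24h mark, which would match the symptom described.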
If we manually regenerate the TGT within Accumulo with another kinit command,
the problem persists.
If we manually change the KDC's configuration to issue 10-minute tickets rather
than 24h ones, then authentication breaks after 10 minutes, regardless of the
ticket_lifetime set in each component's krb5.conf file.
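The 10-minute result is consistent with how lifetimes are negotiated: ticket_lifetime in krb5.conf is only the lifetime the client requests, and the KDC will not issue a ticket longer than its own maximum. A minimal sketch of that rule follows. The helper names are hypothetical, and the parser only handles simple values such as "24h" or "10m", not every duration format krb5.conf accepts.

```python
import re

# Seconds per unit for simple durations such as "10m", "24h" or "7d".
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a simple duration string (e.g. '24h') to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


def effective_lifetime(requested, kdc_max):
    """The issued lifetime is capped by whichever limit is shorter."""
    return min(parse_duration(requested), parse_duration(kdc_max))


# Client asks for 24h, KDC allows at most 10 minutes.
print(effective_lifetime("24h", "10m"), "seconds")
```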
Before the upgrade, all our containers must have been able to obtain new tickets
once their old ones had expired, but this no longer seems to be the case. The
only way to fix the problem is to restart the containers. Below are the
configuration files for the various containers. Anything surrounded by "<>" is
substituted in at runtime and points to a valid path, file or value.
Accumulo accumulo.properties:
general.kerberos.keytab=<KRB_KEYTAB>
general.kerberos.principal=<KRB_PRINCIPAL>
instance.rpc.sasl.enabled=true
instance.secret=<SECRET>
instance.security.authenticator=org.apache.accumulo.server.security.handler.KerberosAuthenticator
instance.security.authorizor=org.apache.accumulo.server.security.handler.KerberosAuthorizor
instance.security.permissionHandler=org.apache.accumulo.server.security.handler.KerberosPermissionHandler
instance.volumes=<HDFS_VOLUMES>
instance.zookeeper.host=<ZOOKEEPERS>
rpc.sasl.qop=auth
trace.token.property.keytab=<KRB_KEYTAB>
trace.token.type=org.apache.accumulo.core.client.security.tokens.KerberosToken
trace.user=<KRB_PRINCIPAL@>
tserver.cache.data.size=<CACHE_DATA_SIZE>
tserver.cache.index.size=<CACHE_INDEX_SIZE>
tserver.memory.maps.max=<MEMORY_MAPS_MAX>
tserver.memory.maps.native.enabled=false
tserver.sort.buffer.size=<SORT_BUFFER_SIZE>
tserver.walog.max.size=<WALOG_MAX_SIZE>
Accumulo accumulo-client.properties:
instance.name=accumulo
instance.zookeepers=<ZOOKEEPERS>
instance.zookeepers.timeout=30s
auth.type=kerberos
auth.principal=<KRB_PRINCIPAL>
auth.token=<KRB_KEYTAB>
sasl.enabled=true
sasl.qop=auth
sasl.kerberos.server.primary=accumulo
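As a sanity check on the two Accumulo files above, here is a small sketch that compares the SASL settings on the server and client side. The pairing of keys is an assumption on my part (that instance.rpc.sasl.enabled should agree with sasl.enabled, and rpc.sasl.qop with sasl.qop), and the helper names are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# (server key, client key) pairs assumed to need matching values.
PAIRS = [
    ("instance.rpc.sasl.enabled", "sasl.enabled"),
    ("rpc.sasl.qop", "sasl.qop"),
]


def mismatches(server, client):
    """Return the key pairs whose values differ between the two files."""
    return [(s, c) for s, c in PAIRS if server.get(s) != client.get(c)]


server = parse_properties("instance.rpc.sasl.enabled=true\nrpc.sasl.qop=auth\n")
client = parse_properties("sasl.enabled=true\nsasl.qop=auth\n")
print(mismatches(server, client))
```

With the values listed above, the two files agree on both pairs.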
Accumulo / Hadoop / ZooKeeper krb5.conf (all identical):
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
default_realm = NRAC.UK
#default_ccache_name = KEYRING:persistent:%{uid}
udp_preference_limit = 1
[realms]
NRAC.UK = {
    kdc = <ldap-server>
    admin_server = <ldap-server>
    default_domain = <our_domain>
    database_module = openldap_ldapconf
}
[domain_realm]
.<our_domain> = <OUR_DOMAIN>
<our_domain> = <OUR_DOMAIN>
[dbdefaults]
ldap_kerberos_container_dn = cn=krbContainer,dc=nrac,dc=uk
[dbmodules]
openldap_ldapconf = {
    db_library = kldap
    ldap_kdc_dn = "cn=nrac-ldapadm,dc=nrac,dc=uk"
    ldap_kadmind_dn = "cn=nrac-ldapadm,dc=nrac,dc=uk"
    ldap_service_password_file = /etc/krb5kdc/service.keyfile
    ldap_servers = ldaps://<ldap-server>
    ldap_conns_per_server = 5
}
Any help would be much appreciated. Many thanks,
Alex Sparks