Ok, so we can confirm that the TabletServer logged in. This is what your .out file is telling us.

This is looking like you don't have your software configured correctly. As this error message is trying to tell you, HDFS supports three types of authentication tokens: "simple", "token", and "kerberos". "Simple" is what is used by default (without Kerberos). "Kerberos" refers to clients with a Kerberos ticket. Ignore "token" for now as it's irrelevant.

We can tell from the stack trace that the TabletServer made an RPC to the NameNode (the getListing call through ClientNamenodeProtocolTranslatorPB). For some reason, this RPC was requesting "simple" authentication and not "kerberos". The NameNode is telling you that it's not allowed to accept your "simple" token and that you need to use "kerberos" (or "token").

If I had to venture a guess, it would be that you have Accumulo configured to use the wrong Hadoop configuration files, notably core-site.xml and hdfs-site.xml.

Try the `accumulo classpath` command and verify that the Hadoop configuration files included there are the correct ones (the ones that are configured for Kerberos).
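
One rough way to do that check from the shell, as a sketch: take each Hadoop configuration directory that `accumulo classpath` prints and see what the core-site.xml in it says for hadoop.security.authentication. The helper name hadoop_auth_mode and the example path are made up for illustration, and the parsing assumes the usual layout where <value> directly follows <name>.

```shell
# Hypothetical helper: print the hadoop.security.authentication value found in
# <conf_dir>/core-site.xml. Falls back to "simple", Hadoop's default, when the
# property is not set. Assumes <value> directly follows <name>.
hadoop_auth_mode() {
  conf_file="$1/core-site.xml"
  if [ ! -r "$conf_file" ]; then
    echo "cannot read $conf_file" >&2
    return 1
  fi
  # Flatten the file to one line so the name/value pair can be matched together.
  value=$(tr '\n' ' ' < "$conf_file" | sed -n 's:.*<name>[[:space:]]*hadoop\.security\.authentication[[:space:]]*</name>[[:space:]]*<value>[[:space:]]*\([^<[:space:]]*\)[[:space:]]*</value>.*:\1:p')
  echo "${value:-simple}"
}

# Example (the directory is a placeholder; use the one from `accumulo classpath`):
#   hadoop_auth_mode /etc/hadoop/conf
```

If this prints "simple" for the directory Accumulo is picking up, that would match the error you're seeing.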

- Josh

roman.drap...@baesystems.com wrote:
Hi Josh,

Tried on the tserver - does not really give more information. I am trying just 
bin/start-here.sh on the slave (master is successfully running).

This is what I can see in ".out" log

2016-01-26 22:29:18,233 [security.SecurityUtil] INFO : Attempting to login with keytab as 
accumulo/<host>@<realm>
2016-01-26 22:29:18,371 [util.NativeCodeLoader] WARN : Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2016-01-26 22:29:18,685 [security.SecurityUtil] INFO : Succesfully logged in as user 
accumulo/<host>@<realm>

This is the ".log" log - it looks like something with the ZooKeeper utils.

2016-01-26 22:29:19,963 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose 
set to false in hdfs-site.xml: data loss is possible on hard system reset or 
power loss
2016-01-26 22:29:19,966 [conf.Property] DEBUG: Loaded class : 
org.apache.accumulo.server.fs.PerTableVolumeChooser
2016-01-26 22:29:20,050 [zookeeper.ZooUtil] ERROR: Problem reading instance id out of 
hdfs at hdfs://<host>:8020/accumulo/instance_id
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not 
enabled.  Available:[TOKEN, KERBEROS]
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method)
         at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
         at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
         at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
         at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1965)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1946)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
         at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
         at 
org.apache.accumulo.core.zookeeper.ZooUtil.getInstanceIDFromHdfs(ZooUtil.java:59)
         at 
org.apache.accumulo.core.zookeeper.ZooUtil.getInstanceIDFromHdfs(ZooUtil.java:51)
         at 
org.apache.accumulo.server.client.HdfsZooInstance._getInstanceID(HdfsZooInstance.java:137)
         at 
org.apache.accumulo.server.client.HdfsZooInstance.getInstanceID(HdfsZooInstance.java:121)
         at 
org.apache.accumulo.server.conf.ServerConfigurationFactory.<init>(ServerConfigurationFactory.java:113)
         at 
org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:2952)
         at 
org.apache.accumulo.tserver.TServerExecutable.execute(TServerExecutable.java:33)
         at org.apache.accumulo.start.Main$1.run(Main.java:93)
         at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
 SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
         at org.apache.hadoop.ipc.Client.call(Client.java:1468)
         at org.apache.hadoop.ipc.Client.call(Client.java:1399)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
         at com.sun.proxy.$Proxy22.getListing(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:554)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy23.getListing(Unknown Source)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1963)
         ... 16 more
2016-01-26 22:29:20,053 [tserver.TabletServer] ERROR: Uncaught exception in 
TabletServer.main, exiting
java.lang.RuntimeException: Can't tell if Accumulo is initialized; can't read 
instance id at hdfs://<host>:8020/accumulo/instance_id
         at 
org.apache.accumulo.core.zookeeper.ZooUtil.getInstanceIDFromHdfs(ZooUtil.java:76)
         at 
org.apache.accumulo.core.zookeeper.ZooUtil.getInstanceIDFromHdfs(ZooUtil.java:51)
         at 
org.apache.accumulo.server.client.HdfsZooInstance._getInstanceID(HdfsZooInstance.java:137)
         at 
org.apache.accumulo.server.client.HdfsZooInstance.getInstanceID(HdfsZooInstance.java:121)
         at 
org.apache.accumulo.server.conf.ServerConfigurationFactory.<init>(ServerConfigurationFactory.java:113)
         at 
org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:2952)
         at 
org.apache.accumulo.tserver.TServerExecutable.execute(TServerExecutable.java:33)
         at org.apache.accumulo.start.Main$1.run(Main.java:93)
         at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: SIMPLE 
authentication is not enabled.  Available:[TOKEN, KERBEROS]
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method)
         at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
         at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
         at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
         at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1965)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1946)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
         at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
         at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
         at 
org.apache.accumulo.core.zookeeper.ZooUtil.getInstanceIDFromHdfs(ZooUtil.java:59)
         ... 8 more
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
 SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
         at org.apache.hadoop.ipc.Client.call(Client.java:1468)
         at org.apache.hadoop.ipc.Client.call(Client.java:1399)
         at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
         at com.sun.proxy.$Proxy22.getListing(Unknown Source)
         at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:554)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
         at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy23.getListing(Unknown Source)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1963)
         ... 16 more

-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 21:39
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

Your confusion is stemming from what stop-all.sh is actually doing (although, I 
still have no idea how stopping processes has any bearing on how they start 
:smile:). Notably, this script will invoke `accumulo admin stopAll` to trigger 
a graceful shutdown before stopping the services hard (`kill`).

So, just as you would run these scripts as the 'accumulo' user without Kerberos, 
you should also be logged in as the 'accumulo' Kerberos user when running them. 
This might be missing from the docs. Please do suggest where some documentation 
should be added to cover this.

If it doesn't go without saying, this is a separate issue from your services 
not logging in correctly. Can you share logs? Try enabling 
-Dsun.security.krb5.debug=true in the appropriate environment variable (for the 
service you want to turn it on for) in accumulo-env.sh and then start the 
services again (hopefully, sharing that too if the problem isn't obvious).
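
For the TabletServer, that could look roughly like the line below in accumulo-env.sh. This is only a sketch: it assumes your accumulo-env.sh uses the stock ACCUMULO_TSERVER_OPTS variable for the tserver JVM options (check your copy) and that the line is placed after wherever that variable is first set. Note the JDK property is spelled "krb5".

```shell
# accumulo-env.sh (sketch): append the JDK Kerberos debug flag to the
# TabletServer JVM options. ACCUMULO_TSERVER_OPTS is assumed to be the variable
# your accumulo-env.sh uses for the tserver; use the matching *_OPTS variable
# for other services.
export ACCUMULO_TSERVER_OPTS="${ACCUMULO_TSERVER_OPTS} -Dsun.security.krb5.debug=true"
```

The extra output should end up in the service's .out file after a restart.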

roman.drap...@baesystems.com wrote:
I want to believe in this, but what I see contradicts this statement..

I do bin/stop-all.sh on master.

If I have a ticket cache for hdfs user, I don't see any errors.
If I don't have a ticket cache for hdfs user, I see these errors.

I can see that all slaves and master successfully logged in as accumulo user.

However slaves are failing straight away due to the error I posted in the 
previous email. I also see this error when I stop the master and I don't have 
a ticket cache for hdfs user, however I don't see it if I have a ticket cache (as 
per above)... It's kind of a reflection of the previous problem with vfs.




-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 21:08
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

Ok, let me repeat: running a `kinit` in your local shell has *no
bearing* on what Accumulo is doing. This is fundamentally not how it works. 
There are libraries in the JDK which perform the login with the KDC using the 
keytab you provide in accumulo-site.xml. Accumulo is not using the ticket cache 
which your `kinit` creates.

</rant>

You should see a message in the log stating that the Kerberos login happened 
(or didn't). The server should exit if it fails to log in (but I don't know if 
I've actively tested that). Do you see this message?
Does it say you successfully logged in (and the principal you logged in as)?

roman.drap...@baesystems.com wrote:
Ok, there is some progress. So these issues were definitely related to VFS 
classloader - now works both on the client and master - so I guess a bug is 
found.

And it looks like there is a very similar issue related to
instance_id

On the slaves (does not matter whether I do kinit hdfs or not) I always receive 
when I start the node:

           2016-01-26 20:36:41,744 [tserver.TabletServer] ERROR: Uncaught exception in TabletServer.main, exiting
           java.lang.RuntimeException: Can't tell if Accumulo is initialized; can't read instance id at hdfs://cr-platform-qa23-01.cyberreveal.local:8020/accumulo/instance_id

On the master I can see the same issue when I do bin/stop-all.sh without kinit 
hdfs and it disappears if I have a hdfs ticket.

I tried both: hadoop fs -chown -R accumulo:hdfs /accumulo and hadoop
fs -chown -R accumulo:accumulo /accumulo - same behavior

Any thoughts please?




-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 20:08
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

The normal classloader (on the local filesystem) which is configured out of the 
box.

roman.drap...@baesystems.com wrote:
Hi Josh,

I can confirm that issue on the master is related to VFS classloader!  
Commented out classloader and now it works without kinit. So it seems it tries 
loading classes before Kerberos authentication happened. What classloader 
should I use instead?

Regards,
Roman

-----Original Message-----
From: roman.drap...@baesystems.com
[mailto:roman.drap...@baesystems.com]
Sent: 26 January 2016 19:43
To: user@accumulo.apache.org
Subject: RE: Accumulo and Kerberos

Hi Josh,

Two quick questions.

1) What should I use instead of HDFS classloader? All examples seem to be from 
hdfs.
2) When is the 1.7.1 release scheduled for (approx.)?

Regards,
Roman

-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 19:01
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

I would strongly recommend that you do not use the HDFS classloader. It is 
known to be very broken in what you download as 1.7.0. There are a number of 
JIRA issues about this which stem from a lack of a released commons-vfs2-2.1.

That being said, I have not done anything with running Accumulo out of HDFS 
with Kerberos enabled. AFAIK, you're in untraveled waters.

re: the renewal bug: When the ticket expires, the Accumulo service
will die. Your options are to deploy a watchdog process that would
restart the service, download the fix from the JIRA case and rebuild
Accumulo yourself, or build 1.7.1-SNAPSHOT from our codebase. I
would recommend using 1.7.1-SNAPSHOT as it should be the least
painful (1.7.1-SNAPSHOT now is likely to not change significantly
from what is ultimately released as 1.7.1)
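
If you do go the watchdog route in the meantime, a minimal sketch could look like the following. It's a stopgap under stated assumptions, not something shipped with Accumulo: restart_if_down is a made-up helper, and the pgrep pattern in the usage comment assumes the tserver's command line contains "org.apache.accumulo.start.Main tserver" (check with ps on your hosts before relying on it).

```shell
# Hypothetical helper: run the given restart command only when no process
# matches the pattern. Usage: restart_if_down <pgrep pattern> <command...>
restart_if_down() {
  pattern="$1"
  shift
  if pgrep -f "$pattern" >/dev/null 2>&1; then
    return 0   # a matching process exists, nothing to do
  fi
  echo "no process matching '$pattern', running: $*" >&2
  "$@"
}

# Example loop (pattern and path are assumptions, adjust for your install):
#   while true; do
#     restart_if_down 'org.apache.accumulo.start.Main tserver' "$ACCUMULO_HOME/bin/start-here.sh"
#     sleep 60
#   done
```

Keeping the pattern inside a script file (rather than typing it on a command line) avoids pgrep matching the watchdog's own shell.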

roman.drap...@baesystems.com wrote:
Hi Josh,

Yes, will do. Just in the meantime - I can see a different issue on slave 
nodes. If I try to start in isolation (bin/start-here.sh) with or without doing 
kinit I always see the error below.

2016-01-26 18:31:13,873 [start.Main] ERROR: Problem initializing
the class loader java.lang.reflect.InvocationTargetException
             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
             at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
             at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
             at java.lang.reflect.Method.invoke(Method.java:606)
             at org.apache.accumulo.start.Main.getClassLoader(Main.java:68)
             at org.apache.accumulo.start.Main.main(Main.java:52)
Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file 
"hdfs://<hostname>/platform/lib/.*.jar".
             at 
org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:1522)
             at 
org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:489)
             at 
org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.resolve(AccumuloVFSClassLoader.java:143)
             at 
org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.resolve(AccumuloVFSClassLoader.java:121)
             at 
org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.getClassLoader(AccumuloVFSClassLoader.java:211)
             ... 6 more
Caused by: org.apache.hadoop.security.AccessControlException:
SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

I guess it might be different to what I observe on the master node. If I don't 
get a ticket explicitly, I get the error mentioned in the previous email. However, 
if I do (and it does not matter for what user I have a ticket now - whether it's 
accumulo, hdfs or hive) - it works. So I started to think, maybe the problem is 
related to some action (for example to vfs as per above) that tries to access 
HDFS before doing a proper authentication with Kerberos? Any ideas?

Also, if we go live with 1.7.0 - what approach would you recommend for renewing 
tickets? Does it require stopping and starting the cluster?

Regards,
Roman



-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 26 January 2016 18:10
To: user@accumulo.apache.org
Subject: Re: Accumulo and Kerberos

Hi Roman,

Accumulo services (TabletServer, Master, etc) all use a keytab to automatically 
obtain a ticket from the KDC when they start up. You do not need to do anything 
with kinit when starting Accumulo.

One worry is ACCUMULO-4069[1] with all presently released versions (most 
notably 1.7.0 which you are using). This is a bug in which services did not 
automatically renew their ticket. We're working on a 1.7.1, but it's not out 
yet.

As for debugging your issue, take a look at the Kerberos section on debugging 
in the user manual [2]. Take a very close look at the principal the service is 
using to obtain the ticket and what the principal is for your keytab. A good 
sanity check is to make sure you can `kinit` in the shell using the keytab and 
the correct principal (rule out the keytab being incorrect).
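
That sanity check could be scripted roughly as below. A sketch only: check_keytab is a made-up name, the keytab path and principal you pass in are whatever you configured in accumulo-site.xml, and it points KRB5CCNAME at a throwaway cache so the test doesn't leave a ticket behind in your normal ticket cache.

```shell
# Hypothetical helper: list the keytab's principals, then try a keytab login
# as the given principal using a temporary ticket cache.
# Usage: check_keytab <keytab file> <principal>
check_keytab() {
  keytab="$1"
  principal="$2"
  # Show the principals stored in the keytab; compare them with
  # general.kerberos.principal in accumulo-site.xml.
  klist -kt "$keytab" || return 1
  cache_dir=$(mktemp -d) || return 1
  if KRB5CCNAME="FILE:$cache_dir/cc" kinit -kt "$keytab" "$principal"; then
    echo "OK: got a ticket for $principal"
    status=0
  else
    echo "FAILED: could not log in as $principal using $keytab" >&2
    status=1
  fi
  rm -rf "$cache_dir"
  return "$status"
}
```

Run it as the same OS user that starts Accumulo, so file permissions on the keytab are exercised too.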

If you still get stuck, collect the output specifying 
-Dsun.security.krb5.debug=true in accumulo-env.sh (per the instructions) and 
try enabling log4j DEBUG on org.apache.hadoop.security.UserGroupInformation.

- Josh

[1] https://issues.apache.org/jira/browse/ACCUMULO-4069
[2]
http://accumulo.apache.org/1.7/accumulo_user_manual.html#_debugging

roman.drap...@baesystems.com wrote:
Hi there,

Trying to setup Accumulo 1.7 on Kerberized cluster. Only
interested in master/tablets to be kerberized (not end-users).
Configured everything as per manual:

1) Created principals

2) Generated glob keytab

3) Modified accumulo-site.xml providing general.kerberos.keytab and
general.kerberos.principal

If I start as accumulo user I get: Caused by: GSSException: No
valid credentials provided (Mechanism level: Failed to find any
Kerberos
tgt)

However, if I give explicitly a token with kinit and keytab
generated above in the shell - it works as expected. To my
understanding Accumulo has to obtain tickets automatically? Or the
idea is to write a cron job and apply kinit to every tablet server per day?

Regards,

Roman

Please consider the environment before printing this email. This
message should be regarded as confidential. If you have received
this email in error please notify the sender and destroy it immediately.
Statements of intent shall only become binding when confirmed in
hard copy by an authorised signatory. The contents of this email
may relate to dealings with other companies under the control of
BAE Systems Applied Intelligence Limited, details of which can be
found at http://www.baesystems.com/Businesses/index.htm.
