[ 
https://issues.apache.org/jira/browse/ACCUMULO-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408511#comment-15408511
 ] 

Michael Wall commented on ACCUMULO-2971:
----------------------------------------

Moving this to 1.8.1; it doesn't change any public behavior and can be treated 
as a bug fix.

> ChangeSecret tool should refuse to run if no write access to HDFS
> -----------------------------------------------------------------
>
>                 Key: ACCUMULO-2971
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2971
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Sean Busbey
>            Assignee: Michael Miller
>              Labels: newbie
>             Fix For: 1.8.1
>
>
> Currently, the ChangeSecret tool doesn't do any check to ensure that the user 
> running it has permission to write to /accumulo/instance_id.
> If an admin knows the instance secret but runs the command as a user who 
> cannot write to the instance_id, the result is an unhelpful error message and 
> a disconnect between HDFS and zookeeper.
> Example for a cluster with an instance named "foobar":
> {code}
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r--   3 accumulo accumulo          0 2014-07-02 09:05 
> /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ accumulo org.apache.accumulo.server.util.ChangeSecret
> old zookeeper password: 
> new zookeeper password: 
> Thread "org.apache.accumulo.server.util.ChangeSecret" died Permission denied: 
> user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=busbey, access=WRITE, inode="/accumulo":accumulo:accumulo:drwxr-x--x
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>       at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>       at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1489)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:355)
>       at 
> org.apache.accumulo.server.util.ChangeSecret.updateHdfs(ChangeSecret.java:150)
>       at 
> org.apache.accumulo.server.util.ChangeSecret.main(ChangeSecret.java:66)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.accumulo.start.Main$1.run(Main.java:141)
>       at java.lang.Thread.run(Thread.java:662)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
>  Permission denied: user=busbey, access=WRITE, 
> inode="/accumulo":accumulo:accumulo:drwxr-x--x
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4846)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2911)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1238)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>       at $Proxy16.delete(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:408)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>       at $Proxy17.delete(Unknown Source)
>       at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1487)
>       ... 9 more
> [busbey@edge ~]$ hdfs dfs -ls /accumulo/instance_id
> Found 1 items
> -rw-r--r--   3 accumulo accumulo          0 2014-07-02 09:05 
> /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> [busbey@edge ~]$ zookeeper-client
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> [zk: localhost:2181(CONNECTED) 0] get /accumulo/instances/foobar
> 1528cc95-2600-4649-a50e-1645404e9d6c
> cZxid = 0xe00034f45
> ctime = Wed Jul 02 09:27:58 PDT 2014
> mZxid = 0xe00034f45
> mtime = Wed Jul 02 09:27:58 PDT 2014
> pZxid = 0xe00034f45
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 36
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 1] ls 
> /accumulo/1528cc95-2600-4649-a50e-1645404e9d6c
> [users, monitor, problems, root_tablet, gc, hdfs_reservations, table_locks, 
> namespaces, recovery, fate, tservers, tables, next_file, tracers, config, 
> dead, bulk_failed_copyq, masters]
> [zk: localhost:2181(CONNECTED) 2] ls 
> /accumulo/cb977c77-3e13-4522-b718-2b487d722fd4
> [users, problems, monitor, root_tablet, hdfs_reservations, gc, table_locks, 
> namespaces, recovery, fate, tservers, tables, next_file, tracers, config, 
> masters, bulk_failed_copyq, dead]
> {code}
> What's worse, in this state the cluster will come up and appear healthy as 
> long as the old instance secret is used.
> However, clients and servers will now end up looking at different zookeeper 
> nodes depending on whether they used HDFS to get the instance_id or a ZK 
> instance name lookup, so long as they use the corresponding instance secret.
> Furthermore, if an admin runs the CleanZooKeeper utility after this failure, 
> it will cause the loss of the zookeeper nodes the server processes are 
> looking at.
> The utility should do a sanity check that /accumulo/instance_id is writable 
> prior to changing zookeeper. It should also wait to update the instance name 
> to instance_id pointer in zookeeper until after HDFS has been updated.
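> A minimal sketch of such a pre-check is below; the class and method names are 
> illustrative rather than the actual ChangeSecret internals, and it assumes a 
> Hadoop client that provides FileSystem.access (2.6 or later).
> {code}
> import java.io.IOException;
> 
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.permission.FsAction;
> import org.apache.hadoop.security.AccessControlException;
> 
> public class InstanceIdWriteCheck {
>   // Fail fast, before any zookeeper state is modified, if the current user
>   // cannot write to the instance_id directory in HDFS.
>   static void verifyWritable(FileSystem fs, Path instanceIdDir) throws IOException {
>     try {
>       fs.access(instanceIdDir, FsAction.WRITE);
>     } catch (AccessControlException e) {
>       throw new IOException("ChangeSecret requires write access to "
>           + instanceIdDir + "; re-run as a user that can write to it", e);
>     }
>   }
> 
>   public static void main(String[] args) throws IOException {
>     FileSystem fs = FileSystem.get(new Configuration());
>     verifyWritable(fs, new Path("/accumulo/instance_id"));
>     System.out.println("write access to /accumulo/instance_id OK");
>   }
> }
> {code}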
> Workaround: manually edit the HDFS instance_id to match the new instance id 
> found in zk for the instance name, then proceed as though the secret change 
> had succeeded.
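> For the cluster in the transcript above, that manual repair might look 
> roughly like the following, run as a user with write access to /accumulo 
> (the UUIDs are the ones shown in the transcript, not values to copy):
> {code}
> # create the instance id file that zookeeper now points at
> # (/accumulo/instances/foobar -> 1528cc95-2600-4649-a50e-1645404e9d6c)
> hdfs dfs -touchz /accumulo/instance_id/1528cc95-2600-4649-a50e-1645404e9d6c
> # remove the stale id that ChangeSecret failed to delete
> hdfs dfs -rm /accumulo/instance_id/cb977c77-3e13-4522-b718-2b487d722fd4
> {code}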



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
