[ 
https://issues.apache.org/jira/browse/HDFS-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893236#comment-16893236
 ] 

Istvan Fajth commented on HDFS-13322:
-------------------------------------

To document a lesson learned about this change, I would like to add the 
following information:

fuse-dfs determines the environment of the caller from the caller's pid, which 
the kernel passes to the FUSE code in the FuseContext structure. Using that 
pid, fuse-dfs reads the /proc/(context->pid)/environ file to find the 
KRB5CCNAME environment variable.
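
The lookup described above can be reproduced from a shell to see what fuse-dfs 
would find for a given pid. This is only a minimal sketch, not the fuse-dfs 
code (which does this in C): the helper name initial_krb5ccname and its 
optional second argument (an alternative file to read, handy for trying it out 
without /proc) are my own assumptions for illustration.

```shell
# Print the KRB5CCNAME value recorded in a process's initial environment.
# $1: pid to inspect
# $2: optional file to read instead of /proc/$1/environ (hypothetical, for testing)
initial_krb5ccname() {
    environ_file="${2:-/proc/$1/environ}"
    # environ holds NUL-terminated NAME=value entries; turn them into lines,
    # then print the value of the first KRB5CCNAME entry, if any.
    tr '\0' '\n' < "$environ_file" | sed -n 's/^KRB5CCNAME=//p' | head -n 1
}
```

For example, initial_krb5ccname $$ prints the ticket cache path that the 
current shell was started with, or nothing if the variable was not part of its 
initial environment. Values containing newlines are not handled by this sketch.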

After the change, each connection cache key consists of a (username, 
Kerberos ticket cache path) pair if authentication is set to KERBEROS, so for 
every pair we hold a separate connection built with that ticket cache path, and 
we reuse it later on. With SIMPLE authentication, the ticket cache path in the 
pair is always the \0 character.
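
To illustrate why the pair matters, here is a rough sketch of how such a key 
could be composed. It is not the fuse-dfs implementation: the function name 
connection_cache_key, the tab separator, and the use of an empty string to 
stand in for the \0 placeholder are all assumptions made for this example.

```shell
# Compose a (username, ticket cache path) key as a single tab-separated line.
# $1: authentication method (KERBEROS or SIMPLE)
# $2: username
# $3: ticket cache path (ignored for SIMPLE)
connection_cache_key() {
    auth_method="$1"
    username="$2"
    ticket_cache_path="$3"
    if [ "$auth_method" = "KERBEROS" ]; then
        printf '%s\t%s\n' "$username" "$ticket_cache_path"
    else
        # With SIMPLE authentication the path part is always the same
        # placeholder (an empty string in this sketch).
        printf '%s\t\n' "$username"
    fi
}
```

With KERBEROS, the same user with /tmp/krb5cc_session1 and /tmp/krb5cc_session2 
yields two different keys, hence two connections; with SIMPLE, both calls yield 
the same key.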

For this, with KERBEROS authentication the ticket cache path is read from 
/proc/(context->pid)/environ on every access.

In the Linux [proc file system man 
page|http://man7.org/linux/man-pages/man5/proc.5.html] the following is written 
for /proc/[pid]/environ:
{quote}This file contains the *initial* environment that was set when the 
currently executing program was started via execve(2).{quote}

This can lead to odd behavior when the access does not happen in a new 
process but in the same process that exported the KRB5CCNAME environment 
variable. For example, when the following commands are executed in a shell, 
FUSE will not be able to read the KRB5CCNAME variable from the 
/proc/(context->pid)/environ file:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ echo "foo" > /mnt/hdfs/tmp/foo.txt{code}
This is because echo is a shell builtin, so the write happens in the shell 
itself: context->pid holds the shell's process id, and the 
/proc/(context->pid)/environ file does not contain the KRB5CCNAME environment 
variable, as it was not part of the shell's initial environment.
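
One way to check whether a given shell is affected is to compare the current 
value of a variable with what is recorded in the shell's environ file. The 
following is a minimal sketch under my own assumptions: the helper name 
matches_initial_environ and its optional second argument (a file to read 
instead of /proc/$$/environ) are hypothetical and exist only for illustration.

```shell
# Succeed only if the variable's current value is also present in the
# initial environment recorded for this shell.
# $1: variable name, e.g. KRB5CCNAME
# $2: optional file to read instead of /proc/$$/environ (hypothetical, for testing)
matches_initial_environ() {
    var_name="$1"
    environ_file="${2:-/proc/$$/environ}"
    # Look up the current value of the named variable (empty if unset).
    eval "current_value=\${$var_name-}"
    # environ holds NUL-terminated NAME=value entries; match one exactly.
    tr '\0' '\n' < "$environ_file" | grep -Fqx -- "$var_name=$current_value"
}
```

If matches_initial_environ KRB5CCNAME fails in a shell, accesses made by that 
shell itself (builtins, redirections) will not be seen by fuse-dfs with the 
exported ticket cache path.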

In contrast, the following will work, because cp runs as a new process that 
inherits the environment from the current shell:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ echo "foo" > /tmp/foo.txt
$ cp /tmp/foo.txt /mnt/hdfs/tmp/foo.txt{code}
 

To work around this behavior, the caller has to ensure that the initial 
environment of every accessing process has the correct KRB5CCNAME set. For 
example, the echo example works correctly the following way:
{code:java}
$ export KRB5CCNAME=/tmp/myticketcache
$ /bin/sh
$ echo "foo" > /mnt/hdfs/tmp/foo.txt
$ exit{code}
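
The same idea can be wrapped in a small helper for scripts, so that no 
interactive sub-shell is needed. This is only a sketch: the function name 
with_ticket_cache is hypothetical, and it relies on the standard env utility 
to start the command as a new process with KRB5CCNAME already in its initial 
environment.

```shell
# Run a command as a new process whose initial environment has KRB5CCNAME set.
# $1: ticket cache path
# remaining arguments: the command to run and its arguments
with_ticket_cache() {
    ticket_cache="$1"
    shift
    # env sets the variable and then executes the command, so the variable
    # is part of the environment the new program is started with.
    env KRB5CCNAME="$ticket_cache" "$@"
}
```

For instance, with_ticket_cache /tmp/myticketcache sh -c 'echo "foo" > 
/mnt/hdfs/tmp/foo.txt' performs the write from a freshly started sh, whose 
environ file contains the variable, without changing the calling shell's 
environment.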

> fuse dfs - uid persists when switching between ticket caches
> ------------------------------------------------------------
>
>                 Key: HDFS-13322
>                 URL: https://issues.apache.org/jira/browse/HDFS-13322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fuse-dfs
>    Affects Versions: 2.6.0
>         Environment: Linux xxxxxx.xx.xx.xxx 3.10.0-514.el7.x86_64 #1 SMP Wed 
> Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>  
>            Reporter: Shoeb Sheyx
>            Assignee: Istvan Fajth
>            Priority: Minor
>             Fix For: 3.2.0
>
>         Attachments: HDFS-13322.001.patch, HDFS-13322.002.patch, 
> HDFS-13322.003.patch, TestFuse.java, TestFuse2.java, catter.sh, catter2.sh, 
> perftest_new_behaviour_10k_different_1KB.txt, perftest_new_behaviour_1B.txt, 
> perftest_new_behaviour_1KB.txt, perftest_new_behaviour_1MB.txt, 
> perftest_old_behaviour_10k_different_1KB.txt, perftest_old_behaviour_1B.txt, 
> perftest_old_behaviour_1KB.txt, perftest_old_behaviour_1MB.txt, 
> testHDFS-13322.sh, test_after_patch.out, test_before_patch.out
>
>
> The symptoms of this issue are the same as described in HDFS-3608 except the 
> workaround that was applied (detect changes in UID ticket cache) doesn't 
> resolve the issue when multiple ticket caches are in use by the same user.
> Our use case requires that a job scheduler running as a specific uid obtain 
> separate kerberos sessions per job and that each of these sessions use a 
> separate cache. When switching sessions this way, no change is made to the 
> original ticket cache so the cached filesystem instance doesn't get 
> regenerated.
>  
> {{$ export KRB5CCNAME=/tmp/krb5cc_session1}}
> {{$ kinit user_a@domain}}
> {{$ touch /fuse_mount/tmp/testfile1}}
> {{$ ls -l /fuse_mount/tmp/testfile1}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile1*}}
> {{$ export KRB5CCNAME=/tmp/krb5cc_session2}}
> {{$ kinit user_b@domain}}
> {{$ touch /fuse_mount/tmp/testfile2}}
> {{$ ls -l /fuse_mount/tmp/testfile2}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile2*}}
> {{   }}{color:#d04437}*{{** expected owner to be user_b **}}*{color}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
