[ 
https://issues.apache.org/jira/browse/HADOOP-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657348#comment-13657348
 ] 

Todd Lipcon commented on HADOOP-9439:
-------------------------------------

{code}
+  static String empty[] = new String[0];
{code}

Should probably be static final, right? And rename to {{NO_GROUPS}} or 
{{EMPTY_STIRNG_ARRAY}} or something a little more descriptive

----

{code}
+  native static String[] getGroupsForUser(String username, String empty[],
+        boolean reentrant);
{code}

I don't get why {{empty}} is a parameter here... aside from it being a 
premature optimization (is allocating an empty array that bad?), it seems like 
the JNI code could just grab this field itself and return a reference, rather 
than taking it as a parameter.

----
{code}
+      groups = getGroupsForUser(user, empty, useReentrant );
{code}
Style nit: extra space here before ')'

----
{code}
+      snprintf(buf, sizeof(buf), "getgrouplist error %d", ret);
+      THROW(env, "java/lang/RuntimeException", buf);
{code}

Can we use strerror here on the return, assuming it's probably a standard 
errnum?

----
{code}
+    ret = hadoop_group_info_fetch(ginfo, uinfo->gids[i]);
+    if (!ret) {
{code}
Here if the group info lookup fails for a particular gid, you end up swallowing 
the error without any warnings, etc. Maybe instead we should just return the 
numeric gid as a group? This is what the 'groups' shell command does:

{code}
todd@todd-w510:~$ groups todd
todd : groups: cannot find name for group ID 1050
1050 adm dialout cdrom audio video plugdev lpadmin admin sambashare mrtest 
wireshark
{code}

----
- I noticed you use different locks for getgrnam and getpwnam. But when I was 
working on HADOOP-7156, I found that the buggy implementations were racey in 
other parts of their code -- ie a concurrent getgrnam and a concurrent getpwnam 
might race with eachother and cause either to fail. I know that you added this 
function to deal with a crash we saw on a production cluster - did you verify 
that with the same buggy underlying implementation that the workaround prevents 
the issue?

We may have to actually share this lock and configuration with the lock used in 
NativeIO.c.

Additionally, when I did that patch, I found it simpler to continue to use the 
reentrant functions wrapped with a lock -- less repeated code, etc. 
                
> JniBasedUnixGroupsMapping: fix some crash bugs
> ----------------------------------------------
>
>                 Key: HADOOP-9439
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9439
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>    Affects Versions: 2.0.4-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HADOOP-9439.001.patch, HDFS-4640.002.patch
>
>
> JniBasedUnixGroupsMapping has some issues.
> * sometimes on error paths variables are freed prior to being initialized
> * re-allocate buffers less frequently (can reuse the same buffer for multiple 
> calls to getgrnam)
> * allow non-reentrant functions to be used, to work around client bugs
> * don't throw IOException from JNI functions if the JNI functions do not 
> declare this checked exception.
> * don't bail out if only one group name among all the ones associated with a 
> user can't be looked up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to