[ https://issues.apache.org/jira/browse/HADOOP-17079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272430#comment-17272430 ]
Xiaoyu Yao commented on HADOOP-17079:
-------------------------------------

Thanks [~daryn] for the comments. Here are my thoughts on adding a new method for GroupCacheLoader#getGroupsSet.

Many GroupMappingServiceProvider implementations already use a Set internally (e.g., LdapGroupsMapping#lookupGroup) or take an additional step to dedup the list (e.g., ShellBasedUnixGroupsMapping). It is expensive to convert back and forth between Set and List with the existing list-based getGroups() method in the GroupMappingServiceProvider interface.

Can you elaborate on the proposal to change GroupCacheLoader#load? Can we avoid the two conversions: Set -> List (GroupMappingServiceProvider impl) and List -> Set (GroupCacheLoader)?

> Optimize UGI#getGroups by adding UGI#getGroupsSet
> -------------------------------------------------
>
>                 Key: HADOOP-17079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17079
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: HADOOP-17079.002.patch, HADOOP-17079.003.patch,
> HADOOP-17079.004.patch, HADOOP-17079.005.patch, HADOOP-17079.006.patch,
> HADOOP-17079.007.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> UGI#getGroups has been optimized with HADOOP-13442 by avoiding the
> List->Set->List conversion. However, the returned list is not optimized for
> contains lookups, especially when the user's group membership list is huge
> (thousands+). This ticket is opened to add UGI#getGroupsSet and use
> Set#contains() instead of List#contains() to speed up large group lookups
> while minimizing List->Set conversions in the Groups#getGroups() call.
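
The two conversions questioned in the comment above can be illustrated with a minimal Java sketch. It is not Hadoop code: the class GroupLookupSketch, its methods, and the sample group names are hypothetical stand-ins, assuming only that a provider already holds a user's groups as a de-duplicated Set, as the comment says several implementations do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class GroupLookupSketch {

    // Assumed provider internals: groups are already held as a de-duplicated Set.
    static Set<String> lookupGroups(String user) {
        return new LinkedHashSet<>(Arrays.asList("staff", "hadoop", "admins"));
    }

    // List-based path: the provider copies Set -> List to satisfy a list-returning
    // API, and the caller then copies List -> Set to get fast contains().
    static Set<String> groupsViaList(String user) {
        List<String> asList = new ArrayList<>(lookupGroups(user)); // conversion 1
        return new LinkedHashSet<>(asList);                        // conversion 2
    }

    // Set-based path: the provider's Set is handed through with no extra copies.
    static Set<String> groupsViaSet(String user) {
        return lookupGroups(user);
    }

    public static void main(String[] args) {
        Set<String> viaList = groupsViaList("alice");
        Set<String> viaSet = groupsViaSet("alice");
        // Both paths yield the same membership; only the copying differs.
        System.out.println(viaList.equals(viaSet));
        System.out.println(viaSet.contains("hadoop"));
    }
}
```

Both paths return the same membership; the list-based path simply pays for two extra copies of the whole collection, which is the cost the comment asks about avoiding for users with thousands of groups.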