[ 
https://issues.apache.org/jira/browse/ACCUMULO-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226654#comment-13226654
 ] 

Keith Turner commented on ACCUMULO-164:
---------------------------------------

John commented offline that it may not be possible to determine whether a set 
of patterns matches disjoint sets of column families.  I think this may be 
true for regular expressions.  However, it may be easy to determine this 
automatically with limited wildcarding.

If only prefix wildcards were allowed, it seems like the following algorithm 
would determine whether they are disjoint.

{noformat}
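  // Returns true if no prefix in the set is itself a prefix of another,
  // i.e. no column family can match two of the prefix wildcards.
  // Note: this consumes the set that is passed in.
  boolean isDisjoint(Set<String> prefixes){
     while(prefixes.size() > 1){
       // helper (not shown): removes and returns the shortest string
       String shortestPrefix = removeShortestString(prefixes);
       for(String prefix : prefixes){
         // a longer prefix that extends the shortest one overlaps with it
         if(prefix.startsWith(shortestPrefix)){
           return false;
         }
       }
     }
     return true;
  }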
{noformat}

Does this seem correct? For suffixes, startsWith() would be replaced with 
endsWith().  So maybe we can handle sets made up entirely of prefix wildcards 
or entirely of suffix wildcards.  Can we verify that anything else is 
disjoint?  I do not think so.
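
The prefix and suffix checks described above could be sketched in runnable 
form roughly as follows.  This is only a sketch under some assumptions: the 
class and method names are hypothetical and not part of Accumulo, each 
wildcard is assumed to have already been reduced to its literal part (for 
example "foo" for foo*), and the input set is copied rather than consumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical helper class for illustration; not an Accumulo API.
final class WildcardDisjointness {

    // literals holds the literal part of each prefix wildcard, e.g. "foo" for foo*.
    static boolean prefixesDisjoint(Set<String> literals) {
        return disjoint(literals, (longer, shorter) -> longer.startsWith(shorter));
    }

    // literals holds the literal part of each suffix wildcard, e.g. "bar" for *bar.
    static boolean suffixesDisjoint(Set<String> literals) {
        return disjoint(literals, (longer, shorter) -> longer.endsWith(shorter));
    }

    private static boolean disjoint(Set<String> literals,
                                    BiPredicate<String, String> overlaps) {
        // Work on a copy, shortest first, so the caller's set is left untouched.
        List<String> byLength = new ArrayList<>(literals);
        byLength.sort(Comparator.comparingInt(String::length));
        for (int i = 0; i < byLength.size(); i++) {
            String shorter = byLength.get(i);
            // Compare each literal against every literal of equal or greater length.
            for (int j = i + 1; j < byLength.size(); j++) {
                if (overlaps.test(byLength.get(j), shorter)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With this sketch, an empty literal (a bare * wildcard) is reported as 
overlapping with every other entry, since every string starts and ends with 
the empty string.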

The following wildcards could match overlapping sets; for example, both match 
"ab".

{noformat}
  *a*
  *b*
{noformat}

And so could the following, since both match "foobar".

{noformat}
  foo*
  *bar
{noformat}

So even though the literal parts of the above wildcards are unique, they can 
still match overlapping data. 
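
To make the overlap concrete, the sketch below checks one column family name 
against each pair of wildcards.  The matcher is an assumption made for 
illustration only (a hypothetical class that treats * as "any run of 
characters" and everything else as literal text); it is not how Accumulo 
interprets locality group settings.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class for illustration; not an Accumulo API.
final class WildcardOverlapDemo {

    // Translates a wildcard into a regex: '*' becomes ".*" and every
    // other piece is quoted so it is matched literally.
    static boolean matches(String wildcard, String columnFamily) {
        String regex = Arrays.stream(wildcard.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        return Pattern.matches(regex, columnFamily);
    }

    public static void main(String[] args) {
        // "ab" is matched by both *a* and *b*.
        System.out.println(matches("*a*", "ab") && matches("*b*", "ab"));
        // "foobar" is matched by both foo* and *bar.
        System.out.println(matches("foo*", "foobar") && matches("*bar", "foobar"));
    }
}
```

Both lines print true, so a single column family can fall into two groups 
even though the literal parts of the wildcards differ.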
 


                
> Add support for wildcards/regexes in locality group setting.
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-164
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-164
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, master, tserver
>            Reporter: John Vines
>
> We should look into adding the ability to specify locality group columns as 
> either wildcarding or regexes. I'm unsure of the feasibility of this, hence 
> the lack of fix date.


        
