[jira] [Commented] (PHOENIX-1287) Use the joni byte[] regex engine in place of j.u.regex

James Taylor (JIRA) Thu, 12 Mar 2015 12:08:56 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359190#comment-14359190
 ]


James Taylor commented on PHOENIX-1287:
---------------------------------------

QueryServices would only be used here to define a new config option, which then 
determines whether the old or the new regex implementation is used. Something 
like this:
{code}
public interface QueryServices extends SQLCloseable {
    public static final String KEEP_ALIVE_MS_ATTRIB = "phoenix.query.keepAliveMs";
    public static final String THREAD_POOL_SIZE_ATTRIB = "phoenix.query.threadPoolSize";
    // new config param
    public static final String USE_BYTE_BASED_REGEX_ATTRIB = "phoenix.regex.byteBased";
...
{code}
Then add a default value in QueryServicesOptions like this:
{code}
public class QueryServicesOptions {
    public static final int DEFAULT_KEEP_ALIVE_MS = 60000;
    // use byte based by default
    public static final boolean DEFAULT_USE_BYTE_BASED_REGEX = true;
{code}
Then during query compilation, in ExpressionCompiler we'd have code like this 
to determine which expression to instantiate:
{code}
    @Override
    public Expression visitLeave(LikeParseNode node, List<Expression> children) throws SQLException {
        ...
        QueryServices services = context.getConnection().getQueryServices();
        // read the option from the connection's query services, falling back to the default
        boolean useByteBasedRegex = services.getProps().getBoolean(
            QueryServices.USE_BYTE_BASED_REGEX_ATTRIB,
            QueryServicesOptions.DEFAULT_USE_BYTE_BASED_REGEX);
        Expression expression;
        if (useByteBasedRegex) {
            expression = ByteBasedLikeExpression.create(children, node.getLikeType());
        } else {
            expression = LikeExpression.create(children, node.getLikeType());
        }
        ...
{code}
where you implement ByteBasedLikeExpression using the joni implementation, while 
LikeExpression keeps the existing j.u.regex implementation. You'd do the same 
kind of switching logic in RegexpReplaceFunction, RegexpSubstrFunction, and 
RegexpSplitFunction. For these, you'd use the nodeClass attribute of the 
@BuiltInFunction annotation to define a factory. See RoundFunction for an 
example of this.
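
To make the factory idea a bit more concrete, here is a rough, self-contained 
sketch of the pattern. It does not use any Phoenix classes: the BuiltIn 
annotation, ParseNode, Expression, the RegexpReplace classes, and compile() are 
all hypothetical stand-ins, and java.util.Properties stands in for the query 
services props. Only the config key and its default come from the proposal 
above. The real annotation and parse node signatures in Phoenix will differ, so 
treat this as an illustration of the mechanism rather than the actual change.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Properties;

// Illustrative stand-ins only; none of these types are the real Phoenix classes.
final class NodeClassFactorySketch {
    // Key and default taken from the proposal above.
    static final String USE_BYTE_BASED_REGEX_ATTRIB = "phoenix.regex.byteBased";
    static final boolean DEFAULT_USE_BYTE_BASED_REGEX = true;

    /** Hypothetical analogue of a built-in function annotation with a nodeClass attribute. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface BuiltIn {
        String name();
        Class<? extends ParseNode> nodeClass();
    }

    interface Expression {
        String describe();
    }

    /** The parse node acts as the factory for the expression. */
    abstract static class ParseNode {
        abstract Expression create(Properties props);
    }

    /** Factory that picks the implementation from the config option. */
    static final class RegexpReplaceParseNode extends ParseNode {
        @Override
        Expression create(Properties props) {
            boolean useByteBasedRegex = Boolean.parseBoolean(props.getProperty(
                USE_BYTE_BASED_REGEX_ATTRIB,
                String.valueOf(DEFAULT_USE_BYTE_BASED_REGEX)));
            return useByteBasedRegex
                ? new ByteBasedRegexpReplace()
                : new StringBasedRegexpReplace();
        }
    }

    @BuiltIn(name = "REGEXP_REPLACE", nodeClass = RegexpReplaceParseNode.class)
    abstract static class RegexpReplace implements Expression {
    }

    static final class ByteBasedRegexpReplace extends RegexpReplace {
        @Override
        public String describe() {
            return "byte-based REGEXP_REPLACE";
        }
    }

    static final class StringBasedRegexpReplace extends RegexpReplace {
        @Override
        public String describe() {
            return "string-based REGEXP_REPLACE";
        }
    }

    /** Reads the annotation, instantiates the named factory, and lets it build the expression. */
    static Expression compile(Class<?> functionClass, Properties props) {
        BuiltIn meta = functionClass.getAnnotation(BuiltIn.class);
        try {
            ParseNode node = meta.nodeClass().getDeclaredConstructor().newInstance();
            return node.create(props);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + meta.nodeClass(), e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(compile(RegexpReplace.class, props).describe());
        props.setProperty(USE_BYTE_BASED_REGEX_ATTRIB, "false");
        System.out.println(compile(RegexpReplace.class, props).describe());
    }
}
```

With no property set, the sketch falls back to the byte-based default; setting 
phoenix.regex.byteBased to false selects the string-based class. Keeping the 
decision in the factory means the function classes themselves stay free of 
config lookups.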



> Use the joni byte[] regex engine in place of j.u.regex
> ------------------------------------------------------
>
>                 Key: PHOENIX-1287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1287
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>              Labels: gsoc2015
>
> See HBASE-11907. We'd get a 2x perf benefit plus it's driven off of byte[] 
> instead of strings. Thanks for the pointer, [~apurtell].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
