[
https://issues.apache.org/jira/browse/PHOENIX-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359190#comment-14359190
]
James Taylor commented on PHOENIX-1287:
---------------------------------------
QueryServices would only be used in this case to store a new config option,
which would then determine whether the old or the new regex implementation is
used. Something like this:
{code}
public interface QueryServices extends SQLCloseable {
    public static final String KEEP_ALIVE_MS_ATTRIB = "phoenix.query.keepAliveMs";
    public static final String THREAD_POOL_SIZE_ATTRIB = "phoenix.query.threadPoolSize";
    public static final String USE_BYTE_BASED_REGEX_ATTRIB = "phoenix.regex.byteBased"; // new config param
    ...
{code}
Then add a default value in QueryServicesOptions like this:
{code}
public class QueryServicesOptions {
    public static final int DEFAULT_KEEP_ALIVE_MS = 60000;
    public static final boolean DEFAULT_USE_BYTE_BASED_REGEX = true; // use byte based by default
{code}
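As a rough, standalone illustration of how the key and its default fit together, here is a minimal sketch that uses java.util.Properties in place of Phoenix's own config classes. The RegexConfig class and its useByteBasedRegex method are hypothetical names made up for this example; only the key string and the default value come from the snippets above.

```java
import java.util.Properties;

// Hypothetical stand-in for the QueryServices / QueryServicesOptions constants;
// not Phoenix code.
final class RegexConfig {
    static final String USE_BYTE_BASED_REGEX_ATTRIB = "phoenix.regex.byteBased";
    static final boolean DEFAULT_USE_BYTE_BASED_REGEX = true;

    private RegexConfig() {}

    // Returns the configured flag, or the default when the key is absent.
    static boolean useByteBasedRegex(Properties props) {
        String raw = props.getProperty(USE_BYTE_BASED_REGEX_ATTRIB);
        if (raw == null) {
            return DEFAULT_USE_BYTE_BASED_REGEX;
        }
        return Boolean.parseBoolean(raw.trim());
    }
}
```

With an empty Properties object the method returns the default (true); setting phoenix.regex.byteBased to "false" switches back to the old implementation.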
Then during query compilation, in ExpressionCompiler we'd have code like this
to determine which expression to instantiate:
{code}
@Override
public Expression visitLeave(LikeParseNode node, List<Expression> children) throws SQLException {
    ...
    // Read the flag from the connection's config, falling back to the default.
    QueryServices services = context.getConnection().getQueryServices();
    boolean useByteBasedRegex = services.getProps().getBoolean(
        QueryServices.USE_BYTE_BASED_REGEX_ATTRIB,
        QueryServicesOptions.DEFAULT_USE_BYTE_BASED_REGEX);
    Expression expression;
    if (useByteBasedRegex) {
        expression = ByteBasedLikeExpression.create(children, node.getLikeType());
    } else {
        expression = LikeExpression.create(children, node.getLikeType());
    }
    ...
{code}
where you implement ByteBasedLikeExpression using the joni byte[] engine, while
the existing LikeExpression keeps using the j.u.regex implementation.
You'd do the same kind of switching logic in RegexpReplaceFunction,
RegexpSubstrFunction, and RegexpSplitFunction. For these, you'd use the
nodeClass attribute of the built-in function annotation to define a factory.
See RoundFunction for an example of this.
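To make the factory idea concrete, here is a minimal standalone sketch of the switching for a regexp-replace style function. It does not reproduce Phoenix's nodeClass mechanism: RegexReplacer, StringBasedRegexReplacer, ByteBasedRegexReplacer and RegexReplacerFactory are all hypothetical names invented for this example, and the byte-based variant is only a placeholder because the joni calls are not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical common contract for the two replace variants.
interface RegexReplacer {
    byte[] replaceAll(byte[] source, byte[] replacement);
}

// String-based variant: decodes to String and delegates to java.util.regex.
final class StringBasedRegexReplacer implements RegexReplacer {
    private final Pattern pattern;

    StringBasedRegexReplacer(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public byte[] replaceAll(byte[] source, byte[] replacement) {
        String src = new String(source, StandardCharsets.UTF_8);
        String rep = new String(replacement, StandardCharsets.UTF_8);
        return pattern.matcher(src).replaceAll(rep).getBytes(StandardCharsets.UTF_8);
    }
}

// Placeholder for the byte[]-driven variant; the joni-backed logic is omitted.
final class ByteBasedRegexReplacer implements RegexReplacer {
    ByteBasedRegexReplacer(String regex) {
        // A real version would compile the pattern with the byte[] engine here.
    }

    @Override
    public byte[] replaceAll(byte[] source, byte[] replacement) {
        throw new UnsupportedOperationException("joni-backed implementation not shown");
    }
}

// Factory that picks the variant from the config flag.
final class RegexReplacerFactory {
    private RegexReplacerFactory() {}

    static RegexReplacer create(boolean useByteBasedRegex, String regex) {
        return useByteBasedRegex
            ? new ByteBasedRegexReplacer(regex)
            : new StringBasedRegexReplacer(regex);
    }
}
```

The point of the sketch is only that callers depend on the shared contract, so the choice between engines stays in one place and is driven by the flag.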
> Use the joni byte[] regex engine in place of j.u.regex
> ------------------------------------------------------
>
> Key: PHOENIX-1287
> URL: https://issues.apache.org/jira/browse/PHOENIX-1287
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Labels: gsoc2015
>
> See HBASE-11907. We'd get a 2x perf benefit plus it's driven off of byte[]
> instead of strings. Thanks for the pointer, [~apurtell].