>From Dmitry Lychagin <[email protected]>:

Dmitry Lychagin has posted comments on this change. ( 
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243 )

Change subject: [ASTERIXDB-2680][FUN] Add support to regexp_matches() and 
regexp_split()
......................................................................


Patch Set 3:

(4 comments)

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/string/regexp_split/003/regexp_split.000.query.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/string/regexp_split/003/regexp_split.000.query.sqlpp:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/string/regexp_split/003/regexp_split.000.query.sqlpp@21
PS3, Line 21: regexp_split("C:\\Program Files\\couchbase\\server\\bin", 
"[\\\\]"),
replace "couchbase" with something else in this and next line. (AsterixDB, for 
example)


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpMatchesDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpMatchesDescriptor.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpMatchesDescriptor.java@82
PS3, Line 82:                             for (String match : matches) {
This is the find()/group() loop, so we can just do: while (matcher.find()) { 
String match = matcher.group(); ... }


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpSplitDescriptor.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpSplitDescriptor.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/StringRegExpSplitDescriptor.java@71
PS3, Line 71:                         String[] splits = 
srcPtr.toString().split(patternPtr.toString());
At some point we should reimplement it using RegExpMatcher. String.split() 
compiles regex pattern and creates a return array for each invocation. 
RegExpMatcher reuses the Matcher as long as the pattern remains the same. We 
could save on object construction if we do our own implementation. Please file 
a bug for tracking this optimization you'd rather do it in a separate change.


https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/utils/RegExpMatcher.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/utils/RegExpMatcher.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243/3/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/utils/RegExpMatcher.java@213
PS3, Line 213:     public List<String> getMatches() {
we already have a public find() method, let's just introduce group() method 
that'd return matcher.group(). then the caller can do find()/group() loop.Then 
we can also remove allMatches field



--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/4243
To unsubscribe, or for help writing mail filters, visit 
https://asterix-gerrit.ics.uci.edu/settings

Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Iccf5ba14f5c8b8cf4bcd6dd6e412bb515d68dd74
Gerrit-Change-Number: 4243
Gerrit-PatchSet: 3
Gerrit-Owner: Hussain Towaileb <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Dmitry Lychagin <[email protected]>
Gerrit-Reviewer: Hussain Towaileb <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Comment-Date: Tue, 03 Dec 2019 23:38:04 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Gerrit-MessageType: comment

Reply via email to