LIKE operator: make its performance insensitive to the length of string when
pattern is immediatelly founded
------------------------------------------------------------------------------------------------------------
Key: CORE-4306
URL: http://tracker.firebirdsql.org/browse/CORE-4306
Project: Firebird Core
Issue Type: Improvement
Components: Engine
Affects Versions: 3.0 Alpha 1
Reporter: Pavel Zotov
Priority: Minor
It was found that performance of LIKE operator becomes very poor in case when
string have big length - e.g. 32K - though the character that matches to
pattern is just at the beginning of this string.
TEST:
=====
set stat on;
set term ^;
execute block returns(cnt int) as
declare n int = 10000000;
declare s varchar(32760);
begin
cnt=0;
s=rpad('q',32760,' '); -- string has huge length but BEGINS with character `q`
while (n>0) do begin
cnt = cnt + iif( s like '%q%', 1, 0); -- such pattern should be found
immediatelly
n=n-1;
end
suspend;
end^
set term ;^
set stat off;
output:
======
Current memory = 2450596208
Delta memory = 390320
Max memory = 2450628912
Elapsed time= 34.18 sec -- <<<<<<<<<<<<<<<<<<<<< :(((
Cpu = 0.00 sec
Buffers = 524288
Reads = 16
Writes = 0
Fetches = 37
SQL> set term ;^
SQL> set stat off;
Compare with:
============
set stat on;
set term ^;
execute block returns(cnt int) as
declare n int = 10000000;
declare s varchar(50);
begin
cnt=0;
s=rpad('q',50,' '); -- same string, but of small length
while (n>0) do begin
cnt = cnt + iif( s like '%q%', 1, 0);
n=n-1;
end
suspend;
end^
set term ;^
set stat off;
output:
=====
Current memory = 2450530424
Delta memory = -65784
Max memory = 2450628912
Elapsed time= 6.68 sec -- <<<<<<<<<<<<<<<<< much better
Cpu = 0.00 sec
Buffers = 524288
Reads = 0
Writes = 0
Fetches = 2
PS. compare with Java, (only FYI):
==================
import static java.lang.System.*;
import java.util.regex.*;
class MatchesCount {
public static void main(String[] args) {
int pad = args.length > 0 ? Integer.parseInt(args[0]) : 32760;
String s = String.format("%1$-" + pad + "s", "q");
//out.println("s=>"+s+"<");
Pattern p = Pattern.compile(".*?q.*?");
Matcher m;
int cnt=0;
long t0 = currentTimeMillis();
for(int i=0; i < 10000000; i++) {
m=p.matcher(s);
cnt += m.find() ? 1 : 0;
}
out.println("s.length()="+s.length()+", cnt="+cnt+", done at "+(
currentTimeMillis() - t0 )+" ms");
}
}
====
D:\JAVA>java MatchesCount 10
s.length()=10, cnt=10000000, done at 5562 ms
D:\JAVA>java MatchesCount
s.length()=32760, cnt=10000000, done at 5860 ms
D:\JAVA>java MatchesCount 256000
s.length()=256000, cnt=10000000, done at 5594 ms
D:\JAVA>java MatchesCount 512000
s.length()=512000, cnt=10000000, done at 5735 ms
As we can see, Java has regexp engine that is NO sensitive to the length of
source string (in case when pattern is founded immediatelly).
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT
organizations don't have a clear picture of how application performance
affects their revenue. With AppDynamics, you get 100% visibility into your
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel