Bruce Momjian <[EMAIL PROTECTED]> writes:
> Can this improvement get merged up into CVS current, or did you already
> do that Tom?
It's irrelevant to current.
regards, tom lane
---(end of broadcast)---
TIP 2: you can get off
Can this improvement get merged up into CVS current, or did you already
do that Tom?
---
Tatsuo Ishii wrote:
> > Nice work, Tatsuo! Wade, can you confirm that this patch solves your
> > problem?
> >
> > Tatsuo, please commi
Tom Lane wrote on Wed, 05.02.2003 at 08:12:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > Another idea is to make special regex type and store the regexes
> > pre-parsed (i.e. in some fast-load form) ?
>
> Seems unlikely that going out to disk could beat just recompiling the
> regexp.
We have
> Nice work, Tatsuo! Wade, can you confirm that this patch solves your
> problem?
>
> Tatsuo, please commit into REL7_3 branch only --- I'm nearly ready to do
> a wholesale replacement of the regex code in HEAD, so you wouldn't
> accomplish much except to create a merge problem for me ...
Ok. I h
Confirmed. Looks like a 100-fold increase. Thanks, guys.
Explain output can be seen here:
http://arch.wavefire.com/pgregex.txt
-Wade Klaver
At 09:59 AM 2/5/03 -0500, Tom Lane wrote:
>Tatsuo Ishii <[EMAIL PROTECTED]> writes:
>> Ok. The original complaint can be easily solved at least for single
>>
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> Ok. The original complaint can be easily solved at least for single
> byte encoding databases. With the small patches (against 7.3.1)
> included, I got the following result.
Nice work, Tatsuo! Wade, can you confirm that this patch solves your
problem?
Tatsuo,
Hannu Krosing <[EMAIL PROTECTED]> writes:
> Another idea is to make special regex type and store the regexes
> pre-parsed (i.e. in some fast-load form) ?
Seems unlikely that going out to disk could beat just recompiling the
regexp. They're not *that* slow to compile ... at least not when we
avoid
Tom Lane wrote on Wed, 05.02.2003 at 01:35:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > Speaking of which, is there (or should there be) some mechanism for
> > increasing the size of the compiled pattern cache? Perhaps a GUC var?
>
> I thought about that while I was messing with the code, but I
Ok. The original complaint can be easily solved at least for single
byte encoding databases. With the small patches (against 7.3.1)
included, I got the following result.
test1:
select count(*) from tenk1 where 'quotidian' ~ string4;
 count
-------
     0
(1 row)
Time: 113.81 ms
test2:
select count(*)
Neil Conway <[EMAIL PROTECTED]> writes:
> Speaking of which, is there (or should there be) some mechanism for
> increasing the size of the compiled pattern cache? Perhaps a GUC var?
I thought about that while I was messing with the code, but I don't
think there's much point in it, unless someone w
On Tue, 2003-02-04 at 17:26, Tom Lane wrote:
> Proof of concept:
> [...]
Very cool work, Tom.
> In the first case there are only four distinct patterns used, so we're
> running with cached precompiled regexes. In the other cases a new regex
> compilation must occur at each row.
Speaking of whic
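The caching behaviour quoted above (a handful of distinct patterns stay cached, while a different pattern on every row forces a recompilation each time) can be sketched roughly as follows. This is only an illustration in Python, not the PostgreSQL C code: the class name `CompiledPatternCache`, the helper `count_matches`, and the pattern strings are all hypothetical, and the capacity of 32 is an assumption borrowed from the "top 32 patterns" question elsewhere in this thread.

```python
import re
from collections import OrderedDict


class CompiledPatternCache:
    """Tiny LRU cache of compiled regexes (illustrative sketch only)."""

    def __init__(self, capacity=32):  # 32 is an assumed size, see lead-in
        self.capacity = capacity
        self._cache = OrderedDict()   # pattern text -> compiled regex
        self.compilations = 0         # counts cache misses (calls to re.compile)

    def get(self, pattern):
        if pattern in self._cache:
            # Cache hit: mark the entry as most recently used.
            self._cache.move_to_end(pattern)
            return self._cache[pattern]
        # Cache miss: compile, remember, and evict the least recently used.
        compiled = re.compile(pattern)
        self.compilations += 1
        self._cache[pattern] = compiled
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
        return compiled


def count_matches(cache, text, patterns):
    """Count rows whose pattern matches text, like: text ~ pattern_column."""
    return sum(1 for p in patterns if cache.get(p).search(text))


if __name__ == "__main__":
    few = CompiledPatternCache()
    count_matches(few, "quotidian", ["a1", "b2", "c3", "d4"] * 2500)
    print("four distinct patterns, misses:", few.compilations)

    many = CompiledPatternCache()
    count_matches(many, "quotidian", [f"w{i}" for i in range(10000)])
    print("all-distinct patterns, misses:", many.compilations)
```

With only four distinct patterns the miss counter stays at four no matter how many rows are scanned; with a distinct pattern per row every lookup is a miss, which is the case where the per-compilation cost dominates the query time.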
Hannu Krosing <[EMAIL PROTECTED]> writes:
> Tom Lane wrote on Tue, 04.02.2003 at 21:18:
>> What advantages does it have to make it worth considering?
> Should be the same as pcre + support for wide chars.
Well, if someone wants to do the legwork to try it, that interface
should work just about co
Proof of concept:
PG 7.3 using regression database:
regression=# select count(*) from tenk1 where 'quotidian' ~ string4;
 count
-------
     0
(1 row)
Time: 676.14 ms
regression=# select count(*) from tenk1 where 'quotidian' ~ stringu1;
 count
-------
     0
(1 row)
Time: 3426.96 ms
regression=
Tom Lane wrote on Tue, 04.02.2003 at 21:18:
> Hannu Krosing <[EMAIL PROTECTED]> writes:
> > If we are going into code-lifting business, we should also consider
> > Python's sre
>
> What advantages does it have to make it worth considering?
Should be the same as pcre + support for wide chars.
--
On Tue, 2003-02-04 at 13:21, Tom Lane wrote:
> After some further research, pcre does seem like an interesting
> alternative. Both pcre and Spencer's new code have essentially
> Berkeley-style licenses, so there's no problem there.
Keep in mind that pcre has an advertising clause in its license
(
Hannu Krosing <[EMAIL PROTECTED]> writes:
> If we are going into code-lifting business, we should also consider
> Python's sre
What advantages does it have to make it worth considering?
regards, tom lane
On Tue, 2003-02-04 at 18:21, Tom Lane wrote:
> 4. pcre looks like it's probably *not* as well suited to a multibyte
> environment. In particular, I doubt that its UTF8 compile option was
> even turned on for the performance comparison Neil cited --- and the man
> page only promises "experimental,
> > It would be a delight to be able to use more advanced (IMHO) Perl-
> > compatible regexes in PostgreSQL.
>
> After some further research, pcre does seem like an interesting
> alternative. Both pcre and Spencer's new code have essentially
> Berkeley-style licenses, so there's no problem there.
Jon Jensen <[EMAIL PROTECTED]> writes:
> It would be a delight to be able to use more advanced (IMHO) Perl-
> compatible regexes in PostgreSQL.
After some further research, pcre does seem like an interesting
alternative. Both pcre and Spencer's new code have essentially
Berkeley-style licenses, s
On Tue, 4 Feb 2003, Neil Conway wrote:
> Spencer's implementation is outperformed by some other RE engines,
> notably PCRE (www.pcre.org). But switching to another engine might
> impose backward-compatibility problems, in terms of the details of the
> RE syntax.
It would be a delight to be able t
Neil Conway <[EMAIL PROTECTED]> writes:
> Sounds like we had about the same idea at about the same time -- I
> emailed Henry Spencer inquiring about the new RE engine last night.
I just did that this morning ;-) ... but more as politeness than
anything else. AFAICT from searching the net, packagi
On Tue, 2003-02-04 at 17:15, Neil Conway wrote:
> On Tue, 2003-02-04 at 11:59, Tom Lane wrote:
> > I'm about to go off and look at whether we can absorb the Tcl regex
> > package, which is Spencer's new baby. That will not be a solution for
> > 7.3.anything, but it could be an answer for 7.4.
>
>
On Tue, 2003-02-04 at 16:59, Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > Given that this problem isn't a regression, I don't think we need to
> > delay 7.3.2 to fix it (of course, a fix for 7.3.3 and 7.4 is essential,
> > IMHO).
>
> No, I've had to abandon my original thought tha
On Tue, 2003-02-04 at 11:59, Tom Lane wrote:
> I'm about to go off and look at whether we can absorb the Tcl regex
> package, which is Spencer's new baby. That will not be a solution for
> 7.3.anything, but it could be an answer for 7.4.
Sounds like we had about the same idea at about the same ti
Neil Conway <[EMAIL PROTECTED]> writes:
> Given that this problem isn't a regression, I don't think we need to
> delay 7.3.2 to fix it (of course, a fix for 7.3.3 and 7.4 is essential,
> IMHO).
No, I've had to abandon my original thought that it was a localized bug,
so it's not going to be fixed i
On Tue, 2003-02-04 at 11:24, wade wrote:
> I redid my trials with the same data set on 7.2.3 --with-multibyte and I
> get the same brutal performance hit, so it is definitely a
> multibyte-specific problem.
Given that this problem isn't a regression, I don't think we need to
delay 7.3.2 to fix i
wade <[EMAIL PROTECTED]> writes:
> I redid my trials with the same data set on 7.2.3 --with-multibyte and I
> get the same brutal performance hit, so it is definitely a
> multibyte-specific problem.
>
> There are only about 1000 words that appear more than once (2 or 3 times)
> in 27k rows.
Righ
OK,
I redid my trials with the same data set on 7.2.3 --with-multibyte and I
get the same brutal performance hit, so it is definitely a
multibyte-specific problem.
WRT the distribution of the data in the table, I used the following:
All g-words in /usr/share/dict with different processes attac
Next question: may I guess that you weren't using MULTIBYTE in 7.2?
After still more digging, I'm coming round to the opinion that the
problem is that MULTIBYTE is forced on in 7.3, and this imposes a
factor-of-256 overhead in a bunch of the operations in regcomp.c.
In particular, compiling a case
Wade, how many distinct patterns do you have in that table? What's the
population distribution (in particular, do the top 32 patterns account
for most of the table)?
It's looking like the issue is not so much that the 7.3 code is
completely broken, as that its LRU replacement policy for precompil
Well, IMHO I would rather see a delay of the roll-out by a day or two
than see a release with such a serious performance glitch. Especially
since I personally have been shooting my big mouth off to all my geek
friends on the leaps and bounds PG has made in the last few releases. With
my luck on
Sigh. It seems that somebody broke caching of compiled regexes,
so that your regex is recompiled each time it's used. I haven't
dug into the logic yet, but I think it must have been a mistake
in Thomas' change to make the regex cache be searched circularly:
2002-06-14 22:49 thomas
* sr
At 05:51 PM 2/3/03 -0500, Tom Lane wrote:
>wade <[EMAIL PROTECTED]> writes:
>> Here is the profile information. I included a log of the session that
>> generated it at the top of the gprof output. If there is any other info I
>> can help you with, please let me know.
>
>A four-second test isn't
wade <[EMAIL PROTECTED]> writes:
> Here is the profile information. I included a log of the session that
> generated it at the top of the gprof output. If there is any other info I
> can help you with, please let me know.
A four-second test isn't long enough to gather any statistically
meaningf
At 10:52 PM 1/31/03 -0500, Tom Lane wrote:
>wade <[EMAIL PROTECTED]> writes:
>> We recently upgraded a project from 7.2 to 7.3.1 to make use of some of
>> the cool new features in 7.3. The installed version is CVS stable from
>> yesterday. However, we noticed a major performance hit in POSIX re
At 08:31 PM 2/1/03 +0800, Christopher Kings-Lynne wrote:
>Why on earth are you using a CVS version!?!?!?!
>
>Chris
>
This problem manifests itself under 7.3.1 release as well. CVS is used so
we can access patches to the SRF stuff implemented after 7.3.1 was released.
Tom... any links that documen
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> Why on earth are you using a CVS version!?!?!?!
I assume he meant tip of REL7_3 branch --- which is a perfectly
reasonable thing to install, even if there are still a few fixes
to go before we call it 7.3.2.
regards, to
Why on earth are you using a CVS version!?!?!?!
Chris
On Fri, 31 Jan 2003, wade wrote:
> Hello,
> We recently upgraded a project from 7.2 to 7.3.1 to make use of some of
> the cool new features in 7.3. The installed version is CVS stable from
> yesterday. However, we noticed a major performa
wade <[EMAIL PROTECTED]> writes:
> We recently upgraded a project from 7.2 to 7.3.1 to make use of some of
> the cool new features in 7.3. The installed version is CVS stable from
> yesterday. However, we noticed a major performance hit in POSIX regular
> expression matches against columns usin
Hello,
We recently upgraded a project from 7.2 to 7.3.1 to make use of some of
the cool new features in 7.3. The installed version is CVS stable from
yesterday. However, we noticed a major performance hit in POSIX regular
expression matches against columns using the ~* operator.
http://arch.w