Jan Urbański <[EMAIL PROTECTED]> writes:
> Attached is a version that stores the minimal and maximal frequencies in
> the Numbers array, has the aforementioned assertion and more nicely
> ordered functions in ts_selfuncs.c.
Applied with some small corrections.
Tom Lane wrote:
Jan Urbański <[EMAIL PROTECTED]> writes:
[EMAIL PROTECTED] wrote:
Well whaddya know. It turned out that my new company has a
'Fridays-are-for-any-opensource-hacking-you-like' policy, so I got a
full day to work on the patch.
Hm, does their name start with
Jan Urbański <[EMAIL PROTECTED]> writes:
> [EMAIL PROTECTED] wrote:
> Well whaddya know. It turned out that my new company has a
> 'Fridays-are-for-any-opensource-hacking-you-like' policy, so I got a
> full day to work on the patch.
Hm, does their name start with G?
> Attach
[EMAIL PROTECTED] wrote:
Quoting Tom Lane <[EMAIL PROTECTED]>:
I wrote:
... One possibly
performance-relevant point is to use DatumGetTextPP for detoasting;
you've already paid the costs by using VARDATA_ANY etc, so you might
as well get the benefit.
Actually, wait a second. That code does
Quoting Tom Lane <[EMAIL PROTECTED]>:
I wrote:
... One possibly
performance-relevant point is to use DatumGetTextPP for detoasting;
you've already paid the costs by using VARDATA_ANY etc, so you might
as well get the benefit.
Actually, wait a second. That code doesn't work at all on toasted
I wrote:
> ... One possibly
> performance-relevant point is to use DatumGetTextPP for detoasting;
> you've already paid the costs by using VARDATA_ANY etc, so you might
> as well get the benefit.
Actually, wait a second. That code doesn't work at all on toasted data,
because it's trying to use V
Jan Urbański <[EMAIL PROTECTED]> writes:
> Pre-sorting introduced one problem (see XXX in code): it's not easy
> anymore to get the minimal frequency of MCELEM values. I was using it to
> assert that the selectivity of a tsquery node containing a lexeme not in
> MCELEM is no
On Tue, 2008-08-26 at 12:45 +0200, Jan Urbański wrote:
> > put it in a file called selfuncs_ts.c so it is similar to the existing
> > filename?
>
> I followed the pattern of ts_parse.c, ts_utils.c and so on.
> Also, I see geo_selfuncs.c. No big deal, though, I can move it.
No don't worry. You'r
Jan Urbański wrote:
Tom Lane wrote:
Jan Urbański <[EMAIL PROTECTED]> writes:
Simon Riggs wrote:
put it in a file called selfuncs_ts.c so it is similar to the existing
filename?
I followed the pattern of ts_parse.c, ts_utils.c and so on.
Also, I see geo_selfuncs.c. No big
Tom Lane wrote:
Jan Urbański <[EMAIL PROTECTED]> writes:
Simon Riggs wrote:
put it in a file called selfuncs_ts.c so it is similar to the existing
filename?
I followed the pattern of ts_parse.c, ts_utils.c and so on.
Also, I see geo_selfuncs.c. No big deal, though, I can
Jan Urbański <[EMAIL PROTECTED]> writes:
> Simon Riggs wrote:
>> put it in a file called selfuncs_ts.c so it is similar to the existing
>> filename?
> I followed the pattern of ts_parse.c, ts_utils.c and so on.
> Also, I see geo_selfuncs.c. No big deal, though, I can move it.
Simon Riggs wrote:
On Thu, 2008-08-14 at 22:27 +0200, Jan Urbański wrote:
Jan Urbański wrote:
+ * ts_selfuncs.c
Not sure why this is in its own file
I couldn't decide where to put it, so I came up with this.
put it in a file called selfuncs_ts.c so it is similar to the existing
filenam
On Thu, 2008-08-14 at 22:27 +0200, Jan Urbański wrote:
> Jan Urbański wrote:
> + * ts_selfuncs.c
Not sure why this is in its own file, but if it must be could we please
put it in a file called selfuncs_ts.c so it is similar to the existing
filename?
--
Simon Riggs www.2ndQuadrant.c
Jan Urbański wrote:
> Yeah, I got that idea, but then I thought the chances of touching the
> same element during binary search twice were very small. Especially now
> when the detoasting occurs only when we hit a text Datum that has the
> same length as the sought lexeme.
> Still, I can do
Alvaro Herrera wrote:
Jan Urbański wrote:
Heikki Linnakangas wrote:
Sounds like a plan. In (2), it's even better to detoast the values
lazily. For a typical one-word tsquery, the binary search will only
look at a small portion of the elements.
Hm, how can I do that? Toast is still a bit bla
Jan Urbański wrote:
> Heikki Linnakangas wrote:
>> Sounds like a plan. In (2), it's even better to detoast the values
>> lazily. For a typical one-word tsquery, the binary search will only
>> look at a small portion of the elements.
>
> Hm, how can I do that? Toast is still a bit black magic to
Jan Urbański wrote:
Heikki Linnakangas wrote:
Jan Urbański wrote:
So right now the idea is to:
(1) pre-sort STATISTIC_KIND_MCELEM values
(2) build an array of pointers to detoasted values in tssel()
(3) use binary search when looking for MCELEMs during tsquery analysis
Sounds like a plan.
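Steps (2) and (3) of the plan, combined with the lazy variant suggested elsewhere in the thread, could look roughly like the sketch below. It is not the patch's code: LazyText, lazy_get, lazy_find and the unpack_calls counter are hypothetical names, "unpacking" stands in for detoasting, and the array is assumed to be pre-sorted by length first and then by bytes, so that an element is only unpacked when its length matches the sought lexeme.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stored (possibly toasted) text value. */
typedef struct
{
    const char *packed;     /* pretend this is the compressed form */
    size_t      len;        /* length, assumed known without unpacking */
    const char *unpacked;   /* NULL until the value is first needed */
} LazyText;

/* Counts how often the expensive step actually runs. */
static int unpack_calls = 0;

/* Unpack on first use and remember the result for later probes. */
static const char *
lazy_get(LazyText *t)
{
    if (t->unpacked == NULL)
    {
        unpack_calls++;
        t->unpacked = t->packed;    /* a real system would decompress here */
    }
    return t->unpacked;
}

/*
 * Binary search in an array sorted by (length, bytes).
 * Returns the index of the match, or -1 if the lexeme is absent.
 */
static int
lazy_find(LazyText *arr, int n, const char *lex, size_t len)
{
    int lo = 0;
    int hi = n - 1;

    assert(n >= 0);
    while (lo <= hi)
    {
        int       mid = lo + (hi - lo) / 2;
        LazyText *t = &arr[mid];
        int       c;

        if (len != t->len)
            c = (len < t->len) ? -1 : 1;        /* no unpacking needed */
        else
            c = memcmp(lex, lazy_get(t), len);  /* unpack only on equal length */

        if (c == 0)
            return mid;
        if (c < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;
}
```

With five sorted entries, a lookup touches at most three of them, and only those with a matching length are ever unpacked; the cache means a second lookup that probes the same element pays nothing extra.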
Heikki Linnakangas wrote:
Jan Urbański wrote:
So right now the idea is to:
(1) pre-sort STATISTIC_KIND_MCELEM values
(2) build an array of pointers to detoasted values in tssel()
(3) use binary search when looking for MCELEMs during tsquery analysis
Sounds like a plan. In (2), it's even bet
Jan Urbański <[EMAIL PROTECTED]> writes:
> Heikki Linnakangas wrote:
>> Speaking of which, a lot of time seems to be spent on detoasting. I'd like to
>> understand that better. Where is the detoasting coming from?
>
> Hmm, maybe bttext_pattern_cmp does some detoasting? It calls
> PG_GETARG_TEXT_
Jan Urbański wrote:
So right now the idea is to:
(1) pre-sort STATISTIC_KIND_MCELEM values
(2) build an array of pointers to detoasted values in tssel()
(3) use binary search when looking for MCELEMs during tsquery analysis
Sounds like a plan. In (2), it's even better to detoast the values
Heikki Linnakangas wrote:
Jan Urbański wrote:
Not good... Shall I try sorting pg_statistics arrays on text values
instead of frequencies?
Yeah, I'd go with that. If you only do it for the new
STATISTIC_KIND_MCV_ELEMENT statistics, you shouldn't need to change any
other code.
OK, will do.
Jan Urbański wrote:
Not good... Shall I try sorting pg_statistics arrays on text values
instead of frequencies?
Yeah, I'd go with that. If you only do it for the new
STATISTIC_KIND_MCV_ELEMENT statistics, you shouldn't need to change any
other code.
Hmm. There has been discussion on raising
Heikki Linnakangas wrote:
Jan Urbański wrote:
26763 3.5451 AllocSetCheck
Make sure you disable assertions before profiling.
Awww, darn. OK, here goes another set of results, without casserts this
time.
=== CVS HEAD ===
number of clients: 10
number of transactions per client: 10
Jan Urbański wrote:
26763 3.5451 AllocSetCheck
Make sure you disable assertions before profiling. Although I'm actually
a bit surprised the overhead isn't more than 3.5%, I've seen much higher
overheads on other tests, but it's still skewing the results.
- Heikki
Heikki Linnakangas wrote:
Jan Urbański wrote:
through it. The only tiny ugliness is that there's one function used
for qsort() and another for bsearch(), because I'm sorting an array of
texts (from pg_statistic) and I'm binary searching for a lexeme
(non-NULL terminated string with length).
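The two-comparator arrangement described above can be illustrated with the standard qsort() and bsearch(). This is only a sketch under assumptions: StoredText and LexemeKey are hypothetical stand-ins for the text values from pg_statistic and for a lexeme given as pointer plus length, and the length-then-bytes ordering is an illustrative choice, not necessarily what the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stored text value. */
typedef struct
{
    const char *data;
    size_t      len;
} StoredText;

/* Hypothetical stand-in for a lexeme: not NUL-terminated, length given. */
typedef struct
{
    const char *data;
    size_t      len;
} LexemeKey;

/* Shared ordering: shorter strings first, then plain byte comparison. */
static int
cmp_bytes(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return (alen < blen) ? -1 : 1;
    return memcmp(a, b, alen);
}

/* qsort() comparator: both arguments are array elements. */
static int
stored_cmp(const void *a, const void *b)
{
    const StoredText *x = a;
    const StoredText *y = b;

    return cmp_bytes(x->data, x->len, y->data, y->len);
}

/* bsearch() comparator: first argument is the key, second an element. */
static int
key_cmp(const void *key, const void *elem)
{
    const LexemeKey  *k = key;
    const StoredText *e = elem;

    return cmp_bytes(k->data, k->len, e->data, e->len);
}

static void
sort_stored(StoredText *arr, size_t n)
{
    qsort(arr, n, sizeof(StoredText), stored_cmp);
}

/* Returns a pointer to the matching element, or NULL if there is none. */
static const StoredText *
find_stored(const StoredText *arr, size_t n, const char *lex, size_t len)
{
    LexemeKey key;

    assert(arr != NULL || n == 0);
    key.data = lex;
    key.len = len;
    return bsearch(&key, arr, n, sizeof(StoredText), key_cmp);
}
```

Both comparators delegate to the same byte-level routine, which keeps the sort order and the search order from drifting apart even though the argument types differ.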
Jan Urbański wrote:
Heikki Linnakangas wrote:
Jan Urbański wrote:
Another thing are cstring_to_text_with_len calls. I'm doing them so I
can use bttextcmp in bsearch(). I think I could come up with a
dedicated function to return text Datums and WordEntries (read:
non-NULL terminated strings wi
Heikki Linnakangas wrote:
Jan Urbański wrote:
Another thing are cstring_to_text_with_len calls. I'm doing them so I
can use bttextcmp in bsearch(). I think I could come up with a
dedicated function to return text Datums and WordEntries (read:
non-NULL terminated strings with a given length).
Jan Urbański wrote:
Another thing are cstring_to_text_with_len calls. I'm doing them so I
can use bttextcmp in bsearch(). I think I could come up with a dedicated
function to return text Datums and WordEntries (read: non-NULL
terminated strings with a given length).
Just keep them as cstrings
Heikki Linnakangas wrote:
Jan Urbański wrote:
Here's a WIP patch implementing an oprrest function for tsvector @@
tsquery and tsquery @@ tsvector.
The idea is (quoting a comment)
/*
* Traverse the tsquery preorder, calculating selectivity as:
*
* selec(left_oper) * selec(right_oper) in A
Jan Urbański wrote:
Here's a WIP patch implementing an oprrest function for tsvector @@
tsquery and tsquery @@ tsvector.
The idea is (quoting a comment)
/*
* Traverse the tsquery preorder, calculating selectivity as:
*
* selec(left_oper) * selec(right_oper) in AND nodes,
*
* selec(lef
Hi,
I know Commit Fest is in progress, as well as the holiday season. But
the Summer of Code ends in about three weeks, so I'd like to request a
bit of out-of-order processing :)
My previous mail sent to -hackers is here:
http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
I had prob
Jan Urbański wrote:
The idea is (quoting a comment)
/*
* Traverse the tsquery preorder, calculating selectivity as:
Ekhm.
This should of course read "postorder"...
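A postorder walk of this kind might be sketched as follows. Only the AND rule (a product of the operand selectivities) is taken from the quoted comment; the quoted OR rule is cut off in this listing, so the inclusion-exclusion form used here is an assumption, as are the QueryNode type, the NodeKind enum and node_selec. NOT nodes are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for a tiny query tree. */
typedef enum
{
    NODE_VAL,   /* leaf: a lexeme with a known frequency */
    NODE_AND,
    NODE_OR
} NodeKind;

/* Hypothetical tree node; real tsquery storage is laid out differently. */
typedef struct QueryNode
{
    NodeKind                kind;
    double                  freq;   /* used by leaves only */
    const struct QueryNode *left;
    const struct QueryNode *right;
} QueryNode;

/*
 * Postorder: both operands are evaluated before the operator that
 * combines them.
 */
static double
node_selec(const QueryNode *n)
{
    double l;
    double r;

    assert(n != NULL);
    if (n->kind == NODE_VAL)
        return n->freq;

    l = node_selec(n->left);
    r = node_selec(n->right);

    if (n->kind == NODE_AND)
        return l * r;           /* rule quoted in the thread */

    return l + r - l * r;       /* assumed OR rule (independent operands) */
}
```

For leaves with frequencies 0.5 and 0.25, the AND node evaluates to 0.125 and, under the assumed OR rule, the OR node to 0.625.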
--
Jan Urbanski
GPG key ID: E583D7D2
ouden estin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
Here's a WIP patch implementing an oprrest function for tsvector @@
tsquery and tsquery @@ tsvector.
The idea is (quoting a comment)
/*
* Traverse the tsquery preorder, calculating selectivity as:
*
* selec(left_oper) * selec(right_oper) in AND nodes,
*
* selec(left_oper) + selec(right