We need multi-key B-tree like index for such problem.
Our full text search engine is based on arrays and we need to find quickly
is some number exists in array - some kind of index over int array.
We're currently testing GiST approach and seems will have some conclusions
soon. I think multi-key B-tree like index would be better in my
opinion, but this requires to much work and knowledge of postgres's internals.
Yesterday I read about UBTree, seems like it's good for index and query
sets. Currently postgres has no set specific methods.
Regards,
Oleg
On Fri, 8 Dec 2000, mlw wrote:
> Date: Fri, 08 Dec 2000 20:17:34 -0500
> From: mlw <[EMAIL PROTECTED]>
> To: Tom Lane <[EMAIL PROTECTED]>
> Cc: Hackers List <[EMAIL PROTECTED]>
> Subject: Re: [HACKERS] OK, does anyone have any better ideas?
>
> Tom Lane wrote:
> >
> > mlw <[EMAIL PROTECTED]> writes:
> > > I have a working version of a text search engine. I want to make it work
> > > for Postgres (I will be releasing it GPL). It can literally find the
> > > occurrence of a string of words within 5 million records in a few
> > > milliseconds.
> >
> > Where are the records coming from? Are they inside the database?
> > (If not, why do you care about integrating this with Postgres?)
> >
> > It seems like the right way to integrate this sort of functionality
> > is to turn it into a kind of index, so that you can do
> >
> > SELECT * FROM mytable WHERE keyfield ~~~ 'search string';
> >
> > where ~~~ is the name of some operator that is associated with the
> > index. The temporary-table approach you are taking seems inherently
> > klugy, and would still be awkward even if we had functions returning
> > recordsets...
>
> Oh! Another method I tried and just could not get working was returning
> an array of integers. I as thinking about "select * from table where
> key_field in ( textsearch('bla bla') ), but I haven't been able to get
> that to work, and as per a previous post and belatedly reading a FAQ,
> this would probably still force a full table scan.
>
> Another method I thought about was having a table with some maximum
> number of zero initialized records, and trying something like:
>
> create table temp_table as select * from ts_template limit
> textsearch('bla bla', 10);
>
> select filltable(temp_table, 10);
>
> select * from table where key_field = temp_table.key;
>
> As you can see, all of these ideas are heinous hacks, there has to be a
> better way. Surely someone has a better idea.
>
> --
> http://www.mohawksoft.com
>
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83