Michael Monnerie wrote:
> Aren't most searches of the form "%searchstring%"? If yes,
All text-searches (against headervalues or messageblks/mimeparts) are
like that. Searches against XXXfield tables are exact, though possibly
case-insensitive.
> the index is useless, as you can only search "searchstring%" to use the
> index. And if the index can't be used, it could be dropped altogether. I
> did some searches with Kmail, the string I searched for shows what I
> told Kmail to search (ZMI_*, e.g. ZMI_SUBJECT searched for subject)
All headers searches are performed against the headervalue table. The
subjectfield table is for thread=orderedsubject only. The other xxxfield
tables are used for sorting. So I guess you are correct: the index on
dbmail_headervalues.headervalue might as well be dropped.
> [snip]
> 2) Every single search was of the form "%searchtext%", so no index could
> be used. I mean, in dbmail 2.2 the index
> »dbmail_headervalue_3« btree ("substring"(headervalue, 0, 255))
> was never actually used and is therefore useless. And I guess 99% of all
> searches will be performed with "where X contains Y", and that's why it
> will have to be searched as X ILIKE %Y%. We could drop the index and the
> speed wil be increased, as a never used index only costs performance.
agreed. But the indexes on the xxxfield tables should stay.
> Or is there any client that performs a search like Y%? I guess no. At
> least, when I search for messages I always search "where X contains Y"
> and not "where X begins with Y".
That type of search is not possible with IMAP-SEARCH. 'Contains' is the
operative here.
> 3) Searches are performed only on substring(headervalue,0,255). The
> whole "headervalue" is never actually searched, and nobody seems to have
> found that. This means either
> a) people don't search in such long headervalues
> or
> b) people were searching but not finding messages and wondering, still
> they didn't report because they can't believe the computer is wrong.
Both are equally likely. Generally, imap clients tend to prefer
client-side searching.
> Is it true that we only search the first 255 chars anyway? Then get rid
> of the longer headervalues.
> If we want to search >255 chars in headers: is the index useful at all?
For myself, I wouldn't mind chopping the headervalue field to 255 chars.
But for the sake of correctness we should keep it as a text/blob.
> If %search% is used all the time, no index can be used and therefore be
> dropped. Starting from PostgreSQL 8.3, a full text index could be used
> http://www.postgresql.org/docs/8.3/static/textsearch.html
I'm not sure that would be compliant with imap's simple substring matching.
> But I guess we don't want database specific searches?
We already have a framework in place to do database specific queries. So
I don't see a fundamental problem there.
--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl
_______________________________________________
Dbmail-dev mailing list
[email protected]
http://twister.fastxs.net/mailman/listinfo/dbmail-dev