On Wed, Jul 29, 2009 at 04:46:43PM +0200, Radek Novotnnn wrote:
> is there in the roadmap of postgre integration of fulltext searching in 
> documents saved in blobs (bytea)?

Do you mean bytea or large-objects?

> Would be very very nice (postgre users can be proud to be first) to save 
> documents into bytea and search that field via to_tsvector, to_tsquery ...

This seems easy. For large objects, just use lo_export() to dump the
blob out to the filesystem, then use something like pl/perl to run
antiword on it, saving the results to another file and returning that
file line-by-line as a SETOF TEXT (I think this is the best way of
handling things, since the resulting text file may be enormous).  If
this function were called "runfilter", we could use it like:

  UPDATE myfiles f SET tsidx = (
    SELECT ts_accum(to_tsvector(t))
    FROM runfilter(f.loid) t);

Where we've defined ts_accum to be:

  CREATE AGGREGATE ts_accum (tsvector) (
    SFUNC = tsvector_concat,
    STYPE = tsvector,
    INITCOND = ''
  );
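
For what it's worth, runfilter itself could look roughly like the
following.  This is an untested sketch, not a finished implementation:
it assumes plperlu is installed, that antiword is on the server's PATH,
and that the server may write to /tmp (the temp file name is made up).
It also reads antiword's output from a pipe instead of saving it to a
second file first.  Server-side lo_export() and plperlu both typically
need superuser rights.

  CREATE FUNCTION runfilter(loid oid) RETURNS SETOF text AS $$
    my ($loid) = @_;
    # hypothetical scratch location; pick something safer in real use
    my $path = "/tmp/runfilter_$loid";

    # dump the large object to the server's filesystem
    spi_exec_query("SELECT lo_export($loid, '$path')");

    # run antiword and stream its output back one line at a time
    open(my $fh, '-|', 'antiword', $path)
      or elog(ERROR, "could not run antiword: $!");
    while (my $line = <$fh>) {
      chomp $line;
      return_next($line);
    }
    close($fh);
    unlink($path);
    return undef;
  $$ LANGUAGE plperlu;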

bytea is different because you know when the value has changed (i.e.
you can write a trigger), but you need to write more code to get the
bytea value out into the filesystem.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
