date:20080126

Re: [GENERAL] tsearch2: stop words and stemming separate?

2008-01-26 Thread Oleg Bartunov


On Sat, 26 Jan 2008, Sushant Sinha wrote:


> I want to remove stop words but do not want to stem the words.  Is there
> an interface in tsearch2 that allows me to do this?
>
> Basically I am trying to implement spelling corrections and do not want
> to correct stop words.


Create a custom dictionary based on simple (or just add stop words to simple)
and use it before the English stemmer, which then has NO stop words!

=# insert into pg_ts_dict
  (SELECT 'remove_stopwords', dict_init,
   'contrib/english.stop',
   dict_lexize,
   'simple dictionary with stop words'
FROM pg_ts_dict
WHERE dict_name = 'simple');

insert into pg_ts_dict
  (SELECT 'en_stem_no_stopwords', dict_init,
   '',
   dict_lexize,
   'english stemmer without stop words'
FROM pg_ts_dict
WHERE dict_name = 'en_stem');
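
One way to check the first dictionary on its own, which may be all that is needed when no stemming is wanted, is the tsearch2 lexize() function. This is only a sketch: it assumes the contrib tsearch2 module is installed, that the remove_stopwords row above was inserted, and that the stop word file path resolves on the server.

```sql
-- A stop word should come back as an empty array.
SELECT lexize('remove_stopwords', 'the');

-- Any other word should come back as is, with no stemming applied.
SELECT lexize('remove_stopwords', 'running');
```

Note that simple returns every non-stop word as a lexeme, so with this dictionary first in the list the stemmer would likely never be reached; for the stated goal of no stemming that is the desired behavior.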




> Thanks,
> -Sushant.
>
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster



Regards,
Oleg
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


[GENERAL] tsearch2: stop words and stemming separate?

2008-01-26 Thread Sushant Sinha

I want to remove stop words but do not want to stem the words.  Is there
an interface in tsearch2 that allows me to do this?

Basically I am trying to implement spelling corrections and do not want
to correct stop words.

Thanks,
-Sushant.



Re: [GENERAL] Simple row serialization?

2008-01-26 Thread Adam Rich

> I'd like to implement some simple data logging via triggers on a small
> number of infrequently updated tables and I'm wondering if there are
> some helpful functions, plugins or idioms that would serialize a row


If you're familiar with perl, you can try PL/Perl.

http://www.postgresql.org/docs/8.2/interactive/plperl-triggers.html

Here's an example (untested).  If you're using quotes and colons as delimiters,
you may also need to escape those in your data.



CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
# Build 'column':'value', pairs for the old and new row images.
my ($old_serialized, $new_serialized);

# There is no old row on INSERT and no new row on DELETE;
# the matching loop is then simply skipped.
foreach my $col (keys %{$_TD->{old}}) {
    $old_serialized .= "'" . $col . "':'" . $_TD->{old}{$col} . "',";
}
foreach my $col (keys %{$_TD->{new}}) {
    $new_serialized .= "'" . $col . "':'" . $_TD->{new}{$col} . "',";
}

# spi_prepare takes the parameter type names as strings.
my $qry = spi_prepare('insert into log_tbl values ($1,$2)', 'VARCHAR', 'VARCHAR');
spi_exec_prepared($qry, $old_serialized, $new_serialized);
spi_freeplan($qry);

return;
$$ LANGUAGE plperl;





Re: [GENERAL] Simple row serialization?

2008-01-26 Thread Ivan Voras


Adam Rich wrote:

> > I'd like to implement some simple data logging via triggers on a small
> > number of infrequently updated tables and I'm wondering if there are
> > some helpful functions, plugins or idioms that would serialize a row
>
> If you're familiar with perl, you can try PL/Perl.


Thanks, but another solution has been suggested to me, much simpler:

create or replace function data_log() returns trigger as $$
declare
    sdata   text;
begin
    sdata = new;
    insert into log(data) values (sdata);
    return NULL;
end;
$$ language plpgsql;

create trigger data_insert after insert on data
    for each row execute procedure data_log();

(from idea by Tom Lane)
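
For anyone trying this out, a minimal way to exercise the trigger might look like the sketch below. The column definitions for data and log are hypothetical, chosen only to match the field types mentioned in the original question; the tables must exist before the function and trigger above are created.

```sql
-- Hypothetical table definitions, assumed for illustration only.
create table data (id integer, name varchar, active boolean);
create table log (data text);

-- After creating data_log() and the data_insert trigger as shown above:
insert into data values (1, 'first row', true);

-- Each logged row should appear as one composite-style text value.
select data from log;
```

Since the log column is plain text, the same function could be attached to other tables with one additional create trigger statement per table.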




[GENERAL] Simple row serialization?

2008-01-26 Thread Ivan Voras


Hi,

I'd like to implement some simple data logging via triggers on a small 
number of infrequently updated tables and I'm wondering if there are 
some helpful functions, plugins or idioms that would serialize a row 
(received for example in an AFTER INSERT trigger) into a string that I'd
store in the log table. There's a limited number of field types 
involved: varchars, integers and booleans. I'm not looking for anything 
fancy, comma-separated string result would be just fine; even better, 
something like a dictionary ("field_name":"field_value",...) would be 
nice. The reason for trying to do it this way is that I want to have a
single log table to log many tables (again, they are infrequently 
updated). I need this for PostgreSQL 8.1.


I got suggestions to try composite types but I don't think they could be 
useful for this. What I need is possibly a generic "row" type ("any" and 
"record" generate syntax error in CREATE TABLE) - any ideas on where to 
start looking?







Re: [GENERAL] can't create index with 'dowcast' row

2008-01-26 Thread Louis-David Mitterrand

On Fri, Jan 25, 2008 at 12:17:16AM -0500, Tom Lane wrote:
> Louis-David Mitterrand <[EMAIL PROTECTED]> writes:
> > CREATE UNIQUE INDEX visit_idx ON visit_buffer USING btree (id_session, 
> > id_story, created_on::date);
> 
> > psql:visit_pkey.sql:5: ERROR:  syntax error at or near "::"
> 
> The reason that didn't work is that you need parentheses around an index
> expression (otherwise the CREATE INDEX syntax would be ambiguous).

This worked fine once I changed the type to a simple 'timestamp'.

> > CREATE UNIQUE INDEX visit_idx ON visit_buffer USING btree (id_session, 
> > id_story, extract(date from created_on));
> > psql:visit_pkey.sql:4: ERROR:  functions in index expression must be 
> > marked IMMUTABLE
> 
> I take it created_on is timestamp with time zone, not plain timestamp?
> The problem here is that the coercion to date is not immutable because
> it depends on the timezone setting.  (The other way would have failed
> too, once you got past the syntax detail.)  You need to figure out
> what your intended semantics are --- in particular, whose idea of
> midnight should divide one day from the next --- and then use a
> unique index on something like
> 
>   ((created_on AT TIME ZONE 'Europe/Paris')::date)
> 
> Note that the nearby recommendation to override the immutability
> test with a phonily-immutable wrapper function would be a real bad
> idea, because such an index would misbehave anytime someone changed
> their timezone setting.

Thanks Tom for that explanation. 
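
For readers of the archive, Tom's suggestion can be sketched as a complete statement. The table and column names come from the original CREATE INDEX, created_on is assumed to be timestamp with time zone, and 'Europe/Paris' is only the example zone from his message. The SELECTs use a made-up timestamp to show why a bare cast to date depends on the session setting.

```sql
-- The same instant falls on different dates depending on the session time zone.
SET timezone = 'UTC';
SELECT '2008-01-25 23:30:00+00'::timestamptz::date;   -- 2008-01-25
SET timezone = 'Europe/Paris';
SELECT '2008-01-25 23:30:00+00'::timestamptz::date;   -- 2008-01-26

-- Pinning the zone inside the expression makes it immutable, so it can be indexed.
CREATE UNIQUE INDEX visit_idx ON visit_buffer USING btree
    (id_session, id_story, ((created_on AT TIME ZONE 'Europe/Paris')::date));
```

The extra pair of parentheses around the expression is what the CREATE INDEX syntax requires for anything other than a plain column name.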

