Re: [GENERAL] help w/ SRF function
am Mon, dem 17.09.2007, um 9:21:22 +0800 mailte Ow Mun Heng folgendes:

> CREATE OR REPLACE FUNCTION foo_func(fromdate timestamp, todate timestamp, code text)
>   RETURNS SETOF foo AS
> $BODY$
>   SELECT
>     TRH.ID,
>     TRH.data1,
>     TRH.data2
>   FROM D
>   INNER JOIN TS ON TS.id = D.id
>   INNER JOIN TRH ON TS.id = TRH.id
>   WHERE D.start_timestamp BETWEEN fromdate AND todate
>     AND D.code IN (code)
> $BODY$
> LANGUAGE 'sql' IMMUTABLE STRICT;
>
> How can I go about this? The above will fail due to missing columns
> fromdate/todate/code.

Use $1, $2 and $3 within the function body instead of fromdate, todate and code. Example:

test=# select * from n;
 feld1  | feld2
--------+-------
 Frank  |    23
 Frank  |    31
 Stefan |    32
 Stefan |    22
 Jochen |    29
(5 rows)

test=*# create or replace function nn(int) returns setof n as $$ select * from n where feld2 = $1; $$ language sql;
CREATE FUNCTION
test=*# select * from nn(22);
 feld1  | feld2
--------+-------
 Stefan |    22
(1 row)

Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [GENERAL] Problem dropping table
On 9/18/07, Ken Logan <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-09-18 at 11:10 -0700, Alan Hodgson wrote:
> > On Tuesday 18 September 2007 10:30, Ken Logan <[EMAIL PROTECTED]> wrote:
> > > When we try to drop the table we get the error:
> > > ERROR: "member_pkey" is an index
> >
> > You have to remove the table from its Slony set before you can drop it.
> > Slony does some hackish things to subscriber tables that make them
> > unusable for normal DDL operations.
> >
> > Your master probably isn't too thrilled, either, since it probably
> > thinks the table is still in a set.
>
> We actually used slonik_drop_table so it's been properly removed from
> slony.
>
> > In any case, you'd probably have better luck with this on the Slony
> > list. I'm not at all sure your problem can be fixed without discarding
> > the slave, but someone there might know.
>
> I was afraid of that... since there doesn't seem to be any remnants left
> behind by slony affecting this table, I'm not sure I can ask a sensible
> question about it on the slony list, so I guess I'll need to either dig
> into the source for what the postgresql error message means or reinit
> the slave (again).

It is a known issue with slony (the foreign key causes it); check the slony mailing list archives for more info. AFAIK you have to use slonik and EXECUTE SCRIPT ... EXECUTE ONLY ON NODE x to remove the table.

Regards
MP
Re: [GENERAL] Q:Aggregrating Weekly Production Data. How do you do it?
On Tue, 2007-09-18 at 09:55 -0500, Scott Marlowe wrote:
> On 9/17/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> > Just wondering how everyone is doing aggregation of production data.
>
> Where I work, we aggregate by the minute in the application, then dump
> those stats entries into the database en masse from each machine.

[snip]

> The fail1/2/3 are types of failures, the totalresptime time is all the
> successful requests added together, so that totalresptime/success =
> average for that minute, and max is the longest running request ending
> in that minute.

So, it's basically a summation rather than using avg() all the way. That would mean using something like bigint to deal with the large numbers.

> > (getting min/max/count isn't much of an issue. Stdev is the main issue
> > I believe)
>
> Isn't stddev() working for you?

Stddev is based on a population of data. If I do slices, and I want the stddev across 2 months (8 weeks, e.g.), then it would be wrong.

> What I do is aggregate the minute time slices by grouping by
> date_trunc('xxx',timestamp) and then use that as a subselect to a
> query that does the stddev() outside of that. Works pretty well for
> us, and makes it easy to identify trends.

Trend analysis is very critical, and the only reference I've found on how to get "stdev" is based on what I posted.

> One of the things we started doing is to aggregate the last six weeks
> data by the day / hour and then comparing the last 24 hours worth of
> data to those six weeks worth to see trends for each hour of each day.
> The queries are huge and ugly, but they work, and run in about 4
> minutes on a fairly good sized chunk of data. We have about 150k to
> 300k entries a day put into this db.

I'm not sure how many K entries in a day (yet to count it), but I'm getting into trouble with one-to-many relationships and PG is choosing to do nested loops etc. (lots of left joins on the same tables). So, I've had to resort to using SRFs and function scans. A >2 hour complex query (on my laptop in PG; it runs in 20 min on high-end MSSQL) is reduced to ~60 secs.

Thanks for the response. BTW, are user queries an issue? Do they interfere with the loading? (That's one of my concerns over here.) I noticed that you do the calculation in the app before sending it to the DB, which is a good thing, and every minute too. (Is it advisable to do it in chunks of 1 min? Less data per minute vs per 30 min/1 hour.)

---(end of broadcast)---
TIP 6: explain analyze is your friend
[GENERAL] Uninstall mess Postgresql 7.3.4, 8.1.4 on Redhat 9
I had two versions of postgresql running on a Redhat 9 server. Today I shut down the 7.3.4 version and uninstalled it using the RPMs. Unfortunately, as far as I can tell, this deleted the postgres user account and the system's knowledge of the postgresql 8.1.4 software. The system will no longer run 8.1.4 even when I go to its /bin/ directory to run the commands. The data is still present and so is the 8.1.4 software. I remade the postgres account on the server and gave it permissions on the 8.1.4 directories, but still no dice. I have multiple backups of the database that are okay. Would my best bet be to download and install 8.2.5 and import the backup? Or do I need to get 8.1.4 running, and how do I do that? Please reply to me as well as the list as I am on the digest version. TIA

Margaret Gillon, IS Dept., Chromalloy Los Angeles, ext. 297

This e-mail message and any attachment(s) are for the sole use of the intended recipient(s) and may contain proprietary and/or confidential information which may be privileged or otherwise protected from disclosure. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient(s), please contact the sender by reply email and destroy the original message and any copies of the message as well as any attachment(s) to the original message.
[GENERAL] Version 8.2.5 for Windows doesn't startup normally after upgrading from 8.2.4
Hello,

I just upgraded from 8.2.4 to 8.2.5 on Windows but the service doesn't start up normally. This is the first time I have had trouble with an upgrade of version 8.2. I have the following non-standard configuration:

- Program directory: C:\Archivos de programa\PostgreSQL\8.2\bin
- Data directory: D:\PostgreSQL\data
- SSL on
- Working database is IOP_IPR (there is no POSTGRES database)

When using "pg_ctl start -D D:\PostgreSQL\data\" the server comes up normally; 2 messages appear:

LOG: could not load root certificate file "root.crt": No such file or directory
DETAIL: Will not verify client certificates.

The log file contains only:

2007-09-18 14:36:25 LOG: database system was shut down at 2007-09-18 14:33:27
2007-09-18 14:36:25 LOG: checkpoint record is at 29/D1D4100
2007-09-18 14:36:25 LOG: redo record is at 29/D1D4100; undo record is at 0/0; shutdown TRUE
2007-09-18 14:36:25 LOG: next transaction ID: 0/326345; next OID: 42275
2007-09-18 14:36:25 LOG: next MultiXactId: 1; next MultiXactOffset: 0
2007-09-18 14:36:25 LOG: database system is ready

When started as a service with the following value of pgsql-8.2\ImagePath in the registry:

"C:\Archivos de programa\PostgreSQL\8.2\bin\pg_ctl.exe" runservice -w -N "pgsql-8.2" -D "D:\PostgreSQL\data\"

the log file contains:

2007-09-18 14:28:34 LOG: database system was shut down at 2007-09-18 14:26:02
2007-09-18 14:28:34 LOG: checkpoint record is at 29/D1D4060
2007-09-18 14:28:34 LOG: redo record is at 29/D1D4060; undo record is at 0/0; shutdown TRUE
2007-09-18 14:28:34 LOG: next transaction ID: 0/326343; next OID: 42275
2007-09-18 14:28:34 LOG: next MultiXactId: 1; next MultiXactOffset: 0
2007-09-18 14:28:34 LOG: could not load root certificate file "root.crt": No such file or directory
2007-09-18 14:28:34 DETAIL: Will not verify client certificates.
2007-09-18 14:28:34 LOG: database system is ready
2007-09-18 14:28:35 127.0.0.1 postgres postgres FATAL: the database system is starting up

And then an endless list of:

2007-09-18 14:28:35 LOG: could not load root certificate file "root.crt": No such file or directory
2007-09-18 14:28:35 DETAIL: Will not verify client certificates.
2007-09-18 14:28:35 127.0.0.1 postgres postgres FATAL: database "postgres" does not exist
2007-09-18 14:28:36 LOG: could not load root certificate file "root.crt": No such file or directory
2007-09-18 14:28:36 DETAIL: Will not verify client certificates.
2007-09-18 14:28:36 127.0.0.1 postgres postgres FATAL: database "postgres" does not exist

And I have to abort the startup.

Maybe this is a hint: when I had blocked access to localhost with SSL=on (using hostnossl in pg_hba.conf) there was a constant complaint (2 times per second):

127.0.0.1 postgres postgres FATAL: no pg_hba.conf entry for host "127.0.0.1", user "postgres", database "postgres", SSL on

Is there something wrong with the configuration of the service?

Thanks in advance.

--
Walter Roeland

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
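The repeated `FATAL: database "postgres" does not exist` lines come from pg_ctl's `-w` switch, which in 8.2 verifies startup by trying to connect as user postgres to the "postgres" database. One possible workaround, an assumption worth testing rather than a confirmed fix, is simply to recreate that database; it can stay empty:

```sql
-- Connect to an existing database (e.g. IOP_IPR) as a superuser first.
-- Recreating the default maintenance database gives pg_ctl -w something
-- to connect to while it waits for startup.
CREATE DATABASE postgres;
```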
Re: [GENERAL] Tsearch2 - spanish
Hi,

You are right, the output of "show lc_ctype;" was C. So I ran

% initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1

(as you suggested) and "createdb -E iso8859-1 prueba1", and then installed tsearch2. Now:

prueba1=# show lc_ctype;
    lc_ctype
-----------------
 es_MX.ISO8859-1
(1 row)

and the original problem is resolved:

prueba1=# select to_tsvector('espanol','melón');
 to_tsvector
-------------
 'melón':1
(1 row)

but if I change the sentence to this:

prueba1=# select to_tsvector('espanol','melón perro mordelón');
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

??? Lost the connection... but the server is up. Any idea? The synonym dictionary order is intentional.

Thanks in advance.

El mar, 18-09-2007 a las 21:40 +0400, Teodor Sigaev escribió:
> > LC_CTYPE="POSIX"
>
> pls, output of "show lc_ctype;" command. If it's C locale then I can
> identify the problem - characters with a diacritical mark (such as ó)
> are not alpha characters, and the ispell dictionary will fail. To fix
> that you should run initdb with options:
> % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
> or
> % initdb -D /YOUR/PATH -E UTF8 --locale es_ES.UTF8
>
> In the last case you should also recode all the dictionary's data files
> to utf8 encoding.
>
> > > > prueba=# select to_tsvector('espanol','melón');
> > > > ERROR: Affix parse error at 506 line
> > > and
> > > > prueba=# select lexize('sp','melón');
> > > > lexize
> > > > --------
> > > > {melon}
> > > > (1 row)
>
> sp is a Snowball stemmer, it doesn't require an affix file, so it works.
>
> By the way, why is the synonym dictionary placed after ispell? Is it
> intentional? Usually, the synonym dictionary goes first, then ispell,
> and after all of them snowball.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Problem dropping table
On Tue, 2007-09-18 at 11:10 -0700, Alan Hodgson wrote:
> On Tuesday 18 September 2007 10:30, Ken Logan <[EMAIL PROTECTED]> wrote:
> > When we try to drop the table we get the error:
> > ERROR: "member_pkey" is an index
>
> You have to remove the table from its Slony set before you can drop it.
> Slony does some hackish things to subscriber tables that make them
> unusable for normal DDL operations.
>
> Your master probably isn't too thrilled, either, since it probably
> thinks the table is still in a set.

We actually used slonik_drop_table so it's been properly removed from slony.

> In any case, you'd probably have better luck with this on the Slony
> list. I'm not at all sure your problem can be fixed without discarding
> the slave, but someone there might know.

I was afraid of that... since there doesn't seem to be any remnants left behind by slony affecting this table, I'm not sure I can ask a sensible question about it on the slony list, so I guess I'll need to either dig into the source for what the postgresql error message means or reinit the slave (again).
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Richard Broersma Jr wrote:
> --- Gregory Williamson <[EMAIL PROTECTED]> wrote:
> > A very low fill factor means that pages are "sparse" and so inserts
> > and updates are less likely to trigger massive b-tree rebalancings.
>
> I take it that "massive b-tree rebalancings" could cause a problem with
> disk-write throughput from UPDATEs and INSERTs?

Precisely -- even if it can keep everything in RAM, it can occupy quite a few cycles to rebalance a large b-tree. And eventually those changes do need to get written to disk, so the next checkpoint (I think) will also have more work.

G

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org/
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
--- Gregory Williamson <[EMAIL PROTECTED]> wrote:
> A very low fill factor means that pages are "sparse" and so inserts and
> updates are less likely to trigger massive b-tree rebalancings.

I take it that "massive b-tree rebalancings" could cause a problem with disk-write throughput from UPDATEs and INSERTs?

Regards,
Richard Broersma Jr.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
[GENERAL] Windows Auto-Vacuum in 8.2.4 or 8.2.5
A while back it was pointed out that the Windows version of 8.2.3 had a bug that prevented auto-vacuum from working correctly. http://archives.postgresql.org/pgsql-general/2007-04/msg00139.php I wasn't able to determine from the release notes whether this bug was fixed in versions 8.2.4 or 8.2.5. Does anyone know if it was?

Regards,
Richard Broersma Jr.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Sorry for top-posting -- challenged reader. Can't speak directly to PostgreSQL but in Informix the fill factor is useful for tweaking indexes. A very high fill factor is useful for tables that are static -- any inserts or changes to the index trigger a *lot* of moving of b-tree branches. But the high fill factor means that each page has more useful data references in it. A very low fill factor means that pages are "sparse" and so inserts and updates are less likely to trigger massive b-tree rebalancings. I've never used it on PostgreSQL (yet!) but am looking forward to it. Beware of premature optimization! HTH, Greg Williamson Senior DBA GlobeXplorer LLC, a DigitalGlobe company Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. (My corporate masters made me say this.) -Original Message- From: [EMAIL PROTECTED] on behalf of Richard Broersma Jr Sent: Tue 9/18/2007 10:29 AM To: Phoenix Kiula; Bill Moran Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER --- Phoenix Kiula <[EMAIL PROTECTED]> wrote: > What constitutes a "small fill factor"? Would 70 be good? I guess my > current must have been the default, which the manual says is 100. On the following link: http://www.postgresql.org/docs/8.2/interactive/sql-createindex.html#SQL-CREATEINDEX-STORAGE-PARAMETERS I found this: "B-trees use a default fillfactor of 90, but any value from 10 to 100 can be selected." Regards, Richard Broersma Jr. 
---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
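For reference, in PostgreSQL 8.2 the fillfactor discussed above is set per index as a storage parameter; a short sketch (table and column names invented for illustration):

```sql
-- Leave 30% free space in each b-tree leaf page to absorb future
-- inserts and updates without page splits:
CREATE INDEX orders_customer_idx ON orders (customer_id)
    WITH (fillfactor = 70);

-- An existing index's fillfactor can be changed; the new value takes
-- effect the next time the index is rebuilt:
ALTER INDEX orders_customer_idx SET (fillfactor = 90);
REINDEX INDEX orders_customer_idx;
```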
Re: [GENERAL] Q:Aggregrating Weekly Production Data. How do you do it?
Ow Mun Heng wrote:
> Data which runs in the vicinity of a few million a week.
>
> What are the methods which will effectively provide the
> min/max/average/count/stdev of the weekly sample size based on different
> products/software mix etc.
>
> and still be able to answer correctly, what's the average of data_1 over
> the past 2 months?

So, are you viewing the problem as one of figuring out how to avoid having to store all this raw data permanently but still have the statistical summary values available without having to recalculate them each time?

> I can't just take the average of the 8 averages (one per week)
>
> eg:
> wk avg data_1
> w1 - 2
> ...
> average of past 2 months = avg(w1-w8), which is statistically wrong.

One approach would be to save the count of values as well as the average. Then your eight-week average is calculated as a weighted average, i.e., each week's average is weighted by the corresponding count of data values:

CREATE TABLE summary_data (
    summary_date  timestamp,
    num_of_values integer,
    avg_of_values numeric
);

Then, for instance, the eight-week average is computed as

SELECT SUM(num_of_values * avg_of_values) / SUM(num_of_values) AS eight_week_avg
FROM summary_data
WHERE summary_date BETWEEN (eight weeks ago) AND now();

(Note: the weighted sum divided by the total count, not an average of averages.)

> (getting min/max/count isn't much of an issue. Stdev is the main issue I
> believe)

You probably need to store each of those weekly summary statistic values too, at least for min and max, i.e.,

CREATE TABLE summary_data (
    summary_date  timestamp,
    num_of_values integer,
    avg_of_values numeric,
    min_of_values numeric,
    max_of_values numeric
);

Then, for instance, overall basic statistics are computed as

SELECT SUM(num_of_values) AS total_count,
       SUM(num_of_values * avg_of_values) / SUM(num_of_values) AS overall_avg,
       MIN(min_of_values),
       MAX(max_of_values)
FROM summary_data
WHERE summary_date BETWEEN (eight weeks ago) AND now();

Extending this design to include the variance is more complicated. While you can compute the average for the past eight weeks using a weighted average of each of the separate eight weeks, I think you actually need the raw data values for the whole eight weeks to figure the standard deviation -- I don't readily see how you could without them.

A different possibility would be to maintain a running average and variance (rather than windowing those statistics for the sliding eight-week period), and then take a weekly snapshot of the running values to use for trending.

> One such instance I've read about is..
>
> 1. Calculate sum of square of each sale
> ...
> 8. Stdev will be the square root of step 7
>
> The results are valid (verified with actual data) but I don't understand
> the logic. All the Statistical books I've read marked stdev as sqrt
> (sum(x - ave(x))^2 / (n - 1). The formula is very different, hence the
> confusion.

I think you copied the std dev formula incorrectly -- it should be sqrt(sum((x - avg(x))^2) / (n - 1)) -- but the eight-step algorithm can be shown to get you to the same point, after a little analytical manipulation. The advantage of one over the other is that the eight-step procedure can be used to implement an on-the-fly calculation of mean and variance, i.e., you can maintain a "running average" and update the statistics as you collect more data, without having to keep the entire detailed data history, as would be required by an attempt to implement the definition directly.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
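The eight-step procedure alluded to above is the standard sum-of-squares identity: var = (sum(x²) - (sum x)²/n) / (n - 1). If each weekly summary row also stores sum(x) and sum(x²), the combined stddev over any multi-week window can be recovered without the raw data. A sketch, with an invented table layout:

```sql
-- Hypothetical weekly rollup: besides the count, keep the raw sum and
-- the sum of squares for each week.
CREATE TABLE weekly_summary (
    week_start date PRIMARY KEY,
    n          bigint,   -- count of raw values that week
    sum_x      numeric,  -- sum of the values
    sum_x2     numeric   -- sum of the squared values
);

-- Combined sample stddev over the last eight weeks, via the identity
-- var = (sum(x^2) - (sum x)^2 / n) / (n - 1):
SELECT sqrt( (SUM(sum_x2) - SUM(sum_x)^2 / SUM(n)) / (SUM(n) - 1) ) AS stdev
FROM weekly_summary
WHERE week_start >= current_date - interval '8 weeks';
```

This is why the eight-step method and the textbook definition agree: they compute the same variance, but the sum-of-squares form composes across weeks.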
Re: [GENERAL] pgcrypto: is an IV needed with pgp_sym_encrypt()?
On Tue, Sep 18, 2007 at 11:26:17AM +0300, Marko Kreen wrote:
> A few bytes being the same is normal. Those are the PGP packet header,
> telling "this is a symmetrically encrypted session key packet, with
> length X" plus some more details.

I see. So you are saying there is no need to generate my own IV to prepend to the plain text before encrypting.

> If more than a few bytes are the same, and if the salt is not
> randomised, it _could_ be a sign of a problem. Either a pgcrypto bug or
> a failure to initialize the random generator. If you suspect a problem,
> please send me a few example encryptions with keys and your setup
> details (postgres version, openssl or not, os version).

No, it was only a few bytes that were similar, so the headers explain that.

> > Besides following the PCI DSS and external audit procedures, the plan
> > is to use pgcrypto (pgp_sym_encrypt() with AES-256) as part of a
> > credit card storage server. The server and db are SSL only and the
> > key is passed from the application and never stored anyplace (except
> > in memcached on other servers during the session). The key is a
> > user's plain text password plus an application-specific secret. So,
> > each row has its own key. Passwords must be changed periodically,
> > etc.
>
> I don't know the details of your setup, but I strongly suggest you look
> into using public-key crypto. That allows you separate keys for
> encryption and decryption. So on a webserver where users only input
> credit cards, you keep only public keys, so anybody cracking that won't
> be able to decrypt the data.

I need to look at that more. But I've seen that suggested where one needs to decrypt the data at a later time. We don't have that need. Our plan was to never store any keys. Every user must log in to the application with a password. Their account passwords are only stored hashed on disk, so we don't know their passwords. The plan is to encrypt their plain-text password with a secret known only by the application and store it in memcached. It's this plain-text password that will be sent to a separate server to encrypt (and decrypt) their credit card data when the user makes a transaction. We only need to store the credit card data to allow subsequent charges to their "card on file" -- and that only happens when a user logs in and processes a transaction. We don't have any way to decrypt the data without this password stored in the session.

If someone hacks an application server they could pluck active users' passwords from memcached and also find the application's secret word. Then if they also hacked the credit card server they could decrypt the data using the passwords they were able to sniff. See any glaring holes?

Thanks for the help!

-- Bill Moseley [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
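For reference, the call under discussion takes its cipher choice through the options string and generates a random session key internally, which is the reason no manual IV is needed. A sketch (the passphrase, table, and column names here are placeholders, not from the thread):

```sql
-- Encrypt: pgp_sym_encrypt randomizes the session key itself, so two
-- encryptions of the same plaintext differ beyond the packet header.
SELECT pgp_sym_encrypt('4111111111111111',
                       'users-password-plus-app-secret',
                       'cipher-algo=aes256');

-- Decrypt with the same passphrase (stored_cards/card_data are
-- hypothetical names for illustration):
SELECT pgp_sym_decrypt(card_data, 'users-password-plus-app-secret')
FROM stored_cards
WHERE user_id = 42;
```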
Re: [GENERAL] Performance Issues (was: "like" vs "substring" again)
Christian Schröder wrote:
> Or would it be possible to tweak how the planner determines the
> selectivity? I have read in the docs (chapter 54.1) that in case of more
> than one condition in the where clause, independence is assumed. In my
> case ("... where test like '11%' and test not like '113%'") this is
> clearly not the case, so it might be an interesting point to address.

I think the planner does think about the interactions of inequalities, so if you can express your query with less-than and friends, or even with BETWEEN, you might get a better plan. I don't know the details of your setup, but you can do things like this with any ordered type:

  where test between '11' and '113' or test >= '114'

I know this does not match the exact semantics of your query, but hopefully you get the idea.

- John D. Burger
  MITRE

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org/
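Spelled out, the LIKE pair from the thread has an exact range equivalent, assuming a collation where strings sort bytewise (e.g. the C locale); the table name below is invented:

```sql
-- Original predicates: test LIKE '11%' AND test NOT LIKE '113%'
-- '11%' covers the range ['11','12'); NOT LIKE '113%' removes ['113','114'),
-- leaving ['11','113') union ['114','12'):
SELECT *
FROM mytable
WHERE (test >= '11' AND test < '113')
   OR (test >= '114' AND test < '12');
```

Because the planner understands how overlapping range conditions interact, this form avoids the independence assumption that makes the LIKE/NOT LIKE pair look far more selective than it is.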
Re: [GENERAL] Problem dropping table
On Tuesday 18 September 2007 10:30, Ken Logan <[EMAIL PROTECTED]> wrote:
> When we try to drop the table we get the error:
> ERROR: "member_pkey" is an index

You have to remove the table from its Slony set before you can drop it. Slony does some hackish things to subscriber tables that make them unusable for normal DDL operations.

Your master probably isn't too thrilled, either, since it probably thinks the table is still in a set.

In any case, you'd probably have better luck with this on the Slony list. I'm not at all sure your problem can be fixed without discarding the slave, but someone there might know.

--
"Corruptissima republica, plurimae leges" (The more corrupt the state, the more laws.) - Tacitus

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org/
[GENERAL] Problem dropping table
We are using postgresql 8.2.3 with slony1 1.2.8 and we're having problems trying to drop a table on the slony1 secondary that was successfully dropped on the master database. I'm sure this is just because I'm missing something, but it doesn't seem like there should be any reason the table cannot be dropped.

The problem seems to be due to a foreign key constraint on the table we are trying to drop; I don't believe any other tables reference the table we are trying to drop with foreign key constraints. The schema of the table is as follows:

    Table "public.member_note"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 member_id | bigint  | not null
 notes_id  | bigint  | not null
 notes_idx | integer | not null
Foreign-key constraints:
    "fkb5d78a7737864f78" FOREIGN KEY (member_id) REFERENCES member(id)
    "fkb5d78a77503a5452" FOREIGN KEY (notes_id) REFERENCES note(id)

When we try to drop the table we get the error:

ERROR: "member_pkey" is an index

This error seems to have something to do with the primary key in the related member table:

              Table "public.member"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 discriminator     | character varying(31)  | not null
 id                | bigint                 | not null
 deleted           | boolean                | not null
 status            | character varying(255) | not null
 principal_id      | bigint                 |
 extravaluelist_id | bigint                 |
Indexes:
    "member_pkey" PRIMARY KEY, btree (id)
    "member_principal_id_key" btree (principal_id)
Foreign-key constraints:
    "fk892776ba530919c4" FOREIGN KEY (extravaluelist_id) REFERENCES extravaluelist(id)
    "fk892776baa88520c6" FOREIGN KEY (principal_id) REFERENCES principal(id)

We have also tried to drop the foreign key constraint on the table as an intermediate step to getting rid of the table:

alter table member_note drop constraint fkb5d78a7737864f78;
ERROR: "member_pkey" is an index

Does anyone know how to drop this table or what the error means?
[GENERAL] Performance Issues (was: "like" vs "substring" again)
Hi list,

I am still fighting with the really slow database queries (see http://www.nabble.com/%22like%22-vs-%22substring%22-again-t4447906.html), and I still believe that the cause of the problem is that the query planner makes incorrect estimations about the selectivity of the "where" clauses.

I wondered if it is possible to make the query planner perform a sequential scan over a table *before* it starts planning. If I know that a table has only about 3000 rows, the overhead of this sequential scan can be ignored. On the other hand, this would give the planner an exact data basis for its planning.

Or would it be possible to tweak how the planner determines the selectivity? I have read in the docs (chapter 54.1) that in case of more than one condition in the where clause, independence is assumed. In my case ("... where test like '11%' and test not like '113%'") this is clearly not the case, so it might be an interesting point to address.

Do you have any other tips for me?

Kind regards,
Christian

--
Deriva GmbH                   Tel.: +49 551 489500-42
Financial IT and Consulting   Fax:  +49 551 489500-91
Hans-Böckler-Straße 2         http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Optimizing "exists"
Steve Crawford <[EMAIL PROTECTED]> writes:
> If the sub-select returns a large result set, will there be any benefit
> to adding "limit 1" to the sub-select or does the query planner
> automatically deduce that "limit 1" is the correct interpretation?

It does, although poking at it I notice a bit of a bug:

regression=# explain select * from tenk1 a join tenk1 b on a.unique1 = b.ten;
                                       QUERY PLAN
------------------------------------------------------------------------------------------
 Merge Join  (cost=2287.89..2438.58 rows=1 width=488)
   Merge Cond: (a.unique1 = b.ten)
   ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..1702.22 rows=1 width=244)
   ->  Materialize  (cost=2287.89..2412.89 rows=1 width=244)
         ->  Sort  (cost=2287.89..2312.89 rows=1 width=244)
               Sort Key: b.ten
               ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244)
(7 rows)

regression=# explain select exists(select * from tenk1 a join tenk1 b on a.unique1 = b.ten);
                                       QUERY PLAN
------------------------------------------------------------------------------------------
 Result  (cost=4822.00..4822.01 rows=1 width=0)
   InitPlan
     ->  Nested Loop  (cost=0.00..4822.00 rows=1 width=488)
           ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244)
           ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..0.42 rows=1 width=244)
                 Index Cond: (a.unique1 = b.ten)
(6 rows)

The second case is correctly choosing a fast-start plan, but it's reporting the sub-select's cost up to the next plan level as being the full runtime instead of the expected partial runtime. That has no bad effect here, but might in a more complex situation where the estimated subselect cost affected upper join order or some such.
If you were up against such a situation, an explicit LIMIT 1 would probably help:

regression=# explain select exists(select * from tenk1 a join tenk1 b on a.unique1 = b.ten limit 1);
                                          QUERY PLAN
------------------------------------------------------------------------------------------------
 Result  (cost=0.48..0.49 rows=1 width=0)
   InitPlan
     ->  Limit  (cost=0.00..0.48 rows=1 width=488)
           ->  Nested Loop  (cost=0.00..4822.00 rows=1 width=488)
                 ->  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244)
                 ->  Index Scan using tenk1_unique1 on tenk1 a  (cost=0.00..0.42 rows=1 width=244)
                       Index Cond: (a.unique1 = b.ten)
(7 rows)

Same subplan, but a saner cost estimate at the upper level...

			regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
--- "A. Kretschmer" <[EMAIL PROTECTED]> wrote: > You can do this, you need a untrusted language like plperlU or plsh. > Then you can write a TRIGGER and call external programs. This may be a silly question, will plsh work on a windows server? I am pretty sure that plbat doesn't exist :-). Regards, Richard Broersma Jr. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] update command question?
On Sun, Sep 16, 2007 at 08:03:56PM -0700, Charles.Hou wrote:
> i have the log analysis report by using the pgFouine. in the part of
> Slowest queries, the update commands "update com_status set
> travel=620.70001220703 where no=1" have the high Av.duration(s). how
> should i do to solve this problem? why this simple command can take up
> the most time?

Maybe you are missing an index? In any case, you're going to have to provide *much* more information (at the very least an EXPLAIN ANALYZE of that statement) if you want any more detailed answer.

Have a nice day,
-- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
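Concretely, if com_status has no index on "no", every such UPDATE must scan the whole table. A hedged first step (assuming "no" is reasonably selective; the index name is invented):

```sql
-- Let the UPDATE find its one row via an index instead of a seq scan:
CREATE INDEX com_status_no_idx ON com_status (no);

-- Then compare the plan and timing before/after:
EXPLAIN ANALYZE
UPDATE com_status SET travel = 620.70001220703 WHERE no = 1;
```

If the plan still shows a sequential scan, the slowness likely lies elsewhere (table bloat, trigger overhead, lock contention), which is exactly the extra information the reply asks for.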
[GENERAL] tradeoffs for multi-schema or multi-db
Here's the situation... I have 2 different apps that both require a separate schema, or maybe db. There is actually one column in one table of each of these db/schemas that is in common, and a desire to "cross" between them in some cases.

For example, an app for keeping track of the census results and another app that keeps track of criminal cases in the justice system. The "shared" field is of course the citizen/defendant. Two different apps that should remain separate, but at times it would be nice to check the legal status of the defendant by looking at the census data.

Considering the somewhat rare query that will need to bridge these 2 data sources (dbs or schemas), what are the pros/cons of having 2 schemas in the same DB vs 2 DBs? What if the query is to be committed to a PL/pgSQL function/procedure? How awkward is it to bridge schemas vs bridging dbs in that form?

Thanks for any advice/help.

-dave
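For comparison, here is a sketch of what the rare "bridging" query looks like under each design (the schema, table, and column names are invented for illustration):

```sql
-- Two schemas in one database: the cross-app join is just schema-qualified,
-- and works identically inside a PL/pgSQL function.
SELECT c.full_name, j.case_no
FROM census.citizen c
JOIN justice.defendant j ON j.citizen_id = c.citizen_id;

-- Two separate databases: PostgreSQL has no cross-database queries, so you
-- need something like contrib/dblink, with an explicit connection string
-- and column-type list for the remote result:
SELECT c.full_name, j.case_no
FROM census.citizen c
JOIN dblink('dbname=justice',
            'SELECT citizen_id, case_no FROM defendant')
       AS j(citizen_id bigint, case_no integer)
     ON j.citizen_id = c.citizen_id;
```

The first form is plainly less awkward, in ad-hoc SQL and in functions alike, which is the usual argument for two schemas in one database when any cross-querying is anticipated.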
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
On Sep 17, 2007, at 9:50 AM, Bima Djaloeis wrote: Hi there, I am new to PostgreSQL, is it possible to create something so that 1) If I insert / update / delete an item from my DB... 2) ... an awk / shell / external program is executed in my UNIX System? If yes, how do I do this and if no, thanks for telling. Yes it's possible, but it's probably a really bad idea, so I'm not going to tell you how. Instead, use a trigger to store a message in a queue table, then have an external persistent process poll the queue table (or use listen/notify to sleep until new messages are added to the queue). Cheers, Steve ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
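Steve's queue-table approach could look roughly like this (the table, trigger, and channel names are made up for illustration; the external worker is a separate process that runs LISTEN and drains the queue):

```sql
-- Queue table: the trigger only records that something happened.
CREATE TABLE event_queue (
    id        serial PRIMARY KEY,
    tbl       text,
    op        text,
    queued_at timestamp DEFAULT now()
);

CREATE OR REPLACE FUNCTION enqueue_event() RETURNS trigger AS $$
BEGIN
    INSERT INTO event_queue (tbl, op) VALUES (TG_RELNAME, TG_OP);
    NOTIFY event_queue;   -- wake any worker sleeping in LISTEN
    RETURN NULL;          -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Attach to the table of interest (hypothetical table "items"):
CREATE TRIGGER items_enqueue AFTER INSERT OR UPDATE OR DELETE ON items
    FOR EACH ROW EXECUTE PROCEDURE enqueue_event();
```

The external daemon issues `LISTEN event_queue;`, sleeps until notified, then processes and deletes the queued rows. The actual shell/awk call happens entirely outside the database, so a slow or failing script can never stall or abort the transaction that fired the trigger.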
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
On Sep 17, 2007, at 11:50 AM, Bima Djaloeis wrote: Hi there, I am new to PostgreSQL, is it possible to create something so that 1) If I insert / update / delete an item from my DB... 2) ... an awk / shell / external program is executed in my UNIX System? If yes, how do I do this and if no, thanks for telling. Thanks for reading, any help is appreciated. You could use a trigger function in an untrusted procedural language such as plperlu or plpythonu to do that. Erik Jones Software Developer | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
On Mon, 17.09.2007 at 18:50:46 +0200, Bima Djaloeis wrote: > Hi there, > > I am new to PostgreSQL, is it possible to create something so that > > 1) If I insert / update / delete an item from my DB... > 2) ... an awk / shell / external program is executed in my UNIX System? > > If yes, how do I do this and if no, thanks for telling. You can do this, you need an untrusted language like plperlU or plsh. Then you can write a TRIGGER and call external programs. Hope that helps, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
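A rough sketch of what Andreas describes, assuming plperlu is installed in the database (`createlang plperlu yourdb`); only a superuser may create plperlu functions, the script path is hypothetical, and the trigger table is made up:

```sql
-- Untrusted PL/Perl runs with no sandbox: the body executes as the
-- postgres OS user, so keep the external command harmless and fast.
CREATE OR REPLACE FUNCTION run_external() RETURNS trigger AS $$
    system('/usr/local/bin/on_change.sh');   # hypothetical script
    return;   # undef means: proceed with the row unmodified
$$ LANGUAGE plperlu;

CREATE TRIGGER items_run_external AFTER INSERT ON items
    FOR EACH ROW EXECUTE PROCEDURE run_external();
```

Note the trade-off versus the queue-table approach: here the external program runs synchronously inside the transaction, so a hung script blocks the INSERT.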
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
On 9/17/07, Bima Djaloeis <[EMAIL PROTECTED]> wrote: > Hi there, > > I am new to PostgreSQL, is it possible to create something so that > > 1) If I insert / update / delete an item from my DB... > 2) ... an awk / shell / external program is executed in my UNIX System? > > If yes, how do I do this and if no, thanks for telling. Yes. you have to use an untrusted pl language, like pl/perlu or pl/tclu and you have to be a superuser to create user defined functions in those languages. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] ON INSERT => execute AWK/SH/EXE?
On 9/17/07, Bima Djaloeis <[EMAIL PROTECTED]> wrote: > Thanks for reading, any help is appreciated. Triggers + Untrusted PL/Perl, see: 1) http://www.postgresql.org/docs/8.2/static/plperl-triggers.html 2) http://www.postgresql.org/docs/8.2/static/plperl-trusted.html ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Alter sequence restart with selected value...
Scott Marlowe wrote: On 9/18/07, Jeff Ross <[EMAIL PROTECTED]> wrote: I'm using copy to insert a bunch of rows into a new table with a unique primary key. Copy is correctly incrementing the primary key, but apparently the sequence itself is never updated because when I go to insert again I get a constraint violation. Try using setval. select setval('seqname', (select max(id) from tablename)); Oh, I knew it was going to be something simple! Thank you, Scott. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
--- Phoenix Kiula <[EMAIL PROTECTED]> wrote: > What constitutes a "small fill factor"? Would 70 be good? I guess my > current must have been the default, which the manual says is 100. On the following link: http://www.postgresql.org/docs/8.2/interactive/sql-createindex.html#SQL-CREATEINDEX-STORAGE-PARAMETERS I found this: "B-trees use a default fillfactor of 90, but any value from 10 to 100 can be selected." Regards, Richard Broersma Jr. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
index fillfactor (was Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER)
In response to "Phoenix Kiula" <[EMAIL PROTECTED]>: > Thanks for a very informative post! One question: > > > I'm not sure how to find the current value, but a smaller fill factor > > on busy tables should lead to less fragmentation, thus more efficient > > indexes over time. Keep in mind that a smaller fill factor will also > > lead to larger indexes initially. > > What constitutes a "small fill factor"? Would 70 be good? Unfortunately, I can't say. I have not yet had the opportunity to experiment with different fillfactors, so I can only speak in vague estimations on this topic. > I guess my > current must have been the default, which the manual says is 100. I expect it's at the default, but the docs say that is 90%: http://www.postgresql.org/docs/8.2/static/sql-createindex.html Where did you see 100? > Or > did you mean really small fill factor like 20? In this context, what > is "packing" in the manual -- is that some kind of compression? Hopefully, someone more knowledgeable will chime in with some wise suggestions. Barring that, I can only suggest you experiment to find what works for your workload, but don't rule out the possibility that extremely low fillfactor values might work well for you. -- Bill Moran http://www.potentialtech.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Optimizing "exists"
Does the planner automatically add "limit 1" to "exists" sub-selects? In other words, take an update like: update foo set itexists = exists (select 1 from bar where bar.something = foo.something); If the sub-select returns a large result set, will there be any benefit to adding "limit 1" to the sub-select or does the query planner automatically deduce that "limit 1" is the correct interpretation? Cheers, Steve ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Database reverse engineering
On 13 Sep, 06:12, [EMAIL PROTECTED] (Ow Mun Heng) wrote: > On Mon, 2007-09-10 at 13:00 -0600, RC Gobeille wrote: > > Or this one: > >http://schemaspy.sourceforge.net/ > > Can't seem to get it to connect to PG using the example. > > java -jar schemaSpy_3.1.1.jar -t pgsql -u operator -p operator -o > test_db -host localhost -db test_db As the end of the error message suggests, you might want to try something like this: java -jar schemaSpy_3.1.1.jar -cp postgresql-xxx.jar -t pgsql -u operator -p operator -o test_db -host localhost -db test_db The argument for -cp needs to be the full path to the .jar file, and must obviously use the real filename of that file itself. You may then get complaints about not finding the schema: I had to specify "-s public", I think. Paul ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Thanks for a very informative post! One question: > I'm not sure how to find the current value, but a smaller fill factor > on busy tables should lead to less fragmentation, thus more efficient > indexes over time. Keep in mind that a smaller fill factor will also > lead to larger indexes initially. What constitutes a "small fill factor"? Would 70 be good? I guess my current must have been the default, which the manual says is 100. Or did you mean really small fill factor like 20? In this context, what is "packing" in the manual -- is that some kind of compression? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] help w/ SRF function
Hi, I want to use a SRF to return multiple rows. The current SRF is pretty static. create type foo as ( id smallint, data1 int, data2 int ) CREATE OR REPLACE FUNCTION foo_func() RETURNS SETOF foo AS $BODY$ SELECT TRH.ID, TRH.data1, TRH.data2 FROM D INNER JOIN TS ON TS.id = D.id inner join TRH on ts.id = trh.id WHERE D.start_timestamp BETWEEN '8/1/2007' AND '9/8/2007' And D.code IN ('ID_123') $BODY$ LANGUAGE 'sql' IMMUTABLE STRICT; I would like for the above to be a little bit more dynamic in that the start_timestamp and the code can be input fields. eg: CREATE OR REPLACE FUNCTION foo_func(fromdate timestamp, todate timestamp, code text) RETURNS SETOF foo AS $BODY$ SELECT TRH.ID, TRH.data1, TRH.data2 FROM D INNER JOIN TS ON TS.id = D.id inner join TRH on ts.id = trh.id WHERE D.start_timestamp BETWEEN fromdate AND todate And D.code IN (code) $BODY$ LANGUAGE 'sql' IMMUTABLE STRICT; How can I go about this? The above will fail due to missing columns fromdate/todate/code. Or should I use plpgsql as SQL cannot handle variable substitution? What about doing dynamic SQL eg: Dsql = select X,Y,Z from foo, join bar on bar.a = foo.a where D.start_timestamp between ' || fromdate ||' and ' || todate||' execute DSQL Thanks for any/all help. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
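As the replies in this thread point out, an SQL-language function body cannot see its arguments by name; they must be referenced positionally as $1, $2, $3. A sketch of the parameterized version (note also that a function which reads tables should be declared STABLE rather than IMMUTABLE, and `IN (code)` with a single scalar parameter is simply `= $3`):

```sql
CREATE OR REPLACE FUNCTION foo_func(fromdate timestamp,
                                    todate   timestamp,
                                    code     text)
RETURNS SETOF foo AS
$BODY$
SELECT
    TRH.ID,
    TRH.data1,
    TRH.data2
FROM D
INNER JOIN TS  ON TS.id = D.id
INNER JOIN TRH ON TS.id = TRH.id
WHERE D.start_timestamp BETWEEN $1 AND $2
  AND D.code = $3
$BODY$
LANGUAGE sql STABLE STRICT;

-- Usage:
-- SELECT * FROM foo_func('2007-08-01', '2007-09-08', 'ID_123');
```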
Re: [GENERAL] Alter sequence restart with selected value...
On 9/18/07, Jeff Ross <[EMAIL PROTECTED]> wrote: > I'm using copy to insert a bunch of rows into a new table with a unique > primary key. Copy is correctly incrementing the primary key, but > apparently the sequence itself is never updated because when I go to > insert again I get a constraint violation. Try using setval. select setval('seqname', (select max(id) from tablename)); ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] update command question?
I have a log analysis report generated by pgFouine. In the Slowest queries section, the update command "update com_status set travel=620.70001220703 where no=1" has a high Av.duration(s). What should I do to solve this problem? Why can this simple command take up the most time? Av.duration(s): 21.06 Times executed: 9 total duration: 3m9s ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] NOT NULL Issue
Tom Lane wrote: "Gustav Lindenberg" <[EMAIL PROTECTED]> writes: Why is '' not considered null in postgres (8.1.3) Because they're different. The SQL spec says that an empty string is different from NULL, and so does every database in the world except Oracle. Oracle, however, does not define the standard. If people would think of it in terms of an address it might make more sense to them. An empty string has an address; so do a string, an integer, and so on. When you think of NULL, think of it in the context of a NULL address. It's not addressable, it's nothing, it's not set, it's not there. I know it's not 100% accurate, but I think it helps folks understand the concept. -- Until later, Geoffrey Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. - Benjamin Franklin ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] ON INSERT => execute AWK/SH/EXE?
Hi there, I am new to PostgreSQL, is it possible to create something so that 1) If I insert / update / delete an item from my DB... 2) ... an awk / shell / external program is executed in my UNIX System? If yes, how do I do this and if no, thanks for telling. Thanks for reading, any help is appreciated.
Re: [GENERAL] keeping 3 tables in sync w/ each other
> Can I use inheritance? References? inheritance ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Alter sequence restart with selected value...
I'm using copy to insert a bunch of rows into a new table with a unique primary key. Copy is correctly incrementing the primary key, but apparently the sequence itself is never updated because when I go to insert again I get a constraint violation. Here's the start of the new table creation: CREATE TABLE training_programs ( trg_prg_id integer NOT NULL primary key DEFAULT nextval('training_programs_trg_prg_id_seq'), I then use copy (select about half the columns in the original table) to '/tmp/training_programs.txt' and then copy training_programs from '/tmp/training_programs.txt' After this I get the following: wykids=# select max(trg_prg_id) from training_programs; max -- 4893 wykids=# select nextval('training_programs_trg_prg_id_seq'); nextval - 1 (1 row) Since I'm doing this against a copy of a live database in preparation for running it against the real thing, I never know how many records will be in training_programs. I'm trying, then, to do something like this: alter sequence training_programs_trg_prg_id_seq restart with (select (max(trg_prg_id) + 1) from training_programs); but that isn't working. Thanks in advance for any help! Jeff Ross ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] read-only queries on PITRslaves, any progress?
On 9/18/07, Merlin Moncure <[EMAIL PROTECTED]> wrote: > Florian has been hard at work on getting the lazy xid assignment patch > in for 8.3, which is now complete. AIUI, this is an important step > towards read only pitr slave (with other advantages too), and shows > that things are moving along. I don't have a crystal ball, but I'm > guessing you will start to hear more about this when the 8.3 beta is > wrapped. I was hoping for a tighter timeframe, but this is good to know, thanks. Alexander. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Inline Function documentation
Hi all, I have come across many postings that talk about INLINE FUNCTIONs and their utility in helping the optimizer decide the right plan. However, other than a few postings in different forums I haven't been able to get my hands on any sort of documentation that explains which functions can be converted to an INLINE function. I understand it has to be a simple function, and inline_function in backend/optimizer/util/clauses.c gave me some more pointers, but I was wondering if there is more comprehensive documentation available anywhere. Thanks, Brijesh
Re: [GENERAL] Tsearch2 - spanish
prueba=# select to_tsvector('espanol','melón'); ERROR: Affix parse error at 506 line and prueba=# select lexize('sp','melón'); lexize - {melon} (1 row) Looks very strange, can you provide a list of dictionaries and the configuration map? I tried many dictionaries with the same results. Also I changed the codeset of the .aff and .dict files (from latin1 to utf8 and from utf8 to iso88591) and got the same error. Where can I investigate to resolve this problem? My dictionary at line 506 had: Where did you take this file from? And what is the encoding/locale setting of your db? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] stability issues
On 9/18/07, Willy-Bas Loos <[EMAIL PROTECTED]> wrote: > Hi, > > I'm running a PostgreSQL 8.1.9 server on Debian etch (default installation, > w/ PostGIS from Debian repositories). > The machine has double Xeon 5130, 4 GB of DDR2 ECC Reg. RAM and a > two-sata-disc Raid 0 cluster. I don't know the brand of the memory nor the > HDDs. The RAID controller is a 3Ware 8006-2. > > Lately i've been confronted by some failures in several PostgreSQL clusters. > 3 weeks ago a cluster crashed when i was reindexing it, beyond my repair. > Last week we've been getting Toast warnings and shared index problems on > another cluster. I solved the shared index issue, but the Toast trouble > remained, in more than one database on that cluster. I dumped the databases, > created another cluster and shut the troubled one down. (problems were > solved). > I've asked our provider to do a filesystem check, (e2fsck -f /dev/sda1) > which indicated that "there's nothing wrong with the filesystem". Too little information, really. This system needs to have serious diagnostics run on it to find out what the problem is. > This doesn't feel right. > Is there anything else i should check? Yes. If you can't get them to run real tests with memtest86 etc, then try running very large compiles, like the linux kernel with -j4 or -j8, and watch for sig 11s while doing it. Set it up to run the compiles in a loop, looking for errors while compiling. Better yet, find a hosting provider who knows what they're doing. Your data / uptime are worth it. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] stability issues
"Willy-Bas Loos" <[EMAIL PROTECTED]> writes: > I'm running a PostgreSQL 8.1.9 server on Debian etch (default installation, > w/ PostGIS from Debian repositories). > The machine has double Xeon 5130, 4 GB of DDR2 ECC Reg. RAM and a > two-sata-disc Raid 0 cluster. I don't know the brand of the memory nor the > HDDs. The Riad controller is a 3Ware 8006-2. > Lately i've been confronted by some failures in several PostgreSQL clusters. > 3 weeks ago a cluster crashed when i was reindexing it, beond my repair. > Last week we've been getting Toast warnings and shared index problems on > another cluster. I solved the shared index issue, but the Toast trouble > remained, in more than one database on that cluster. I dumped the databases, > created another cluster and shut the troubled one down. (problems were > solved). > I've asked our provider to do a filesystem check, (e2fsck -f /dev/sda1) > which indicated that "there's nothing wrong with the filesystem". > This doesn't feel right. No, it sure doesn't. It sounds to me like the hardware is getting flaky. Memory tests might be the first thing to run. There's also the old clean-and-reseat-all-the-boards-and-connectors exercise... regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Q:Aggregrating Weekly Production Data. How do you do it?
On 9/17/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote: > Just wondering how everyone is doing aggregation of production data. Where I work, we aggregate by the minute in the application, then dump those stats entries into the database en masse from each machine. So, we wind up with rows something like this (simplified):

id | parentid | host | timestamp | request | total | success | fail1 | fail2 | fail3 | totalresptime | maxresptime
1 | NULL | jboss01 | 2007-09-17 12:02:03 | authcc | 120 | 112 | 4 | 2 | 2 | 48 | 12039
2 | 1 | jboss01 | 2007-09-17 12:02:03 | ccconnect | 120 | 118 | 0 | 0 | 2 | 423900 | 10394

where the id comes from a sequence, and the parent id ties our children stats to their parents. I.e. in this example authcc called ccconnect, and so on. The fail1/2/3 are types of failures, the totalresptime is the time of all the successful requests added together, so that totalresptime/success = average for that minute, and max is the longest running request ending in that minute. Then we can aggregate those minutes together, monitor individual machine performance, etc. Ours is actually more complex than this, but you get the idea. We have a cron job that checks the statistics every x minutes for high failure rates and have it generate an alert email if any of our requests go over a preset threshold. This catches problems long before anything shows up interesting in the logs most of the time. > (getting min/max/count isn't much of an issue. Stdev is the main issue I > believe) > > One such instance I've read about is.. Isn't stddev() working for you? What I do is aggregate the minute time slices by grouping by date_trunc('xxx',timestamp) and then use that as a subselect to a query that does the stddev() outside of that. Works pretty well for us, and makes it easy to identify trends.
One of the things we started doing is to aggregate the last six weeks data by the day / hour and then comparing the last 24 hours worth of data to those six weeks worth to see trends for each hour of each day. The queries are huge and ugly, but they work, and run in about 4 minutes on a fairly good sized chunk of data. We have about 150k to 300k entries a day put into this db. ---(end of broadcast)--- TIP 6: explain analyze is your friend
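The pattern Scott describes (minute slices aggregated in a subselect, stddev computed outside) might look something like this; the `minute_stats` table and its columns are hypothetical, modeled on the sample rows above:

```sql
-- Per-hour average and spread of the per-minute average response time.
SELECT hour,
       avg(per_min_avg)    AS avg_resp,
       stddev(per_min_avg) AS stddev_resp
FROM (
    SELECT date_trunc('hour', "timestamp") AS hour,
           totalresptime::numeric / nullif(success, 0) AS per_min_avg
    FROM minute_stats
    WHERE "timestamp" >= now() - interval '6 weeks'
) AS slices
GROUP BY hour
ORDER BY hour;
```

The `nullif(success, 0)` guard keeps minutes with zero successes from dividing by zero; they simply drop out of the aggregates as NULLs.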
Re: [GENERAL] read-only queries on PITRslaves, any progress?
On 9/18/07, Alexander Staubo <[EMAIL PROTECTED]> wrote: > On 9/15/07, Marinos Yannikos <[EMAIL PROTECTED]> wrote: > > this might not be the best place to ask, but has there been any further > > progress or an alternative project to Florian Pflug's SoC 2007 proposal > > about enabling PITR slaves to serve read-only queries? It seems like an > > elegant way to load-balance a PostgreSQL database with very little overhead. > > I too would love to hear about this. Florian has been hard at work on getting the lazy xid assignment patch in for 8.3, which is now complete. AIUI, this is an important step towards read only pitr slave (with other advantages too), and shows that things are moving along. I don't have a crystal ball, but I'm guessing you will start to hear more about this when the 8.3 beta is wrapped. merlin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
In response to "Phoenix Kiula" <[EMAIL PROTECTED]>: > > If you find that reindexing improves performance, then you should > > investigate further. Depending on the exact nature of the problem, > > there are many possible solutions, three that come to mind: > > * Add RAM/SHM > > Can I add SHM with merely by managing the entry in sysctl.conf? My > current values: > > kernel.shmmax = 536870912 > kernel.shmall = 536870912 These values define the max allowed. They exist to keep poorly written applications from sucking up all the available memory. Setting them higher than is needed does not cause any problems, unless a greedy or poorly-written application grabs all that memory. > My "shared_buffers" in postgresql.conf is "2". From the website > http://www.desknow.com/kb/idx/12/061/article/ I notice that shmmax > should be sharedbuffer*8192, so I suppose my shmmax can be much lower > than the above, but I raised it for performance. Am I wrong to do so? It's completely impossible to tell without knowing more about your physical hardware. The rule of thumb is 1/3 physical RAM to start, then adjust if more or less seems to help. That advice is for versions of PG >= 8. If you're still running a 7.X version, upgrade. How much RAM does this system have in it? Unless you have other applications running on this system using RAM, you should allocate more of it to shared_buffers. If 160M is 1/3 your RAM, you probably need to add more RAM. How big is your database? If it's possible to fit it all in shared_buffers, that will give you the best performance. > > * REINDEX on a regular schedule > > This is sadly not really feasible, because we need to offer a 100% > availability website. REINDEX does not work concurrently so it is not > really an option for us. My max_fsm_pages and max_fsm_relations are > way above the numbers that come up after the VACUUM ANALYZE VERBOSE > run. 
Hence my comment about "depending on your workload" and "investigating the situation" to determine the best solution. > But still, the autovacuum stuff seems like it is not working at all. > Some related entries in the conf file: > > autovacuum = on > autovacuum_vacuum_cost_delay = 20 > vacuum_cost_delay = 20 > autovacuum_naptime = 30 > stats_start_collector = on > stats_row_level = on > autovacuum_vacuum_threshold = 80 > autovacuum_analyze_threshold = 80 > > And yet, the db often slows down, at which point I manually login and > run a manual VACUUM ANALYZE and it seems fine for some more time. > Sometimes, I also restart pgsql and that seems to help for a while. You don't mention *_scale_factor settings. Those are going to be important as well. Based on your symptoms, it sounds like autovacuum is not getting those tables vacuumed enough. I recommend raising the debug level and watching the logs to see if autovacuum is actually getting tables vacuumed. Consider lowering your *_scale_factor values if not. Or even reducing autovacuum_naptime. > Another advice on these forums is to see "vmstat 1", without actually > specifying how to draw inferences from it. The "free" version of it is > coming up at decent rates, as follows:

procs ---memory-- ---swap-- -io --system-- cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 29124 110760 108980 346773601 206 1400 4 2 1 85 12
0 0 29124 110632 108980 346773600 0 0 1052 108 0 0 100 0
2 0 29124 108840 108980 346773600 0 0 1112 299 1 1 98 0
1 0 29124 109288 108980 346773600 0 0 1073 319 2 1 98 0

Explaining how to interpret the output of this command and determine what to do with it is not something easily done in a short paragraph. However, it looks like you've got a lot of RAM being used for the disk cache. That memory would probably be better used as shared_buffers, so I suggest you increase that value considerably.
> > * (with newer version) reduce the fill factor and REINDEX > > I think some of my tables are updated very frequently so a smaller > fill factor will be nice. How can I find the current fill factor on my > tables? Also, is there some method or science to calculating a decent > fill factor -- size of table, number of indexes, frequency of updates, > and such? We have one major table which faces a lot of INSERTs and > UPDATES in a day (up to 10,000) but many many more SELECTs (up to 10 > million). I'm not sure how to find the current value, but a smaller fill factor on busy tables should lead to less fragmentation, thus more efficient indexes over time. Keep in mind that a smaller fill factor will also lead to larger indexes initially. -- Bill Moran http://www.potentialtech.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
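For 8.2, a sketch of setting, inspecting, and changing an index fillfactor (the table and index names are hypothetical); `pg_class.reloptions` shows the current value, which answers the "how to find it" question in this thread:

```sql
-- Set fillfactor at creation time (8.2+ syntax):
CREATE INDEX foo_busy_idx ON foo (busy_col) WITH (fillfactor = 70);

-- Inspect the current setting; NULL reloptions means the default (90
-- for B-trees) is in effect:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'foo_busy_idx';

-- Change it later: the new value takes effect on the next rebuild.
ALTER INDEX foo_busy_idx SET (fillfactor = 70);
REINDEX INDEX foo_busy_idx;
```

Note that REINDEX takes locks that block writes, so on a 100%-availability system this still has to be scheduled carefully.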
Re: [GENERAL] Q:Aggregrating Weekly Production Data. How do you do it?
Ow Mun Heng wrote: The results are valid (verified with actual data) but I don't understand the logic. All the statistics books I've read define the sample stddev as sqrt( sum((x - avg(x))^2) / (n - 1) ). The formula is very different, hence the confusion. A formula is not an algorithm. In particular, the naive way of calculating variance or standard deviation has massive numerical instability problems - anything involving sums of squares does. There are a variety of alternate algorithms for stddev/variance; I presume your other algorithm is similarly trying to avoid these same issues (but I have not looked closely at it). You can also see Wikipedia for one of the most well known, due to Knuth/Welford: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance - John D. Burger MITRE ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
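The instability John mentions is easy to see in SQL terms: the one-pass shortcut sqrt((sum(x^2) - sum(x)^2/n)/(n-1)) is algebraically equal to the textbook formula but subtracts two huge, nearly equal numbers, losing precision. A two-pass version (sketched below against a hypothetical data(x) table) computes the mean first and stays stable:

```sql
-- Two-pass sample standard deviation: first pass computes the mean,
-- second pass sums the squared deviations from it.
SELECT sqrt( sum((x - m)^2) / (count(*) - 1) ) AS sample_stddev
FROM data,
     (SELECT avg(x) AS m FROM data) AS mean;
```

In practice PostgreSQL's built-in stddev() aggregate handles this for you; the sketch is only to illustrate why "the formula" and "the algorithm" can legitimately differ.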
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Hi, > Now, I can merrily increase the shared_buffers, but the manual warns > me against increasing the value too much because it is "per > transaction" value. Shared_buffers is not per-transaction. Where did you find this information? - Sander ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] read-only queries on PITRslaves, any progress?
On 9/15/07, Marinos Yannikos <[EMAIL PROTECTED]> wrote: > this might not be the best place to ask, but has there been any further > progress or an alternative project to Florian Pflug's SoC 2007 proposal > about enabling PITR slaves to serve read-only queries? It seems like an > elegant way to load-balance a PostgreSQL database with very little overhead. I too would love to hear about this. Alexander. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
On 18/09/2007, Sander Steffann <[EMAIL PROTECTED]> wrote: > Hi, > > > Can I add SHM with merely by managing the entry in sysctl.conf? My > > current values: > > > > kernel.shmmax = 536870912 > > kernel.shmall = 536870912 > > > > My "shared_buffers" in postgresql.conf is "2". From the website > > http://www.desknow.com/kb/idx/12/061/article/ I notice that shmmax > > should be sharedbuffer*8192, so I suppose my shmmax can be much lower > > than the above, but I raised it for performance. Am I wrong to do so? > > You need to configure the kernel so it allows processes to use more shared > memory. This does not mean that a process automatically uses it. For > PostgreSQL you will need to increase shared_buffers to make it use the extra > available shared memory. With your shared memory settings you can probably > increase shared_buffers to about 65000. > Thanks, the ipcs command shows me this:

------ Shared Memory Segments --------
key        shmid    owner     perms  bytes      nattch  status
0x0052e2c1 6782976  postgres  600    176668672  2

Now, I can merrily increase the shared_buffers, but the manual warns me against increasing the value too much because it is a "per transaction" value. So here's the conflict for a novice like me: 1. Do not increase shared_buffers too much because it is per-transaction. 2. Do increase the SHM for performance, but it is only useful if you also increase shared_buffers. So which is it? Would it help to increase the effective_cache_size? It is currently at "512000". I have 4GB ram on the machine, but am willing to devote about 2GB to pgsql. Thanks! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
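As an illustration only (the right numbers depend entirely on the workload, and these values are assumptions, not recommendations): a 4 GB machine devoting roughly a quarter of its RAM to the shared buffer cache on 8.1, where memory GUCs are counted in 8 kB pages, might use something like the fragment below. Note shared_buffers is allocated once at server start, not per transaction, and kernel.shmmax must be at least as large as the resulting segment.

```
# postgresql.conf -- illustrative values for a 4 GB machine (8.1: 8 kB pages)
shared_buffers = 131072         # 131072 * 8 kB = 1 GB, allocated once at startup
effective_cache_size = 262144   # ~2 GB; a planner hint only, allocates nothing

# /etc/sysctl.conf -- the shared memory segment must fit under shmmax
kernel.shmmax = 1342177280      # ~1.25 GB, comfortably above the segment size
```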
Re: [GENERAL] Q:Aggregrating Weekly Production Data. How do you do it?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 09/17/07 23:34, Ow Mun Heng wrote: > Just wondering how everyone is doing aggregation of production data. > > Data which runs in the vicinity of a few million a week. > > What are the methods which will effectively provide the > min/max/average/count/stdev of the weekly sample size based on different > products/software mix etc. > > and still be able to answer correctly, what's the average of data_1 over > the past 2 months? That's the purpose of data warehouses and ETL, isn't it? - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFG78prS9HxQb37XmcRAhhBAKCGoBYox6azDqxQpEbvMo/Zya8cAACgy5Y6 XtrDC35IE0TOcD29Iziorfs= =XCZw -END PGP SIGNATURE- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Recall: August Monthly techdata split fi
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 09/18/07 05:54, Ow Mun Heng wrote: > On Tue, 2007-09-18 at 06:52 -0400, Geoffrey wrote: >> [EMAIL PROTECTED] wrote: >>> Robert Wickert would like to recall the message, "August Monthly >>> techdata split file printers for France and Denmark ". >> In my experience, attempting to 'recall' an email message is a fruitless >> endeavor. Seems to me that this is a 'Microsoft' creation. I really >> don't understand the purpose, because by the time you consider >> 'recalling' the email message, it's already sitting in 1000s of inboxes... > > It works for people that has the message _still_ in their exchange > boxes. For those that don't, then... all is lost unfortunately. Actually, it even works if you fetch mails from the Exchange server to your local box. But, as another poster mentioned, you still have to be attached to an Exchange domain for it to work. - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFG78l2S9HxQb37XmcRAjmlAJ9pHW2sDN/c2y6Rng+mzv3te+h2LgCdFZqw QYs2/bEoaWFlMW0+priEzTs= =i/MW -END PGP SIGNATURE- ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Hi, > Can I add SHM merely by managing the entry in sysctl.conf? My > current values: > > kernel.shmmax = 536870912 > kernel.shmall = 536870912 > > My "shared_buffers" in postgresql.conf is "2". From the website > http://www.desknow.com/kb/idx/12/061/article/ I notice that shmmax > should be shared_buffers*8192, so I suppose my shmmax can be much lower > than the above, but I raised it for performance. Am I wrong to do so? You need to configure the kernel so it allows processes to use more shared memory. This does not mean that a process automatically uses it. For PostgreSQL you will need to increase shared_buffers to make it use the extra available shared memory. With your shared memory settings you can probably increase shared_buffers to about 65000. With the 'ipcs' command you can see how much shared memory PostgreSQL uses. Look under 'Shared Memory Segments' for memory owned by the postgres user. - Sander
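The arithmetic behind the "about 65000" figure, as a sketch: 8192 bytes is the default PostgreSQL block size, and shared_buffers is measured in blocks in this era. Actual shared memory use also includes the FSM, lock tables, etc., so leave some headroom below shmmax:

```sql
-- Upper bound on shared_buffers that fits in a 512 MB shmmax:
SELECT 536870912 / 8192 AS shared_buffers_upper_bound;  -- 65536 blocks
```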
[GENERAL] New PostgreSQL RPM sets are available for Fedora / RHEL
Hi, The PostgreSQL New RPM Sets 2007-09-17 Versions: 8.2.5, 8.1.10, 8.0.14, 7.4.18, 7.3.20 Set labels: 8.2.5-1PGDG, 8.1.10-1PGDG, 8.0.14-1PGDG, 7.4.18-1PGDG, 7.3.20-1PGDG - Release Info: The PostgreSQL RPM Building Project has released RPMs for 7.3.20, 7.4.18, 8.0.14, 8.1.10 and 8.2.5, and they are available on the main FTP site and its mirrors. We currently have RPMs for: - Fedora 7 - Fedora 7-x86_64 - Red Hat Enterprise Linux Enterprise Server 3.0-x86_64 - Red Hat Enterprise Linux Enterprise Server 4 - Red Hat Enterprise Linux Enterprise Server 4-x86_64 - Red Hat Enterprise Linux Advanced Server 4 - Red Hat Enterprise Linux Advanced Server 4-x86_64 - Red Hat Enterprise Linux 5 - Red Hat Enterprise Linux 5 x86_64 More may (will) come later. I want to thank every package builder for this great number of supported platforms. Support for Fedora Core 6 will be abandoned in future releases, when it reaches EOL. Please let us know if you can assist us in building RPMs for the missing Red Hat / Fedora platforms. For a complete list of changes in the RPM sets, please refer to the changelogs in the RPMs. Use rpm -q --changelog package_name to query the changelog. Point releases generally do not require a dump/reload from the previous point, but please see the Release Notes to confirm procedures for upgrading, especially if your current version is older than the last point release. The SRPMs are also provided. Please note that we have one SRPM for all platforms. We also have a howto document about RPM installation of PostgreSQL: http://pgfoundry.org/docman/?group_id=148 Please follow the instructions before installing/upgrading. Most of the RPMs have been signed by the builder, and each directory contains a CURRENT_MAINTAINER file which includes the name/email of the package builder and a link to their PGP key. 
If you experience problems with the RPMs or if you have feature requests, please join pgsqlrpms-general ( at ) pgfoundry ( dot ) org. More info about the list is found at: http://lists.pgfoundry.org/mailman/listinfo/pgsqlrpms-general The project page is: http://pgfoundry.org/projects/pgsqlrpms Please do not use these resources for issues with running or using PostgreSQL once it is installed. Please download these files from: http://www.postgresql.org/ftp/binary/v8.2.5/linux/ http://www.postgresql.org/ftp/binary/v8.1.10/linux/ http://www.postgresql.org/ftp/binary/v8.0.14/linux/ http://www.postgresql.org/ftp/binary/v7.4.18/linux/ http://www.postgresql.org/ftp/binary/v7.3.20/linux/ Regards, -- Devrim GÜNDÜZ PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
Re: [GENERAL] keeping 3 tables in sync w/ each other
Ow Mun Heng wrote: Hi, I have 3 tables foo foo_loading_source1 foo_loading_source2 which is something like create table foo (a int, b int, c int) create table foo_loading_source1 (a int, b int, c int) create table foo_loading_source2 (a int, b int, c int) Is there an easier way to keep these 3 tables' DDL in sync? The loading_sourceX tables are just temporary-in-transit tables for data \copy'ied into the DB before being inserted into the main foo table. Since these are temporary tables, why don't you just create them on the fly as temporary tables? CREATE TEMPORARY TABLE foo_loading_source1 (LIKE foo); CREATE TEMPORARY TABLE foo_loading_source2 (LIKE foo); Then do your loading process. Then you don't really have to worry about maintaining the loading tables at all.
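A sketch of what one load cycle could look like with that approach ('source1.dat' is a placeholder path; ON COMMIT DROP makes the staging table clean itself up when the transaction ends):

```sql
BEGIN;
CREATE TEMPORARY TABLE foo_loading_source1 (LIKE foo) ON COMMIT DROP;
\copy foo_loading_source1 from 'source1.dat'
INSERT INTO foo SELECT * FROM foo_loading_source1;
COMMIT;
```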
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
> Unfortunately, folks like Phoenix are looking for yes/no answers, and > with many of these questions, the _correct_ answer is "it depends on > your workload" I merely wanted to simplify the advice dispensed on this list, which often sounds conflicting to novice ears like mine. So I appreciate your notes very much. > If you find that reindexing improves performance, then you should > investigate further. Depending on the exact nature of the problem, > there are many possible solutions, three that come to mind: > * Add RAM/SHM Can I add SHM merely by managing the entry in sysctl.conf? My current values: kernel.shmmax = 536870912 kernel.shmall = 536870912 My "shared_buffers" in postgresql.conf is "2". From the website http://www.desknow.com/kb/idx/12/061/article/ I notice that shmmax should be shared_buffers*8192, so I suppose my shmmax can be much lower than the above, but I raised it for performance. Am I wrong to do so? > * REINDEX on a regular schedule This is sadly not really feasible, because we need to offer a 100% availability website. REINDEX does not work concurrently so it is not really an option for us. My max_fsm_pages and max_fsm_relations are way above the numbers that come up after the VACUUM ANALYZE VERBOSE run. But still, the autovacuum stuff seems like it is not working at all. Some related entries in the conf file: autovacuum = on autovacuum_vacuum_cost_delay = 20 vacuum_cost_delay = 20 autovacuum_naptime = 30 stats_start_collector = on stats_row_level = on autovacuum_vacuum_threshold = 80 autovacuum_analyze_threshold = 80 And yet, the db often slows down, at which point I manually log in and run a manual VACUUM ANALYZE and it seems fine for some more time. Sometimes, I also restart pgsql and that seems to help for a while. Another piece of advice on these forums is to watch "vmstat 1", though without specifying how to draw inferences from it. 
The "vmstat 1" output is coming up at decent rates, as follows:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa
 1  0  29124 110760 108980 3467736    0    1   206   140    0    4  2  1 85 12
 0  0  29124 110632 108980 3467736    0    0     0     0 1052  108  0  0 100 0
 2  0  29124 108840 108980 3467736    0    0     0     0 1112  299  1  1 98  0
 1  0  29124 109288 108980 3467736    0    0     0     0 1073  319  2  1 98  0

> * (with newer version) reduce the fill factor and REINDEX
I think some of my tables are updated very frequently so a smaller fill factor will be nice. How can I find the current fill factor on my tables? Also, is there some method or science to calculating a decent fill factor -- size of table, number of indexes, frequency of updates, and such? We have one major table which faces a lot of INSERTs and UPDATEs in a day (up to 10,000) but many many more SELECTs (up to 10 million). Thanks.
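On the "how can I find the current fill factor" question: in 8.2 a non-default setting is stored in pg_class.reloptions, and NULL there means the default is in effect (90 for B-tree indexes, 100 for tables). A sketch:

```sql
-- Relations with a non-default fillfactor; a NULL reloptions = default.
SELECT relname, relkind, reloptions
FROM pg_class
WHERE relkind IN ('r', 'i')
  AND reloptions IS NOT NULL;
```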
Re: [GENERAL] statements of an unfinished transaction
On Tue, Sep 18, 2007 at 02:04:32PM +0300, Sabin Coanda wrote: > Hi there, > > I'd like to find the sessions that provide unclosed transactions (i.e. > marked as in transaction). > > Is there any way to find the SQL statements that belong to such a transaction, or > the transaction start time, or any other helpful data ? pg_stat_activity has that, surely... Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate.
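For example, against 8.2's pg_stat_activity (note the view only shows the latest statement per backend, so for the full statement history of a transaction you would need statement logging):

```sql
-- Sessions sitting idle inside an open transaction.
SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY query_start;
```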
[GENERAL] statements of an unfinished transaction
Hi there, I'd like to find the sessions that provide unclosed transactions (i.e. marked as in transaction). Is there any way to find the SQL statements that belong to such a transaction, or the transaction start time, or any other helpful data? TIA, Sabin
Re: [GENERAL] Recall: August Monthly techdata split fi
Ow Mun Heng wrote: On Tue, 2007-09-18 at 06:52 -0400, Geoffrey wrote: [EMAIL PROTECTED] wrote: Robert Wickert would like to recall the message, "August Monthly techdata split file printers for France and Denmark ". In my experience, attempting to 'recall' an email message is a fruitless endeavor. Seems to me that this is a 'Microsoft' creation. I really don't understand the purpose, because by the time you consider 'recalling' the email message, it's already sitting in 1000s of inboxes... It works for people that still have the message in their Exchange boxes. For those that don't, then... all is lost unfortunately. My point exactly. -- Until later, Geoffrey Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. - Benjamin Franklin
Re: [GENERAL] Recall: August Monthly techdata split fi
On 9/18/07, Geoffrey <[EMAIL PROTECTED]> wrote: > In my experience, attempting to 'recall' an email message is a fruitless > endeavor. Seems to me that this is a 'Microsoft' creation. I really > don't understand the purpose, because by the time you consider > 'recalling' the email message, it's already sitting in 1000s of inboxes... It's an Exchange "feature". It only works if you send the email within the Exchange domain. So, yeah, it's a brain-dead feature... ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Recall: August Monthly techdata split fi
On Tue, 2007-09-18 at 06:52 -0400, Geoffrey wrote: > [EMAIL PROTECTED] wrote: > > Robert Wickert would like to recall the message, "August Monthly > > techdata split file printers for France and Denmark ". > > In my experience, attempting to 'recall' an email message is a fruitless > endeavor. Seems to me that this is a 'Microsoft' creation. I really > don't understand the purpose, because by the time you consider > 'recalling' the email message, it's already sitting in 1000s of inboxes... It works for people that still have the message in their Exchange boxes. For those that don't, then... all is lost unfortunately. Bitten...
Re: [GENERAL] Recall: August Monthly techdata split fi
[EMAIL PROTECTED] wrote: Robert Wickert would like to recall the message, "August Monthly techdata split file printers for France and Denmark ". In my experience, attempting to 'recall' an email message is a fruitless endeavor. Seems to me that this is a 'Microsoft' creation. I really don't understand the purpose, because by the time you consider 'recalling' the email message, it's already sitting in 1000s of inboxes... -- Until later, Geoffrey Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. - Benjamin Franklin
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
On 9/18/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote: > On Tue, 2007-09-18 at 06:01 -0400, Bill Moran wrote: > > * (with newer version) reduce the fill factor and REINDEX > > What is fill factor? See "Index Storage Parameters": http://www.postgresql.org/docs/8.2/static/sql-createindex.html ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
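For the archives, what the setting pointed to above looks like in use (8.2 syntax; foo/foo_idx are placeholder names):

```sql
-- Leave 30% of each leaf page free for future inserts/updates.
CREATE INDEX foo_idx ON foo (a) WITH (fillfactor = 70);

-- Or change an existing index; the new value only takes effect on rebuild:
ALTER INDEX foo_idx SET (fillfactor = 70);
REINDEX INDEX foo_idx;
```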
[GENERAL] FW: August Monthly techdata split file p
ok they're both done -Original Message- From: Robert Wickert Sent: 18 September 2007 11:05 To: Gurvan Meyer Subject: RE: August Monthly techdata split file printers for France and Denmark the reports can always be found here .. \\147.114.32.180\monthly_sales_reports\ just find the year and month and you're off! denmark is finished and france is still being run -Original Message- From: Gurvan Meyer Sent: 18 September 2007 09:51 To: Robert Wickert Subject: RE: August Monthly techdata split file printers for France and Denmark And when it's done , could you please tell me where you stored it . thanks gurvan -Original Message- From: Robert Wickert Sent: September 18 2007 09:33 To: Gurvan Meyer Subject: RE: August Monthly techdata split file printers for France and Denmark sorry saw it in title .. will run right now -Original Message- From: Gurvan Meyer Sent: 17 September 2007 17:21 To: Robert Wickert Cc: Nathalie Bourgoin; Eric Pouget Subject: August Monthly techdata split file printers for France and Denmark Importance: High Hello Robert We need urgently August Monthly Techdata Split file printers for France and Denmark . Thanks a lot . Gurvan * Robert Wickert Senior Software Developer CONTEXT Tel:+44 (0)20 8394 7739 Fax:+44 (0)20 8394 7701 Email Address: [EMAIL PROTECTED] Web Site: http://www.contextworld.com * This message and the information contained therein is intended for the use of the person(s) ("the intended recipient(s)" to whom it is addressed. It may contain information that is privileged and confidential within the meaning of applicable law. If you are not the intended recipient, please contact the sender as soon possible. The views expressed in this communication may not necessarily be the views held by Context or its subsidiaries. The contents of an attachment to this e-mail may contain viruses that could damage your own computer system. 
While every reasonable precaution has been taken to minimise this risk, Context and its subsidiaries cannot accept liability for any damage which you sustain as a result of software viruses. You should carry out your own virus checks before opening the attachment. Please notice that Context monitors e-mails sent or received. Further communication will signify your consent to this. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] JOIN with ORDER on both tables does a sort when it souldn't
Tom Lane <[EMAIL PROTECTED]> wrote: > Dániel Dénes <[EMAIL PROTECTED]> writes: > > But even then, it won't realize that the result are in correct > > order, and does a sort! Why? > > In general the output of a nestloop doesn't derive any ordering > properties from the inner scan. It might happen to work in your > particular case because on the outer side (site_id, order) is unique > and so the "order" values must be strictly increasing. But if there > could be multiple rows with the same "order" value coming from the > outer side, then it would be incorrect to claim that the join output is > sorted by (outer.order, inner.order). > > It's possible that the planner could be taught to recognize this > situation, but it looks to me like doing that would result in drastic > increases in planning time for many queries (due to having to consider > a lot more Paths) with a resulting win in only a very few. > > regards, tom lane When you wrote this answer, I thought maybe it's really a one-time problem, and it's not worth spending much time on it, because the tables involved had 10-100 rows, so a sort wasn't really that scary; I just wanted to know the cause. But now I ran into this again. There are 2 tables involved (simplified): banners_places: - id integer (PKEY) - pageid integer (FKEY to a table not involved now) - place text UNIQUE KEY: (pageid, place) banners_show: - id integer (PKEY) - bplid integer (FKEY to banners_places.id) - uptime timestamp INDEX: (bplid, uptime) My query is: SELECT * FROM banners_places AS bpl JOIN banners_show AS bsh ON bsh.bplid = bpl.id WHERE bpl.pageid = 123 ORDER BY bpl.place, bsh.uptime To me it looks like the best plan would be to get the desired rows from banners_places and then do a NestLoop join using the index on banners_show. This way no sorting should be necessary. 
But even though I forced PG to do my plan (disabled almost every alternative), the sort is there:

Sort
  Sort Key: bpl.place, bsh.uptime
  ->  Nested Loop
        ->  Index Scan using bpl_UNIQUE on banners_places bpl
              Index Cond: (pageid = 123)
        ->  Index Scan using bsh_INDEX on banners_show bsh
              Index Cond: (bsh.bplid = "outer".id)

Are you sure this can't be fixed without drastically increasing planning time? Or is there a way I can make this query not do a sort? Regards, Denes Daniel
[GENERAL] Recall: August Monthly techdata split fi
Robert Wickert would like to recall the message, "August Monthly techdata split file printers for France and Denmark ". * Robert Wickert Senior Software Developer CONTEXT Tel:+44 (0)20 8394 7739 Fax:+44 (0)20 8394 7701 Email Address: [EMAIL PROTECTED] Web Site: http://www.contextworld.com *
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
On Tue, 2007-09-18 at 06:01 -0400, Bill Moran wrote: > * (with newer version) reduce the fill factor and REINDEX What is fill factor? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] stability issues
Hi, I'm running a PostgreSQL 8.1.9 server on Debian etch (default installation, w/ PostGIS from the Debian repositories). The machine has dual Xeon 5130s, 4 GB of DDR2 ECC Reg. RAM and a two-SATA-disk RAID 0 array. I don't know the brand of the memory nor the HDDs. The RAID controller is a 3Ware 8006-2. Lately I've been confronted by some failures in several PostgreSQL clusters. 3 weeks ago a cluster crashed beyond repair while I was reindexing it. Last week we've been getting TOAST warnings and shared index problems on another cluster. I solved the shared index issue, but the TOAST trouble remained, in more than one database on that cluster. I dumped the databases, created another cluster and shut the troubled one down (problems were solved). I've asked our provider to do a filesystem check (e2fsck -f /dev/sda1), which indicated that "there's nothing wrong with the filesystem". This doesn't feel right. Is there anything else I should check? thx, WBL
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
"Filip Rembiałkowski" <[EMAIL PROTECTED]> wrote: > > 2007/9/18, Joshua D. Drake <[EMAIL PROTECTED]>: > > > If you overrun your max_fsm_pages then vacuum analyze IS NOT ENOUGH. If > > you do not overrun your max_fsm_pages, yes vacuum analyze can deal with > > the issue. > > Are you sure? I have a situation where above is no true. postgres > version 8.1.8. while vacuum verbose says: > > INFO: free space map contains 2329221 pages in 490 relations > DETAIL: A total of 2345744 page slots are in use (including overhead). > 2345744 page slots are required to track all free space. > Current limits are: 1000 page slots, 1000 relations, using 58698 KB. > > ... and we have constant problem with index bloat and need to REINDEX > frequently. > > the database is very redundant and has quite hight data retention rate > (it's an ecommerce site) I've been involved in a number of the discussions on this, and I think part of the confusion stems from the fact that "index bloat" is an ambiguous term. If the index gets large enough that it no longer fits in shared memory, and reindexing it will reduce its size to where it _will_ fit in shared memory, then the index _could_ be said to be "bloated". However, an equally valid solution to that problem is to increase the amount of shared memory available (possibly by adding RAM). Unfortunately, folks like Phoenix are looking for yes/no answers, and with many of these questions, the _correct_ answer is "it depends on your workload" If you find that reindexing improves performance, then you should investigate further. Depending on the exact nature of the problem, there are many possible solutions, three that come to mind: * Add RAM/SHM * REINDEX on a regular schedule * (with newer version) reduce the fill factor and REINDEX -- Bill Moran http://www.potentialtech.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
2007/9/18, Joshua D. Drake <[EMAIL PROTECTED]>: > If you overrun your max_fsm_pages then vacuum analyze IS NOT ENOUGH. If > you do not overrun your max_fsm_pages, yes vacuum analyze can deal with > the issue. Are you sure? I have a situation where the above is not true. postgres version 8.1.8. vacuum verbose says: INFO: free space map contains 2329221 pages in 490 relations DETAIL: A total of 2345744 page slots are in use (including overhead). 2345744 page slots are required to track all free space. Current limits are: 1000 page slots, 1000 relations, using 58698 KB. ... and we have a constant problem with index bloat and need to REINDEX frequently. The database is very redundant and has quite a high data retention rate (it's an ecommerce site) -- Filip Rembiałkowski
Re: [GENERAL] help w/ SRF function
On 9/17/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote: > > > CREATE OR REPLACE FUNCTION foo_func(fromdate timestamp, todate > > > timestamp, code text) > > > LANGUAGE 'sql' IMMUTABLE STRICT; > > But If I were to use ALIASINg, I get an error > > > > eg: DECLARE > > DECLARE > > fromdate ALIAS for $1; > > todate ALIAS for $2; > > code ALIAS for $3; > > > > > > ERROR: syntax error at or near "ALIAS" > > LINE 5: fromdate ALIAS for $1; > anyone knows how come I can't use the reference fromdate/todate etc or > use aliases but have to resort to using $1/$2 etc? You seem to be confusing SQL with PL/pgSQL. If you want variables, aliases, flow control etc instead of a simple macro, you need to use a procedural language. http://www.postgresql.org/docs/8.2/static/xfunc-sql.html http://www.postgresql.org/docs/8.2/static/plpgsql.html ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
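A sketch of the same SRF moved to PL/pgSQL, where named parameters do work directly in the body. Note the p_ prefixes: in PL/pgSQL of this era, parameter names are substituted textually, so a parameter named "code" would collide with the column of the same name. The table and column names follow the original post:

```sql
CREATE OR REPLACE FUNCTION foo_func(p_fromdate timestamp, p_todate timestamp, p_code text)
RETURNS SETOF foo AS
$BODY$
DECLARE
  r foo%ROWTYPE;
BEGIN
  -- RETURN NEXT inside a loop is the pre-8.3 way to build a result set.
  FOR r IN
    SELECT trh.id, trh.data1, trh.data2
    FROM d
    JOIN ts  ON ts.id = d.id
    JOIN trh ON ts.id = trh.id
    WHERE d.start_timestamp BETWEEN p_fromdate AND p_todate
      AND d.code = p_code
  LOOP
    RETURN NEXT r;
  END LOOP;
  RETURN;
END;
$BODY$
LANGUAGE plpgsql STABLE;
```

STABLE rather than IMMUTABLE, since the function reads from tables.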
Re: [GENERAL] keeping 3 tables in sync w/ each other
2007/9/18, Ow Mun Heng <[EMAIL PROTECTED]>: > Hi, > > I have 3 tables > > foo > foo_loading_source1 > foo_loading_source2 > > which is something like > > create table foo (a int, b int, c int) > create table foo_loading_source1 (a int, b int, c int) > create table foo_loading_source2 (a int, b int, c int) > > Is there a way which can be made easier to keep these 3 tables DDL in > sync? > > the loading_sourceX tables are just a temporary-in-transit table for > data \copy'ied into the DB before being inserted into the main foo > table. > > Currently, each time I add a new column to foo, I have to "remember" to > add the same to the other 2 table. > > Can I use inheritance? References? Inheritance might work in this case. But it will be a bit weird, because you will see non-constraint data in parent unless you will SELECT ... FROM ONLY parent Try this example: create table parent ( id serial, data1 text ); create table child () inherits( parent ); \d child alter table only parent add check ( data1 like '%fits parent' ); insert into parent(data1) select 'this data fits parent'; insert into child(data1) select 'this data was inserted to child'; select * from parent; select * from only parent; select * from child; alter table parent add column data2 text default 'new column default'; \d child -- Filip Rembiałkowski ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] pgcrypto: is an IV needed with pgp_sym_encrypt()?
On 9/18/07, Bill Moseley <[EMAIL PROTECTED]> wrote: > I'm just starting with pgcrypto, and I'm curious if it's > needed/recommended to use an initialization vector/value (IV) with > the pgp_sym_encrypt() function. > > The docs hint that an IV is used automatically, but encrypting plain > text that starts the same seems to result in initial common cipher > text. So, I'm not clear. A few bytes being the same is normal. Those are the PGP packet header, telling "this is a symmetrically encrypted session key packet, with length X" plus some more details. You can use pgpdump (or www.pgpdump.net) to visualize the packet structure. It does not show you the IV, but does show the salt and S2K loop count, so you can check if those are randomized. Note that the random IV will be put into the second packet, so quite far from the start. > 2. Data is prefixed with block of random bytes. This is equal to > using random IV. > > So, I'm currently generating a substring of a md5 hash of a few items > and pre-pending that to the plain text I need to encrypt as the IV. > Then when I decrypt I remove that prefix. You could try with different (same-length) hashes; you'll still see that a few bytes are the same. Also, the PGP IV _must_ be there, so to check you can always try decrypting with gnupg, to see if the packet structure is sane. If more than a few bytes are the same, and if the salt is not randomized, it _could_ be a sign of a problem: either a pgcrypto bug or a failure to initialize the random generator. If you suspect a problem, please send me a few example encryptions with keys and your setup details (postgres version, openssl or not, os version) > BTW, this is for credit card storage, which is a business requirement. > > Besides following the PCI DSS and external audit procedures, the plan > is to use pgcrypto (pgp_sym_encrypt() with AES-256) as part of a > credit card storage server. 
The server and db are SSL only and the > key is passed from the application and never stored anyplace (except > in memcached on other servers during the session). The key is a > user's plain text password plus an application-specific secret. So, > each row has its own key. Passwords must be changed periodically, > etc. I don't know the details of your setup, but I strongly suggest you look into using public-key crypto. That allows you to use separate keys for encryption and decryption. So on the webserver, where users only input credit cards, you keep only public keys, so anybody cracking it won't be able to decrypt the data. Also, if you only want to check whether an inputted credit card matches the stored one, you don't need to store the credit card at all; just store a hash and then compare it with the hash of the user-inputted one. For that it's preferable to use the crypt() function with a crypt-blowfish hash, which is a couple of orders of magnitude stronger than MD5/SHA* for that purpose. > I'd welcome any comments or recommendations from others that have > implemented something similar. -- marko
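Sketches of both suggestions (the table, column names, key material and card number are all placeholders; both need the pgcrypto contrib module installed):

```sql
-- Public-key encryption: the webserver only ever holds the public key,
-- so a compromise there cannot decrypt previously stored data.
-- INSERT INTO cards (cc_enc)
--   VALUES (pgp_pub_encrypt('4111111111111111',
--                           dearmor('<ascii-armored public key block>')));
-- Decryption happens on a separate, locked-down host holding the
-- private key: pgp_pub_decrypt(cc_enc, dearmor(privkey), 'passphrase')

-- Match-only storage: keep a crypt-blowfish hash instead of the number.
INSERT INTO cards (cc_hash)
  VALUES (crypt('4111111111111111', gen_salt('bf', 8)));

-- Verify a user-supplied number against the stored hash:
SELECT 1 FROM cards
WHERE cc_hash = crypt('4111111111111111', cc_hash);
```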
[GENERAL] RFC : best way to distrubute IO from queries (low end server)
Final specs for the server are just an ordinary desktop fitted w/ 3 7200rpm 500GB drives & 1 7200rpm 80GB drive / 1 GB RAM / 2G processor (single core). The number of records will be between 3 and 30 million rows. Currently the process is 1. pull from mssql 2. \copy into PG temp table 3. insert into final table. Current tables are distributed via tablespaces. (The current test server is my laptop w/ 2 5400rpm drives, hda & hdc.) and I'm already seeing the strain of the concurrent select/delete/insert/update and the additional "client" pull. So, I wanted to ask the list for advice on how to proceed with regard to the server. Scenario 1: 1. temp table in its own tablespace 2. final tables in their own tablespace 3. pg_xlog in the OS tablespace. Scenario 2: 1. temp table in its own tablespace 2. final tables in their own tablespace (read-only copy) 3. final tables in their own tablespace (read-write copy) 4. pg_xlog in the OS tablespace. The idea of a read-only and a read/write copy is obtained from some presentation/article I read, whereby all updates of new data are inserted into the read/write copy up until some point (say lunch time / 5pm etc), whereby the read-write copy [3] will be "renamed" and made into a read-only copy and the previous read-only copy [2] will be made into a read-write copy. The only thing I can't wrap my head around is how to keep these 2 copies in sync. E.g.: when everything is being updated to [3] and users are querying [2], then at the switch-over instant, how will the new data be input into the read-only copy? Because if I switch back and forth like this, there's bound to be gaps in the data. Any comments appreciated.