Re: [GENERAL] Long term database archival

2006-07-07 Thread Ben



On Thu, 6 Jul 2006, Dann Corbit wrote:


It's the data that contains all the value.  The hardware becomes
obsolete when it can no longer keep up with business needs.



... or can no longer be repaired. :)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Postmaster is starting but shutting when trying to connect (Windows)

2006-07-07 Thread Thomas Kellerer

Hello,

I have a PostgreSQL (8.1) installation for testing purposes which has been 
running fine for several months now (Windows XP). I was working with it 
yesterday, and today, after booting my computer and restarting the 
service (I start the service manually because I don't need the 
server running all the time), postmaster comes up fine, with no entries in the 
log file.


But as soon as I try to connect to the server (does not matter what type 
of client, psql, PgAdmin or JDBC) I get an error message in the log 
file. Here is the full log file:



2006-07-07 09:18:15 LOG:  database system was shut down at 2006-07-07 
09:14:00 Westeuropäische Sommerzeit

2006-07-07 09:18:15 LOG:  checkpoint record is at 0/3C82F60
2006-07-07 09:18:15 LOG:  redo record is at 0/3C82F60; undo record is at 
0/0; shutdown TRUE

2006-07-07 09:18:15 LOG:  next transaction ID: 100576; next OID: 19416
2006-07-07 09:18:15 LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2006-07-07 09:18:15 LOG:  database system is ready
2006-07-07 09:18:15 LOG:  transaction ID wrap limit is 2147484148, 
limited by database postgres


2006-07-07 08:43:54 LOG:  incomplete startup packet
2006-07-07 08:44:18 LOG:  could not receive data from client: An 
operation was attempted on something that is not a socket.


2006-07-07 08:44:18 LOG:  incomplete startup packet
2006-07-07 08:45:27 LOG:  received fast shutdown request
2006-07-07 08:45:27 LOG:  shutting down
2006-07-07 08:45:27 LOG:  database system is shut down
2006-07-07 08:45:28 LOG:  logger shutting down


I already checked with netstat whether I have something running on port 
5432, and I changed the port in postgresql.conf just to make sure it's not 
caused by another process locking port 5432 that I'm not aware of. But 
it's still the same thing.


There is no firewall active that would block that port either (and, as 
I said, before I shut down my computer yesterday evening it was working fine).


There is also no postmaster.pid file hanging around when I shut down the 
service.


Any hints what I could try to fix this?

Thanks in advance
Thomas


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Modeling Tool

2006-07-07 Thread hubert depesz lubaczewski
On 7/6/06, Rodrigo Sakai [EMAIL PROTECTED] wrote:

 Anyone know of a good tool to do reverse engineering of a PostgreSQL
 database? I tried to use DBDesigner, but I couldn't get the relationships!

we used Case Studio 2. worked quite well.

depesz

-- 
http://www.depesz.com/ - new, better depesz


Re: [GENERAL] Long term database archival

2006-07-07 Thread Csaba Nagy
On Thu, 2006-07-06 at 20:57, Karl O. Pinc wrote:
 Hi,
 
 What is the best pg_dump format for long-term database
 archival?  That is, what format is most likely to
 be able to be restored into a future PostgreSQL
 cluster.

 Should we want to restore a 20 year old backup
 nobody's going to want to be messing around with
 decoding a custom format dump if it does not
 just load all by itself.

Karl, I would say that if you really want data from 20 years ago, keep
it in the custom format, along with a set of the sources of postgres
which created the dump. then in 20 years when you'll need it, you'll
compile the sources and load the data in the original postgres
version... of course you might need to also keep an image of the current
OS and the hardware you're running on if you really want to be sure it
will work in 20 years :-)

Cheers,
Csaba.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Partition Rule Updating While Running?

2006-07-07 Thread Gene

Sorry if this is a duplicate; someone suggested posting to general as well.

I'm using PostgreSQL 8.1.4 in a Hibernate application and I am
attempting to use partitioning via inherited tables. At first I was
going to create a rule per sub-table based on a date range, but found
out that with multiple rules postgres will only return the affected-row
count of the last rule, which gives Hibernate problems. So now I'm
thinking the way to do it is to have just one rule at a time and, when I
want to start appending data to a new partition, just change the rule
on the parent table and also update the constraint on the last table
to reflect the date ranges contained, so that constraint_exclusion will
work. This should perform better also. For instance:

Starting off with:

Parent (Rule on insert: instead insert into Child2)
  Child1 (Constraint date <= somedate1)
  Child2 (Constraint date > somedate1)

Now I want to create another partition:

Create Table Child3
BEGIN
  Update Parent Rule (instead insert into Child3)
  somedate2 = max(date) from Child2
  Update Child2 Constraint (date > somedate1 AND date <= somedate2)
  Set Constraint Child3 (date > somedate2)
END

Which ends up with:

Parent (Rule on insert: instead insert into Child3)
  Child1 (Constraint date <= somedate1)
  Child2 (Constraint date > somedate1 AND date <= somedate2)
  Child3 (Constraint date > somedate2)
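
In actual commands, the switch-over step would look roughly like this (a sketch only; the table, rule and constraint names are made up, and the boundary dates stand in for whatever max(date) returns):

BEGIN;
CREATE TABLE child3 () INHERITS (parent);
-- repoint inserts at the new partition
CREATE OR REPLACE RULE parent_insert AS
    ON INSERT TO parent DO INSTEAD
    INSERT INTO child3 VALUES (NEW.*);
-- close off the old partition and bound the new one
ALTER TABLE child2 DROP CONSTRAINT child2_date_check;
ALTER TABLE child2 ADD CONSTRAINT child2_date_check
    CHECK (date > '2006-01-01' AND date <= '2006-07-01');
ALTER TABLE child3 ADD CONSTRAINT child3_date_check
    CHECK (date > '2006-07-01');
COMMIT;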

Has anyone else tried this, or would you expect it to work consistently
(without stopping the db)? Is it possible that there could be a race
condition between the insertion and the constraints, or will the
transaction prevent that from occurring? I've done some testing and it
seems to work, but I could just be getting lucky so far and not losing
any data :)

Thanks for any help,
Gene

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] Partition Rule Updating While Running?

2006-07-07 Thread Martijn van Oosterhout
On Fri, Jul 07, 2006 at 04:39:53AM -0400, Gene wrote:
 Sorry if this is a duplicate; someone suggested posting to general as well.
 
 I'm using PostgreSQL 8.1.4 in a Hibernate Application and I am
 attempting to use partitioning via Inherited tables. At first I was
 going to create a rule per sub-table based on a date range, but found
 out with multiple rules postgres will only return the affected-row
 count on the last rule which gives Hibernate problems. So now I'm
 thinking the way to do it is just have one rule at a time and when I
 want to start appending data to a new partition, just change the rule
 on the parent table and also update the constraint on the last table
 to reflect the date ranges contained so that constraint_exclusion will
 work. this should perform better also. For instance

Why not just decide somedate2 in advance and create the constraint as a
range to start with? Then, a few days before somedate2, add a new
constraint for between somedate2 and somedate3. Then you never have to
update any constraints...
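
A sketch of that variant, with the boundaries fixed in advance (names and dates below are made up):

CREATE TABLE child2 (
    CHECK (date > '2006-07-01' AND date <= '2006-10-01')
) INHERITS (parent);
CREATE TABLE child3 (
    CHECK (date > '2006-10-01' AND date <= '2007-01-01')
) INHERITS (parent);
-- only the insert rule has to be switched at each boundary;
-- the constraints themselves never change.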

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [GENERAL] Postmaster is starting but shutting when trying to connect (Windows)

2006-07-07 Thread Thomas Kellerer

On 07.07.2006 09:20 Thomas Kellerer wrote:

Hello,

i have a PostgreSQL (8.1) installation for testing purposes which was 
running fine for several months now (Windows XP). I was working with it 
yesterday, and today after booting my computer and restarting the 
service (I'm starting the service manually, because I don't need the 
server running all the time) postmaster comes up fine, no entries in the 
log file.


But as soon as I try to connect to the server (does not matter what type 
of client, psql, PgAdmin or JDBC) I get an error message in the log 
file. 


Follow up:

When I start Postgres manually, i.e. from a command line running as the 
postgres user, using pg_ctl, then everything works fine.

When I use "net start" to start the Postgres service, I cannot connect.

Cheers
Thomas


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Why my cursor construction is so slow?

2006-07-07 Thread Roman Neuhauser
# kleptog@svana.org / 2006-06-22 09:19:44 +0200:
 On Tue, Jun 20, 2006 at 02:06:19AM -0700, [EMAIL PROTECTED] wrote:
  Such construction is very slow but when I modify SQL to:
  OPEN cursor1 FOR SELECT * FROM alias WHERE mask=alias_out
  ORDER BY mask LIMIT 100;
  
  it works very fast. It is strange for me becuase I've understood so far
  that when cursor is open select is executed but Postgres does not
  select all rows - only cursor is positioned on first row, when you
  execute fetch next row is read. But this example shows something
  different.
 
 PostgreSQL tries to optimise for overall query time. Without the limit
 it tries to find a plan that will return the whole set as quick as
 possible.

That looks like the wrong approach for a cursor.

 With the LIMIT it might take a different approach, which
 might be worse if you read the whole lot, but better for a limited set.
 A fast-start plan so to speak.

That looks like a better approach for a cursor.

-- 
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Long term database archival

2006-07-07 Thread Shane Ambler
On 7/7/2006 17:49, Csaba Nagy [EMAIL PROTECTED] wrote:

 On Thu, 2006-07-06 at 20:57, Karl O. Pinc wrote:
 Hi,
 
 What is the best pg_dump format for long-term database
 archival?  That is, what format is most likely to
 be able to be restored into a future PostgreSQL
 cluster.
 
 Should we want to restore a 20 year old backup
 nobody's going to want to be messing around with
 decoding a custom format dump if it does not
 just load all by itself.
 
 Karl, I would say that if you really want data from 20 years ago, keep
 it in the custom format, along with a set of the sources of postgres
 which created the dump. then in 20 years when you'll need it, you'll
 compile the sources and load the data in the original postgres
 version... of course you might need to also keep an image of the current
 OS and the hardware you're running on if you really want to be sure it
 will work in 20 years :-)
 
 Cheers,
 Csaba.
 

Depending on the size of the data (if it isn't too large) you could consider
creating a new database for archives, maybe even one for each year.

This can be on an old server or backup server instead of the production
one.

Unless the data is too large, you can dump/restore the archive data to a new
pg version as you upgrade, meaning the data will always be available and you
won't have any format issues when you want to retrieve the data.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] VACUUM and fsm_max_pages

2006-07-07 Thread DANTE Alexandra

Good morning List,

I have seen several posts on this topic but I haven't found a complete 
response.
I'm using BenchmarkSQL to evaluate PostgreSQL in transaction processing, 
and I work with PostgreSQL 8.1.3 on RHEL4-AS, Itanium-2 processor, 8GB RAM.


The database generated and used via BenchmarkSQL is a 200-warehouse 
database and its size is about 20GB. The parameter “max_fsm_pages” is 
equal to 20000 and “max_fsm_relations” to 1000.


Between two benchmark runs I launch a VACUUM, but at the end of it I see that 
PostgreSQL asks me to increase the “max_fsm_pages” parameter, and the 
proposed value grows with the number of VACUUMs launched…


Could someone explain to me why?

This is an example of the message I get:
Free space map contains 20576 pages in 17 relations
A total of 20000 page slots are in use (including overhead)
128512 page slots are required to track all free space
Current limits are: 20000 page slots, 1000 relations, using 223 KB
Number of page slots needed (128512) exceeds max_fsm_pages (20000)
HINT: Consider increasing the config parameter “max_fsm_pages” to a 
value over 128512.


In order not to have to launch a VACUUM FULL, I increase the value of 
“max_fsm_pages”, but is that correct?
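
For reference, a minimal sketch of what that change looks like (the new value is only an example; it just needs to exceed the number VACUUM reports, and changing it in 8.1 requires a server restart):

# postgresql.conf (restart required in 8.1)
max_fsm_pages = 150000        # anything over the 128512 that VACUUM reported
max_fsm_relations = 1000      # unchanged

After the restart, the summary at the end of a plain VACUUM VERBOSE shows whether the new setting is large enough.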


Thank you for your help.
Regards,
Alexandra DANTE

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] WAL internals

2006-07-07 Thread Fabrice Franquenk

Hello,
I am currently working on a heavily stressed system and I am
having some difficulties making the background writer work properly.

But before going any further, I would like to be sure that I have understood 
how transaction processing and management are handled by PostgreSQL. 
Correct me if I'm wrong:


1. A transaction is sent to the PostgreSQL backend.

2. If the result of the transaction is within the shared buffers, we get 
our result instantly. Otherwise some searching is done within the database 
datafiles to get the result, which is copied to the shared buffers memory 
zone. The transaction is stored in a WAL buffer.

Now, when the result is copied to the shared buffer, if the transaction 
was an update or a delete, the row is flagged to be updated/deleted in 
the datafiles.

Transactions go on and on this way.

At some point, the WAL buffers are written to the checkpoint segments. 
I don't know when; could you clarify this point?

3. Then, periodically, there are checkpoints; these apply the changes 
from the shared buffers to the datafiles (meaning the shared buffers are 
flushed into the datafiles). The last written record in the datafiles is 
flagged in the checkpoint segments so that REDOs are possible.

Now I tried to set bgwriter_lru_percent to 100% and 
bgwriter_lru_maxpages to 1000, and I did not spot any difference in disk 
activity, CPU usage or anything else from the default setup (which is 
1% and 5), so I was wondering whether committing after every transaction 
would prevent me from seeing any difference, or is there another 
explanation?


By the way, to make sure all changes took effect I restarted the 
postmaster process.

I first asked these questions on the novice mailing list.

Thanks in advance for your help

Regards,

Fabrice Franquenk.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Do checkpoints flush all data from shared buffers ?

2006-07-07 Thread Fabrice Franquenk

Hello,
  I was wondering whether each checkpoint flushes all transactions from 
the shared buffers, or whether there could be some left at the end of the 
checkpoint.
Because I was trying to lower the I/O on the disks, I lowered the checkpoint 
timeout to 150 seconds so I get twice the number of checkpoints. 
I was hoping it would reduce the number of I/Os on the disks because 
there would be less data to write to the datafiles...


Thanks in advance for your help

regards,
   Fabrice Franquenk

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Roman Neuhauser
# [EMAIL PROTECTED] / 2006-07-06 22:41:27 -0600:
 OK I know this is an odd question but I'm working on an app that will rely 
 more and more on database driven functions, and while the app's source is 
 in SVN, and I intend for the source of the SQL scripts to also be there, I 
 was wondering...what are people doing for version control and change 
 management on their custom (esp PL/pgSQL and say PL/Perl) functions?

Well, people use a version control system.

Do you have any specific questions?

-- 
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] WAL internals

2006-07-07 Thread Martijn van Oosterhout
On Fri, Jul 07, 2006 at 11:49:13AM +0200, Fabrice Franquenk wrote:
 2. If the result of the transaction is within the shared buffers, we get 
 our result
 instantly. Else some searching is done within the database datafiles to 
 get the result
 which is copied to the shared buffers memory zone. The transaction is 
 stocked in a WAL
 buffer.
 Now, when the result is copied to the shared buffer, if the transaction 
 was an update or a delete
 the line is flagged to be updated/deleted in the datafiles.

The shared buffers are merely images of what's on disk. Whenever a
block is read from disk it goes into a shared buffer. This applies to
index pages, data pages, anything.

Tuples are inserted into the pages in the buffer, and also appended to
the transaction log. The page is marked as dirty but not yet written
out.

 transactions go on and on this way.

When a transaction commits, the WAL is synced.

 At some points, the WAL buffers are written in the checkpoint segments. 
 I don't know when,
 if you could just precise this point.

Straight away, why delay?

 3. Then periodically, there are checkpoints, those will make the changes 
 into the datafiles from
 the shared buffers (meaning shared buffers are flushed into the datafiles).
 The last written record in the datafiles is flagged into the checkpoint 
 segments
 that way REDOs are possible.

No, all checkpoints do is make sure all pages in the shared buffers
match what's on disk. When that's the case, you don't need to keep the
WAL anymore. REDO just replays the WAL, nothing more. You only read it
on unclean shutdown.

 Now i tried to set the bgwriter_lru_percent to 100% and 
 bgwriter_lru_maxpages to 1000 and
 i did not spot any difference with the disk activities, cpu occupation 
 or anything else from
 the default set up which is 1% and 5 so i was wondering if commiting 
 after every transaction
 would prevent me from seeing any difference ? or is there another 
 explanation ?

All the bgwriter does is write out dirty pages to disk so checkpoints
don't take as long. How much data is ever outstanding on your
system? If it's not much, then the bgwriter probably isn't doing
much...
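
For orientation, these are the knobs involved, with what I believe are the 8.1 defaults (shown only for illustration, not as recommendations):

# postgresql.conf
bgwriter_delay = 200           # ms between bgwriter rounds
bgwriter_lru_percent = 1.0     # % of recently-used buffers scanned per round
bgwriter_lru_maxpages = 5      # max dirty pages written per round from that scan
bgwriter_all_percent = 0.333   # % of the whole buffer pool scanned per round
bgwriter_all_maxpages = 5      # max dirty pages written per round from that scan
checkpoint_segments = 3        # WAL segments between forced checkpoints
checkpoint_timeout = 300       # seconds between forced checkpoints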

BTW, check out the documentation on WAL; it's much clearer than what I've
written...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [GENERAL] Why my cursor construction is so slow?

2006-07-07 Thread Martijn van Oosterhout
On Fri, Jul 07, 2006 at 11:30:35AM +, Roman Neuhauser wrote:
  With the LIMIT it might take a different approach, which
  might be worse if you read the whole lot, but better for a limited set.
  A fast-start plan so to speak.
 
 That looks like a better approach for a cursor.

For a cursor postgres assumes you're going to ask for about 10% of the
result, so it does aim for a reasonably fast-start plan. It probably
depends on the specifics of the situation how well it works...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [GENERAL] Long term database archival

2006-07-07 Thread Tino Wildenhain

Csaba Nagy schrieb:
...

Karl, I would say that if you really want data from 20 years ago, keep
it in the custom format, along with a set of the sources of postgres
which created the dump. then in 20 years when you'll need it, you'll
compile the sources and load the data in the original postgres
version... of course you might need to also keep an image of the current
OS and the hardware you're running on if you really want to be sure it
will work in 20 years :-)


No need - you will just emulate the whole hardware in 20 years ;-)

Regards
Tino

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] How to sample records

2006-07-07 Thread Vittorio
Dear friends,
owing to a very poor webmail I'm compelled to use, I'm resending my 
messed-up message of yesterday.

In PostgreSQL 8.0.7 I have the following table:

 \d basedati
          Tabella "public.basedati"
 Colonna  |            Tipo             | Modificatori
----------+-----------------------------+--------------
 data_ora | timestamp without time zone |
 cod_wmo  | character(5)                |
 t_aria   | numeric(5,1)                |
 ur       | integer                     |

looking like:

      data_ora       | cod_wmo | t_aria | ur
---------------------+---------+--------+----
 2006-07-05 00:50:00 | 16235   |   22.0 | 74
 2006-07-05 02:50:00 | 16235   |   19.0 | 80
 2006-07-05 09:50:00 | 16235   |   28.0 | 45
 2006-07-05 12:00:00 | 16235   |   31.0 | 41
 2006-07-06 00:00:00 | 16235   |   20.1 | 93
 2006-07-06 00:50:00 | 16235   |   20.0 | 93
 2006-07-06 06:50:00 | 16235   |   25.0 | 65
 2006-07-06 11:50:00 | 16235   |   30.0 | 49
 2006-07-06 12:00:00 | 16235   |   29.5 | 51
 2006-07-05 03:00:00 | 16242   |   19.9 | 64
 2006-07-05 03:15:00 | 16242   |   20.0 | 69
 2006-07-05 10:15:00 | 16242   |   28.0 | 39
 2006-07-05 12:00:00 | 16242   |   28.6 | 39
 2006-07-06 00:00:00 | 16242   |   22.2 | 71
 2006-07-06 00:15:00 | 16242   |   22.0 | 74
 2006-07-06 00:45:00 | 16242   |   23.0 | 64
 2006-07-06 01:15:00 | 16242   |   21.0 | 74
 2006-07-06 10:15:00 | 16242   |   27.0 | 56
 2006-07-06 12:00:00 | 16242   |   25.6 | 72

What I would like to extract (and I'm unable to!!) is: for each cod_wmo 
and each day (that is, to_char(data_ora, 'YYYY-MM-DD')), the complete 
record corresponding to the maximum of t_aria, e.g.

      data_ora       | cod_wmo | t_aria | ur
---------------------+---------+--------+----
 2006-07-05 12:00:00 | 16235   |   31.0 | 41
 2006-07-06 11:50:00 | 16235   |   30.0 | 49
 2006-07-05 12:00:00 | 16242   |   28.6 | 39
 2006-07-06 10:15:00 | 16242   |   27.0 | 56
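
One way to express this in PostgreSQL is with DISTINCT ON; a minimal sketch against the table above (not the only possible approach):

SELECT DISTINCT ON (cod_wmo, to_char(data_ora, 'YYYY-MM-DD'))
       data_ora, cod_wmo, t_aria, ur
FROM   basedati
ORDER  BY cod_wmo, to_char(data_ora, 'YYYY-MM-DD'), t_aria DESC;

-- DISTINCT ON keeps the first row of each (cod_wmo, day) group;
-- ordering by t_aria DESC makes that first row the daily maximum.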

Please help this absolute beginner

Ciao
Vittorio




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Kenneth Downs

Michael Loftis wrote:

OK I know this is an odd question but I'm working on an app that will 
rely more and more on database driven functions, and while the app's 
source is in SVN, and I intend for the source of the SQL scripts to 
also be there, I was wondering...what are people doing for version 
control and change management on their custom (esp PL/pgSQL and say 
PL/Perl) functions?


We went for generating all server-side code out of a data dictionary.  
This makes for a significant change in the way change management is handled.


In this scenario change management becomes the analysis of "before" and 
"after" data dictionaries.  If the changes are all valid, build the code.




--
Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds.
        -- Samuel Butler

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Jorge Godoy
Kenneth Downs [EMAIL PROTECTED] writes:

 We went for generating all server-side code out of a data dictionary.  This
 makes for a significant change in the way change management is handled.

 In this scenario change management becomes the analysis of before and
 after data dictionaries.  If the changes are all valid, build the code.

Ken, could you explain it a bit better?  I think this is an interesting idea.

-- 
Jorge Godoy  [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] migration from Sybase to Postgres

2006-07-07 Thread lanczos
What is the most appropriate way to migrate a database from Sybase (SQL 
Anywhere 5.504) to Postgres? I found some shareware or freeware migration tools 
on the net; which are the best? Is it faster to use any of them or just 
simply do it manually?

Thanks 

Tomas
---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] migration from Sybase to Postgres

2006-07-07 Thread Merlin Moncure

On 7 Jul 2006 12:50:22 -, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

What is the most appropriate way to migrate a database from Sybase (SQL 
Anywhere 5.504) to Postgres? I found some shareware or freeware migration tools 
on the net; which are the best? Is it faster to use any of them or just 
simply do it manually?


An excellent conversion tool is actually Microsoft SQL Server.  It can
convert via a DTS transformation from any ODBC source to any other ODBC
source, if you have a Windows machine to do the conversion and a copy
of the database.

Barring that, I would just dump Sybase into SQL statements (does it
support that?), massage the text, and pipe it into psql.

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[GENERAL] Need help with quote escaping in exim for postgresql

2006-07-07 Thread Marc Haber
Hi,

I am the maintainer of Debian's packages for exim4, a powerful and
versatile Mail Transfer Agent developed in Cambridge and in wide use
throughout the Free Software Community (http://www.exim.org/).

One of our daemon flavours has PostgreSQL support. Our security guys
have found a flaw in exim regarding quote escaping for PostgreSQL. The
bug is filed in Debian's BTS as http://bugs.debian.org/369351 and was
transferred to exim's Bugzilla installation as
http://www.exim.org/bugzilla/show_bug.cgi?id=107.

Personally, I do not have any PostgreSQL experience (and do not have
time and expertise to accumulate any), and the PostgreSQL support code
in exim was contributed some time ago and Philip Hazel, exim's author,
doesn't know too much about PostgreSQL as well.

From what I understand, the correct way would be to use
PQescapeStringConn, but that function needs an established connection,
and exim performs string escape early, way before the actual
connection is established.

I'd appreciate if anybody familiar with PostgreSQL programming could
take a look at the two bug reports and probably exim's program code
and suggest a possible solution, preferably in the bugzilla issue log
referenced above. I'll monitor this thread for possible solutions and
help, though.

Any help would be greatly appreciated.

Greetings
Marc


-- 
-
Marc Haber | I don't trust Computers. They | Mailadresse im Header
Mannheim, Germany  |  lose things.Winona Ryder | Fon: *49 621 72739834
Nordisch by Nature |  How to make an American Quilt | Fax: *49 621 72739835

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Merlin Moncure

On 7/7/06, Michael Loftis [EMAIL PROTECTED] wrote:

OK I know this is an odd question but I'm working on an app that will rely
more and more on database driven functions, and while the app's source is
in SVN, and I intend for the source of the SQL scripts to also be there, I
was wondering...what are people doing for version control and change
management on their custom (esp PL/pgSQL and say PL/Perl) functions?


Keeping your sql procs in cvs/svn is a great idea, and is (IMHO) the
only way to really do it right if you have a large project.  I have
some suggestions that may help you get started.

1. Keep your database procedures in different schemas in the database.
Do not put tables or views in these schemas.  For example, if you are
building an accounting application, make schemas called ar (accounts
receivable), ap (payables), gl, etc.  Put all SQL code in the appropriate
schemas.  These should mirror your folder structure in your code
repository.  Since schemas can only go one level deep, try to
structure your code base to go only one level deep.

2. For each schema/folder, maintain a SQL build file or some type of
make file which uploads the code to the database.  You could get fancy
with this, or just do a simple "cat *.sql | psql yadda" in a one-line
shell script.  The important thing is to have an automatic way of
reconstructing your database (see the sketch after the folder layout
below).

3. Ban your developers from editing directly in the database.  This
means no pgAdmin (for DDL), and no direct DDL in the shell, since that
bypasses the source control.  While it is fine for a development test
database, all uploads to a production database should go through the
build system.  It is OK to copy/paste from .sql files into
shell/pgAdmin, however.

4. In application code and database code, make a habit of fully
qualifying the function, e.g.
select ar.update_invoices();

5. When you make updates to a production system, just include (\i) your
.sql files that have been updated with the change.  DML can be inlined,
however.  E.g.:
-- yadda_1.1.sql
-- converts yadda from 1.0 to 1.1
\i ../../ar/update_invoices.sql
\i ../../ap/delete_customer.sql
update foo set bar = 1;

6. I would suggest, for extra safety, doing a full schema-only dump
from cron and checking it into svn on a daily basis.

7. Views and other table-dependent objects (triggers, but not trigger
functions) should be stored in the same schema as the table(s) they
operate over.  Unlike functions, they therefore cannot match a 1-1
folder correspondence if you have multiple copies of the same table in
different schemas.

Putting all this together, I would suggest a folder structure like:

yadda
  ar
    funcs
      update_invoices.sql
    views
      archived_invoices.sql
    build_ar.sql
  ap
    funcs
    views
    build_ap.sql
  updates
    yadda_1.0.sql
    yadda_1.1.sql
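
As a sketch of what such a build file might contain, following the hypothetical layout above (run it with something like "psql -d yadda -f build_ar.sql"):

-- build_ar.sql: rebuild everything in the ar schema
\set ON_ERROR_STOP on
\i funcs/update_invoices.sql
\i views/archived_invoices.sql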

merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Long term database archival

2006-07-07 Thread Ron Johnson

Ben wrote:
 
 
 On Thu, 6 Jul 2006, Dann Corbit wrote:
 
 It's the data that contains all the value.  The hardware becomes
 obsolete when it can no longer keep up with business needs.
 
 
 ... or can no longer be repaired. :)

http://www.softresint.com/charon-vax/index.htm

--
Ron Johnson, Jr.
Jefferson LA  USA

Is common sense really valid?
For example, it is common sense to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that common sense is obviously wrong.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] How to optimize query that concatenates strings?

2006-07-07 Thread badlydrawnbhoy
Hi all,

I've got a database of URLs, and when inserting new data into it I want
to make sure that there are no functionally equivalent URLs already
present. For example, 'umist.ac.uk' is functionally the same as
'umist.ac.uk/'.

I find that searching for the latter form, using string concatenation
to append the trailing slash, is much slower than searching for a
simple string - the index on the URL name isn't used to speed up the
search.

Here's an illustration

url=# explain select exists(select * from url where url = 'umist.ac.uk'
or url || '/' = 'umist.ac.uk') as present;
  QUERY PLAN

---
 Result  (cost=47664.01..47664.02 rows=1 width=0)
   InitPlan
   ->  Seq Scan on url  (cost=0.00..47664.01 rows=6532 width=38)
   Filter: ((url = 'umist.ac.uk'::text) OR ((url || '/'::text)
= 'umist.ac.uk'::text))
(4 rows)

url=# explain select exists(select * from url where url =
'umist.ac.uk') as present;
 QUERY PLAN

 Result  (cost=5.97..5.98 rows=1 width=0)
   InitPlan
   ->  Index Scan using url_idx on url  (cost=0.00..5.97 rows=1
width=38)
   Index Cond: (url = 'umist.ac.uk'::text)
(4 rows)


Is there any way I can force postgres to use the index when using the
string concatenation in the query?

Thanks in advance,

BBB


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] OLEDB connection does not want to work. Help!!

2006-07-07 Thread tommaso . gastaldi
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Is this for me? I am currently using Google to post.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] OLEDB connection does not want to work. Help!!

2006-07-07 Thread tommaso . gastaldi

Hi Merlin, as I tried to explain, I do not just need to send some SQL
to Postgres; I am working at a higher level of abstraction, where I
need the information (that must be) provided by (any) OleDb provider.

 in my opinion the support for the npgsql driver is ok (not great).  It
 is much better than the oledb provider however.

No doubt about that, but this is irrelevant for my purposes.

  In general when you have an application that needs to speak to all
  possible DBMS (and there are many, such as Reporting Tools or Query
  builders, ...), and hence rely strictly on the OleDb information, the
  OleDb is the only way to go (I believe).

 I have used basically every version of visual studio and have found
 that in most cases for most uses, you will be best off with the oledb
 provider for odbc drivers if you want to use such tools.  Second
 choice (.net only) is to use npgsql driver and typed datasets.

I am NOT talking about the OleDb provider for ODBC drivers: I have never
used it either.

  That's one of the reason of the popularity OleDb providers are gaining
  among programmers.

 ODBC is much more important standard than oledb.

Here we have different opinions. I could show you a lot of things you
cannot do without the OleDb functionality, at least if you want to talk
to all DBMSs at the same time.

 True, but oledb is basically microsoft only protocol while odbc is
 essentially open standard.  Personally, I don't mind using specific
 providers.  For example, at least 50% of my development is directly
 over libpq library.

Nowadays every producer must have its OleDb provider; it's not optional.
More or less in the same way as, for me, it is not an option to write a
program that does not use web services or isn't web enabled. Of course I
could disregard that part, but then I would lose about 90% of my
potential users (or clients).


  It is important that the Postgres people realize that a good OleDb
  provided is crucial to spread their product. Just as an example I am
  just waiting for their fix to provide support to Postgres users in one
  project of mine:
 
  http://cam70.sta.uniroma1.it/Community/
 
  and this will certainly contribute to increase (a little) the
  popularity of Postgress. (But I cannot do anything to help them if they
  don't fix it.)


You know, Merlin, sometimes people's perspectives are just different
because they come from different experiences and environments. It's not
a matter of being right or wrong.
In my experience, being able to rely on the high-level information
provided by the OleDb protocol is crucial, because it allows you to have
a unique interface for any DBMS in the world and a unique way to do
things (apart from the slight differences in SQL dialects). This allows
easy system integration. If you see, for instance, how DataTime works,
retrieving the DBMS structure while talking with any DBMS and easily
moving data from one to another, it will be clear what I mean.
If I had to write code that depends on the underlying DBMS, maintenance
would just be impossible and programming a real hell (it's already a
hell the way it is now :) !). General standards have always prevailed;
it's just a matter of time.

un caro saluto,

Tommaso


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Need help with quote escaping in exim for postgresql

2006-07-07 Thread Martijn van Oosterhout
On Fri, Jul 07, 2006 at 03:48:00PM +0200, Marc Haber wrote:
 Hi,
 
 I am the maintainer of Debian's packages for exim4, a powerful and
 versatile Mail Transfer Agent developed in Cambridge and in wide use
 throughout the Free Software Community (http://www.exim.org/).
 
 One of our daemon flavours has PostgreSQL support. Our security guys
 have found a flaw in exim regarding quote escaping for PostgreSQL. The
 bug is filed in Debian's BTS as http://bugs.debian.org/369351 and was
 transferred to exim's Bugzilla installation as
 http://www.exim.org/bugzilla/show_bug.cgi?id=107.

Whether or not the quick fix works for you depends entirely on the
encoding used by the client to talk to the database. If the connection
is encoded using UTF-8 or any of the Latin series, then it will be
fine. The only time it does not work is if the encoding is one where
the quote or backslash character can appear as the second byte of a
multibyte character. This doesn't happen with UTF-8 or any Latin
encoding.

http://www.postgresql.org/docs/techdocs.50

This bit may be useful also (especially the second point):

 There are a number of mitigating factors that may keep particular
applications from being subject to these security risks:

* If application always sends untrusted strings as out-of-line
parameters, instead of embedding them into SQL commands, it is not
vulnerable.
* If client_encoding is a single-byte encoding (e.g., one of the
LATINx family), there is no vulnerability.
* If application cannot pass invalidly encoded data to the server,
there is no vulnerability (this probably includes all Java
applications, for example, because of Java's handling of Unicode
strings).

The easiest may be to simply always set the client encoding to
something like UTF-8 and work the escaping rules so they work with
that.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Kenneth Downs




Jorge Godoy wrote:

Kenneth Downs [EMAIL PROTECTED] writes:

 We went for generating all server-side code out of a data dictionary.  This
 makes for a significant change in the way change management is handled.

 In this scenario change management becomes the analysis of "before" and
 "after" data dictionaries.  If the changes are all valid, build the code.

Ken, could you explain it a bit better?  I think this is an interesting idea.

Sure. To start off I'd say I'm one of those "biz rules belong in the
server" guys. My guess is we are on the same page there so we'll take
that as a given.

So anyway, some years ago I joined an existing project and was
eventually promoted to systems architect. Along the way I developed
their change management system from scratch (we had more salary dollars
than tools dollars). The "Aha!" moment came when I realized what may
seem obvious to many, which was that you can never, nohow, noway, never
prove ahead of time that any particular piece of code was not going to
break something. You can't even prove it will do what anybody
claims. 

I wanted a way to know by analysis, just by looking, that any
particular change to a spec would work. That is, it would do what it
was supposed to do, without stopping other things from doing what they
were supposed to do.

It so happens you can have this if you generate your code out of a spec
that is itself data. The spec has to be comprehensive, it can't just
be columns and tables. You need to be able to specify security and
derivations all in one place, that is the only way to specify all
business rules in a single place.

There are two major things you can do to make sure a spec is workable
before you start generating DDL and triggers.

First, you look for mistakes in the spec itself, such as duplicate
column names in tables, references to non-existent tables, and so forth.

Second, you look for mistakes or impossibilities in the delta-spec, the
changes to the spec. For instance, if column COL1 is char(7) and the
new spec has it listed as INT, you can stop there and tell the person
the change is not valid.
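
Purely to illustrate the kind of delta check being described (this is not Kenneth's actual system; the dictionary tables and the whitelist of allowed changes below are made up), a sketch in SQL:

-- spec_before / spec_after: one row per (table_name, column_name) with its declared type
SELECT table_name, column_name,
       b.data_type AS old_type, a.data_type AS new_type
FROM   spec_after a
JOIN   spec_before b USING (table_name, column_name)
WHERE  a.data_type <> b.data_type
  AND  NOT (b.data_type LIKE 'char%' AND a.data_type LIKE 'varchar%');
-- any row returned is an incompatible change, e.g. char(7) -> int,
-- and the upgrade stops there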

Furthermore, you can then do really cool things like generate a report
of what *would* happen if you did an upgrade, such as the creation of
new tables, changes in formulas for existing columns, new cascades,
changes in definitions of keys (added a delete cascade, removed a
delete cascade), and then give it to the customer to sign. Ha! I love
that one :)

What falls out of all of this for free is that once you have that data
dictionary you don't have to code maintenance forms anymore, because a
library file can generate any maintenance form from the dictionary
description of a particular table.

So anyway, that's the tip of the iceberg on that. Once you go to a
dictionary-based generation system, it actually changes a lot of how
you do things, not just change management.







---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Addressing: ERROR: could not access status of transaction

2006-07-07 Thread Mark Stosberg


PostgreSQL has been providing reliable service for our web hosting
company since 1997. Thanks!

Last night we got the following error during a dump of an 8.0.6
database:

 pg_dump: SQL command failed
 pg_dump: Error message from server: ERROR:  could not access status of
transaction 245900066
 DETAIL:  could not open file /usr/local/pgsql/data/pg_clog/00EA: No
such file or directory
 pg_dump: The command was: SELECT tableoid, oid, oprname, oprnamespace,
(select usename from pg_user where oprowner = usesysid) as usename,
oprcode::oid as oprcode FROM pg_operator

Another dump run during the same time frame did not have this problem,
and running the mentioned command in 'psql' produces no error.

Research shows that this could be a corrupted tuple, but it's not
clear to me what it means if it sometimes works.

This follows behavior in the past few days where we noticed PostgreSQL
core dumping repeatedly, with this error:

  PANIC:  right sibling's left-link doesn't match

The core dumps stopped when we REINDEX'ed the table mentioned in the
PANIC statement.

We were already planning to upgrade to 8.1.x Real Soon.

Are there further insights about this new error? And is my expectation
correct that a full dump/restore into 8.1.x would address it?

Thanks!

Mark

 . . . . . . . . . . . . . . . . . . . . . . . . . . .
   Mark StosbergPrincipal Developer
   [EMAIL PROTECTED] Summersault, LLC
   765-939-9301 ext 202 database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] Can Log filename include database name?

2006-07-07 Thread Francisco Reyes
I am currently using a log with the file name format: 
log_filename = 'postgresql-%Y-%m.log'


Is there any way to change the filename to start with the database name?

For now I have just added the database name to each line, but it would be 
useful to have each DB written to its own file, or even better to be able 
to specify on a per-database basis whether to log or not.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] Need help with quote escaping in exim for postgresql

2006-07-07 Thread Martijn van Oosterhout
On Fri, Jul 07, 2006 at 03:48:00PM +0200, Marc Haber wrote:
 From what I understand, the correct way would be to use
 PQescapeStringConn, but that function needs an established connection,
 and exim performs string escape early, way before the actual
 connection is established.

I just downloaded the code and looked. The code never looks or checks
the encoding of the database. This is bad from a security point of
view because that means you have no idea how your queries are going to
be interpreted.

I'd suggest adding a PQsetClientEncoding(conn, "Latin1") right after
you establish a connection. I'm not sure if Exim has any kind of
declaration about what encoding strings have internally. You could use
UTF-8, but then postgres would complain if you pass any strings that
aren't valid UTF-8. That may or may not be desirable.

SQL_ASCII may also be an option (assign no special meaning to
characters at all), but I'm less sure of that. Can email addresses
contain multibyte characters? I didn't think so... What about the
configuration file?
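
For what it's worth, the same thing can be done at the SQL level once connected; a minimal sketch (the encoding name is only an example):

SET client_encoding TO 'LATIN1';
SHOW client_encoding;   -- confirm what the server now assumes the client sends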

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




[GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Sven Willenberger
Postgresql 8.0.4 on FreeBSD 5.4

I have a table consisting of some 300 million rows that, every couple of
months, has 100 million rows deleted from it (and is immediately vacuumed
afterward). Even though it gets routinely vacuumed (the only
deletions/updates are just the quarterly ones), the free space map was
not increased in size to keep up with the growing size of the other
tables in the database, which do experience many updates, etc.

I suspect that the table is suffering from bloat (not the indexes, though,
as I drop them prior to the huge delete, then create them anew). What
would be the recommended method for reclaiming the disk space lost to
bloat? Dropping the 5 indexes on the table and doing a VACUUM FULL;
keeping the indexes and doing a VACUUM FULL (does FULL perform the same
disk-moving operations on the indexes as it does on the actual table?);
dropping the indexes except the primary key and doing a CLUSTER ON the
primary key; or keeping the indexes and doing a CLUSTER ON the primary
key (again, does CLUSTER ON just operate on the table proper?)?

What are the caveats on using one over the other? I imagine any of the
options I listed above will involve a full table lock. Are there any
differences in the amount of free disk space required for each method?

Thanks,

Sven


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] How to optimize query that concatenates strings?

2006-07-07 Thread Jacob Coby

badlydrawnbhoy wrote:

Hi all,

I've got a database of URLs, and when inserting new data into it I want
to make sure that there are no functionally equivalent URLs already
present. For example, 'umist.ac.uk' is functionally the same as
'umist.ac.uk/'.

I find that searching for the latter form, using string concatentation
to append the trailing slash, is much slower than searching for a
simple string - the index on URL name isn't used to speed up the
search.

Here's an illustration

url=# explain select exists(select * from url where url = 'umist.ac.uk'
or url || '/' = 'umist.ac.uk') as present;


Well, in that example, you should just remove the OR conditional - it 
just evaluates to false anyways.




Is there any way I can force postgres to use the index when using the
string concatenation in the query?


If you are always going to strcat with a '/', you could probably create 
a functional index or add a new column for a normalized url (which is 
what I'd lean towards).




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Addressing: ERROR: could not access status of transaction

2006-07-07 Thread Tom Lane
Mark Stosberg [EMAIL PROTECTED] writes:
 Last night we got the following error during a dump of an 8.0.6
 database:
   pg_dump: Error message from server: ERROR:  could not access status of
 transaction 245900066
 Another dump run during the same time frame did not have this problem,
 and running the mentioned command in 'psql' produces no error.

 Research shows that this could be a corrupted tuple, but it's not
 clear what it means to me if it sometimes work.

Transient errors of this kind sound like flaky hardware to me.  Might
be time to run some memory and disk diagnostics ...

 This follows behavior in the past few days where we noticed PostgreSQL
 core dumping repeatedly, with this error:
PANIC:  right sibling's left-link doesn't match
 The core dumps stopped when we REINDEX'ed the table mentioned in the
 PANIC statement.

That too could have been from a hardware fault during a previous index
update.  (There's a known bug in early 8.1.x releases that had this
symptom, but not in 8.0.x IIRC.)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Long term database archival

2006-07-07 Thread Richard Broersma Jr
of course you might need to also keep an image of the current
 OS and the hardware you're running on if you really want to be sure it
 will work in 20 years :-)

I think that in twenty years most of us will be more worried about our 
retirement than the long-term data concerns of the companies we will no 
longer be working for. :-D

Of course, some of us who really enjoy what we do for work might prefer to 
die with our work boots on.

Regards,

Richard Broersma Jr.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] duplicated values on primary key field on reindex

2006-07-07 Thread Scott Marlowe
On Thu, 2006-07-06 at 17:30, Weerts, Jan wrote:
 Scott Marlowe wrote:
  On Thu, 2006-07-06 at 16:36, Weerts, Jan wrote:
  Hi all!
  
  This was 8.1.3 and now is 8.1.4 running on Debian Sarge, locally
  compiled without any fancy options.
  
  
  While the first answer seems much more valid (the primarkey is
  an artificially created number), the second answer seems to
  be the one being presented for all further invocations of the
  above query. 
  
  I noted, that the second row does not fit the order by clause,
  so I tried a reindex of the db, but that led to a duplicate value
  error: # reindex index tw_blob_pkey;
  ERROR:  could not create unique index
  DETAIL:  Table contains duplicated values.
  
  Now that is something I don't understand at all.
  
  Since the backup for said server went postal too long ago
  unnoticed, I would prefer a repair solution. Any ideas?
  
  Can you get set of fields in that row to uniquely identify it by?
  
  If so, see if you can update that column to something else and
  continue 
 
 The only way would be to update by primarykey. But since the
 select on the primarykey field shows this strange ordering,
 I wonder, what effect an update would have. My guess would be, 
 that the correct pk value should be 1636695, but seeing only
 216305 on subsequent calls makes me think.
 
 I even have executed 
 # select * from tw_blob where primarykey = 216305;
 and receive a single row, which I don't really trust to be 
 the same one producing the error.

If there are no other fields you can use to uniquely identify that row,
then add a new column to the table and populate it from a sequence...

create sequence deargodsaveme;
alter table brokentable add column emergencyint int;
update brokentable set emergencyint=nextval('deargodsaveme');

Then use that new column, emergencyint, to select the row for an update.

Note that this kind of problem is generally a sign of faulty hardware,
so you'll need to look at your machine's hardware (memory, CPU, etc.)
and possible problems like faulty fsyncing to make sure it doesn't
happen again.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Csaba Nagy
[snip]
 as I drop them prior to the huge delete, then create them anew). What
 would be the recommended method for reclaiming the disk space lost due
 to bloat? Dropping the 5 indexes on the table and doing a VACUUM FULL,
 keeping the indexes and doing a VACUUM FULL (does FULL perform the same
 disk moving operations on the indexes as it does on the actual table?),
 dropping the indexes except the primary key and CLUSTER ON primary key,
 keeping the indexes and doing a CLUSTER ON primary key (again, does
 CLUSTER ON just operation on the table proper?)

I don't know for sure, but I guess you would get the least downtime by
not dropping the indexes before the delete, but doing a REINDEX after it.
Then cluster on the primary key...
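
In command form, the sequence being suggested would look roughly like this (a sketch only; the table and index names are made up, and CLUSTER uses the pre-8.3 "CLUSTER indexname ON tablename" spelling):

DELETE FROM big_table WHERE event_date < '2006-01-01';
REINDEX TABLE big_table;               -- rebuild all indexes in one go
CLUSTER big_table_pkey ON big_table;   -- rewrite the table, reclaiming dead space
ANALYZE big_table;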

My reasoning (correct me if I'm wrong): the deletion speed won't be
affected by the indexes; I think deletions don't touch the indexes at
all. The REINDEX command recreates all indexes at once; I think it needs
only one full table scan. That needs the indexes in place, so you
shouldn't drop them. CLUSTER is a lot faster than VACUUM FULL. The
only problem could be that I think all these operations might take more
disk space than the individual indexing + VACUUM FULL.

Are my assumptions correct?

Cheers,
Csaba.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] VACUUM and fsm_max_pages

2006-07-07 Thread Scott Marlowe
On Fri, 2006-07-07 at 01:57, DANTE Alexandra wrote:
 Good morning List,
 
 I have seen several posts on this concept but I don’t find a complete 
 response.
 I’m using BenchmarkSQL to evaluate PostgreSQL in transaction processing 
 and I work with PostgreSQL 8.1.3 on RHEL4-AS, Itanium-2 processor, 8GB RAM.
 
 The database, generated via BenchmarkSQL and used, is a 200-warehouses 
 database and its size is about 20GB. The parameter “max_fsm_pages” is 
 equal to 20000 and “max_fsm_relations” to 1000.
 
 Between two benchs, I launch a VACUUM but at the end of it, I see that 
 PostgreSQL asks me to increase the “max_fsm_pages” parameters and the 
 value proposed grows with the number of VACUUM launched…

Oh, and if you can back up your database and import it into a test
server, see how much smaller your new data/base directory is compared to
the one on your production server.  That'll give you an idea of how bloated
your database is.  10 to 30% larger is fine.  100 to 1000% larger is
bad.  You get the idea.
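
A quick way to get a comparable number from SQL on each server, as a sketch (pg_database_size and pg_size_pretty are available in 8.1):

SELECT pg_size_pretty(pg_database_size(current_database()));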

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] How to optimize query that concatenates strings?

2006-07-07 Thread Chander Ganesan




You could build a function-based index that contains the "simplified"
version of each URL (in your case, the field with the '/' stripped).
Then use the same function on the URL going in. In that case
PostgreSQL will use the index that you created already.

Take a look at the PostgreSQL documentation for function-based indexes.

select ... from ... where simplify(url) = url_col;

In the example above 'url_col' would have a function-based index that
was based on 'simplify(url_col)'.
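
For illustration, a concrete sketch of that idea (the simplify() function here is hypothetical, simply stripping trailing slashes; what matters is that the index and the query use the same expression):

CREATE OR REPLACE FUNCTION simplify(text) RETURNS text AS
  'SELECT rtrim($1, ''/'')'
LANGUAGE sql IMMUTABLE;

CREATE INDEX url_simplified_idx ON url (simplify(url));

-- compare normalized forms, so the expression index can be used
SELECT exists(SELECT 1 FROM url
              WHERE simplify(url) = simplify('umist.ac.uk/')) AS present;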
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999



badlydrawnbhoy wrote:

  Hi all,

I've got a database of URLs, and when inserting new data into it I want
to make sure that there are no functionally equivalent URLs already
present. For example, 'umist.ac.uk' is functionally the same as
'umist.ac.uk/'.

I find that searching for the latter form, using string concatentation
to append the trailing slash, is much slower than searching for a
simple string - the index on URL name isn't used to speed up the
search.

Here's an illustration

url=# explain select exists(select * from url where url = 'umist.ac.uk'
or url || '/' = 'umist.ac.uk') as present;
  QUERY PLAN

---
 Result  (cost=47664.01..47664.02 rows=1 width=0)
   InitPlan
     ->  Seq Scan on url  (cost=0.00..47664.01 rows=6532 width=38)
           Filter: ((url = 'umist.ac.uk'::text) OR ((url || '/'::text)
= 'umist.ac.uk'::text))
(4 rows)

url=# explain select exists(select * from url where url =
'umist.ac.uk') as present;
 QUERY PLAN

 Result  (cost=5.97..5.98 rows=1 width=0)
   InitPlan
     ->  Index Scan using url_idx on url  (cost=0.00..5.97 rows=1
width=38)
           Index Cond: (url = 'umist.ac.uk'::text)
(4 rows)


Is there any way I can force postgres to use the index when using the
string concatenation in the query?

Thanks in advance,

BBB


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
  





[GENERAL] How to insert .xls files into database

2006-07-07 Thread Parang Saraf
Hey,

I am using Postgresql 8.1.4 on windows. I have a large amount of data stored
in .xls files which I want to insert into my database.

The columns in .xls files are not exactly compatible with the database
schema. For example the event_id in every .xls file starts with 1 while for
my database event_id is the primary key. Also, there are some information
like event_type, event_location that are particular to every .xls file and
thus they have been mentioned only once in the .xls file but in the database
there exists a separate column for them.

For more clarification I am giving my database schema and attaching a sample
.xls file.

My database schema is as follows :
{
  event_id int4 NOT NULL,
  buoy char(1) NOT NULL,
  deployment varchar(40),
  depth int4 NOT NULL,
  event_type varchar(64),
  model_info_id varchar(256),
  start_date float8 NOT NULL,
  start_date_sd float8,
  end_date float8 NOT NULL,
  end_date_sd float8,
  mean float8,
  variance float8,
  max float8,
  min float8,
  event varchar(20) NOT NULL,
  depth_type varchar(20) NOT NULL,
  buoy_location geometry,
  duration float8,
  Amplitude_sd float8,
}

.xls file is in the attachment. Now as you can see all the bold attributes
are specified only once in the .xls files. And all the bold+italics one have
to be manipulated a bit before storing. Even event_id in every .xls file
starts with 1 but as this is a primary key I have to manipulate this also.

I think if I can transform and manipulate each row into insert statements
then I can insert the data into my database. Please guide me how to do this.
Or if there is any another way of doing this. I am relatively new in this
field so, please dont get offended if this problem is quite obvious.

Thanks
Parang Saraf
[EMAIL PROTECTED]



A1_TS_Sigma-Tdensity_multi-year_anomaly_negative.xls
Description: MS-Excel spreadsheet

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Joshua D. Drake
On Friday 07 July 2006 08:19, Sven Willenberger wrote:
 Postgresql 8.0.4 on FreeBSD 5.4

 I have a table consisting of some 300million rows that, every couple of
 months, has 100 million rows deleted from it (and immediately vacuumed
 afterward). Even though it gets routinely vacuumed (the only
 deletions/updates are just the quarterly ones), the freespace map was
 not increased in size to keep up with the growing size of the other
 tables in the database which do experience many updates,etc.

Based on the size of the table, you may want to:

Backup the table
Drop the table
Restore the table

It is possible that this will be faster in this instance.

Secondly this sounds like a perfect time for you to consider upgrading to 8.1
and making use of table partitioning. That way you can just truncate the child
table containing the old data.
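
For example, the 8.1 scheme looks roughly like this (a sketch only; the table,
column and date ranges are made up):

CREATE TABLE events (id bigint, created date, payload text);

CREATE TABLE events_2006q1 (
    CHECK (created >= DATE '2006-01-01' AND created < DATE '2006-04-01')
) INHERITS (events);

CREATE TABLE events_2006q2 (
    CHECK (created >= DATE '2006-04-01' AND created < DATE '2006-07-01')
) INHERITS (events);

-- with constraint_exclusion = on, queries that filter on "created" skip the
-- children that cannot match, and retiring a quarter is simply:
TRUNCATE TABLE events_2006q1;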

Sincerely,

Joshua D. Drake

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Greg Stark
Csaba Nagy [EMAIL PROTECTED] writes:

 I don't know for sure, but I guess you would get the least downtime by
 not dropping the indexes before the delete, but doing a reindex after it.
 Then cluster on the primary key...
 
 My reasoning (correct me if I'm wrong): the deletion speed won't be
 affected by the indexes, I think deletions don't touch the indexes at
 all. 

That's true, more or less. I think there's a small hit actually as queries set
the hint bit and the pages have to be flushed.

As long as you're just deleting and not inserting or updating

 The REINDEX command recreates all indexes at once, I think it needs
 only one full table scan. 

No, each index build has to do its own full scan. Sharing a single scan wouldn't
save much anyway: where would you store the tuples in the meantime? And why
would this temporary storage place be any faster than scanning the original
table?


-- 
greg


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Sven Willenberger
On Fri, 2006-07-07 at 09:55 -0700, Joshua D. Drake wrote:
 On Friday 07 July 2006 08:19, Sven Willenberger wrote:
  Postgresql 8.0.4 on FreeBSD 5.4
 
  I have a table consisting of some 300million rows that, every couple of
  months, has 100 million rows deleted from it (and immediately vacuumed
  afterward). Even though it gets routinely vacuumed (the only
  deletions/updates are just the quarterly ones), the freespace map was
  not increased in size to keep up with the growing size of the other
  tables in the database which do experience many updates,etc.
 
 Based on the size of the table, you may want to:
 
 Backup the table
 Drop the table
 Restore the table
 
 It is possible that this will be faster in this instance.
 
 Secondly this sounds like a perfect time for you to consider upgrading to 8.1
 and making use of table partitioning. That way you can just truncate the child
 table containing the old data.
 
 Sincerely,
 
 Joshua D. Drake

Doing a quick check reveals that the relation in question currently
consumes 186GB of space (which I highly suspect is largely bloat). The
delete was just run this past weekend as was the recreation of the
indexes. I have 50GB of disk space left; If I vacuum full, it does not
need to create a temporary copy of the relation and indexes like cluster
does, does it? At this point, I think CLUSTER ON is out of the question
due to the need to create the temporary table and indexes (I will run
out of space during the operation).

I do plan on migrating the whole mess to a new server which will run 8.1
(I had looked at inheritance for partitioning, I am glad to see that 8.1
took the concept and ran with it further :)  ) This new server will use
an external SAS array so I should simply be able to add another array as
the need arises and partition to it via tablespace.

Thanks to all who offered suggestions; it would appear that at this
stage my only option to buy some time is try a vacuum full. My final
question: can I leave the indexes in place when I vacuum full? I assume
this will only operate on the table itself?

Sven


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Joshua D. Drake

  Sincerely,
 
  Joshua D. Drake

 Doing a quick check reveals that the relation in question currently
 consumes 186GB of space (which I highly suspect is largely bloat).

Good lord... 186 gig for a 300 million row table? Unless those are seriously
large rows, you have a TON of bloat.

Joshua D. Drake

-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Franz . Rasper
How long does it take to do a database dump (with gzip -1 via a pipe and a
redirect), drop the database, recreate it, and restore it from the backup?
That is my solution, but I don't know how long it will take to restore your
database, and I don't have databases that large.

 Secondly this sounds like a perfect time for you to consider upgrading to
8.1

I would not do this without a test (not only a dump/restore test; I would
test it with your application too).

Greetings,

-Franz  

-Original Message-
From: Sven Willenberger [mailto:[EMAIL PROTECTED] 
Sent: Friday, July 7, 2006 19:26
To: Joshua D. Drake
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] VACUUM FULL versus CLUSTER ON


On Fri, 2006-07-07 at 09:55 -0700, Joshua D. Drake wrote:
 On Friday 07 July 2006 08:19, Sven Willenberger wrote:
  Postgresql 8.0.4 on FreeBSD 5.4
 
  I have a table consisting of some 300million rows that, every couple of
  months, has 100 million rows deleted from it (and immediately vacuumed
  afterward). Even though it gets routinely vacuumed (the only
  deletions/updates are just the quarterly ones), the freespace map was
  not increased in size to keep up with the growing size of the other
  tables in the database which do experience many updates,etc.
 
 Based on the size of the table, you may want to:
 
 Backup the table
 Drop the table
 Restore the table
 
 It is possible that this will be faster in this instance.
 
 Secondly this sounds like a perfect time for you to consider upgrading to
8.1
 and making use of table partitioning. That way you can just truncate the
child
 table containing the old data.
 
 Sincerely,
 
 Joshua D. Drake

Doing a quick check reveals that the relation in question currently
consumes 186GB of space (which I highly suspect is largely bloat). The
delete was just run this past weekend as was the recreation of the
indexes. I have 50GB of disk space left; If I vacuum full, it does not
need to create a temporary copy of the relation and indexes like cluster
does, does it? At this point, I think CLUSTER ON is out of the question
due to the need to create the temporary table and indexes (I will run
out of space during the operation).

I do plan on migrating the whole mess to a new server which will run 8.1
(I had looked at inheritance for partitioning, I am glad to see that 8.1
took the concept and ran with it further :)  ) This new server will use
an external SAS array so I should simply be able to add another array as
the need arises and partition to it via tablespace.

Thanks to all who offered suggestions; it would appear that at this
stage my only option to buy some time is try a vacuum full. My final
question: can I leave the indexes in place when I vacuum full? I assume
this will only operate on the table itself?

Sven


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] VACUUM FULL versus CLUSTER ON

2006-07-07 Thread Sven Willenberger
On Fri, 2006-07-07 at 10:41 -0700, Joshua D. Drake wrote:
   Sincerely,
  
   Joshua D. Drake
 
  Doing a quick check reveals that the relation in question currently
  consumes 186GB of space (which I highly suspect is largely bloat).
 
  Good lord... 186 gig for a 300 million row table? Unless those are seriously
 large rows, you have a TON of bloat.
 
 Joshua D. Drake
 

Yes, that number came from the dbsize functions (in contrib) so I don't
know if that includes the associated indexes as well. The rows are
fairly large, yes, but not enough (IMO) to account for that size. It
will be interesting to see the final size after the vacuum full (which
is the method I have settled on to reclaim space this go round).
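
Once the data is on the 8.1 box, the built-in size functions make that easy to
check (the table name below is just a placeholder):

SELECT pg_size_pretty(pg_relation_size('big_table'))       AS heap_only,
       pg_size_pretty(pg_total_relation_size('big_table')) AS with_indexes_and_toast;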

Sven


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Long term database archival

2006-07-07 Thread Steve Atkins


On Jul 7, 2006, at 1:19 AM, Csaba Nagy wrote:


On Thu, 2006-07-06 at 20:57, Karl O. Pinc wrote:

Hi,

What is the best pg_dump format for long-term database
archival?  That is, what format is most likely to
be able to be restored into a future PostgreSQL
cluster.



Should we want to restore a 20 year old backup
nobody's going to want to be messing around with
decoding a custom format dump if it does not
just load all by itself.


Karl, I would say that if you really want data from 20 years ago, keep
it in the custom format, along with a set of the sources of postgres
which created the dump. Then in 20 years when you need it, you'll
compile the sources and load the data in the original postgres
version... of course you might need to also keep an image of the current
OS and the hardware you're running on if you really want to be sure it
will work in 20 years :-)


I've been burned by someone doing that, and then being unable to
find a BCPL compiler.

So don't do that.

Store them in a nice, neutral ASCII format, along with all the
documentation. If you can't imagine extracting the
data with a small perl script and less than a day's work today
then your successor will likely curse your name in 20 years
time.

Cheers,
  Steve


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] How to insert .xls files into database

2006-07-07 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Parang Saraf wrote:
 Hey,
 
 I am using Postgresql 8.1.4 on windows. I have a large amount of data
 stored
 in .xls files which I want to insert into my database.
 
 The columns in .xls files are not exactly compatible with the database
 schema. For example the event_id in every .xls file starts with 1 while for
 my database event_id is the primary key. Also, there are some information
 like event_type, event_location that are particular to every .xls file and
 thus they have been mentioned only once in the .xls file but in the
 database
 there exists a separate column for them.
 
 For more clarification I am giving my database schema and attaching a
 sample
 .xls file.
 
 My database schema is as follows :
 {
  event_id int4 NOT NULL,
  buoy char(1) NOT NULL,
  deployment varchar(40),
  depth int4 NOT NULL,
  event_type varchar(64),
  model_info_id varchar(256),
  start_date float8 NOT NULL,
  start_date_sd float8,
  end_date float8 NOT NULL,
  end_date_sd float8,
  mean float8,
  variance float8,
  max float8,
  min float8,
  event varchar(20) NOT NULL,
  depth_type varchar(20) NOT NULL,
  buoy_location geometry,
  duration float8,
  Amplitude_sd float8,
 }
 
 .xls file is in the attachment. Now as you can see all the bold attributes
 are specified only once in the .xls files. And all the bold+italics one
 have
 to be manipulated a bit before storing. Even event_id in every .xls file
 starts with 1 but as this is a primary key I have to manipulate this also.
 
 I think if I can transform and manipulate each row into insert statements
 then I can insert the data into my database. Please guide me how to do
 this.
 Or if there is any another way of doing this.
 
 I am relatively new in this field so, please dont get offended if this
 problem is quite obvious.

OpenOffice.org v2 can interface .xls files with PostgreSQL.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Is common sense really valid?
For example, it is common sense to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that common sense is obviously wrong.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFErqcHS9HxQb37XmcRAj6bAKCRGCZMeKXbaIUewBFAPDnko8t/kACfU+sa
7EiEI+V2LEGD1OKJh+8IDeU=
=Ahtm
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Michael Loftis



--On July 7, 2006 12:35:53 PM + Roman Neuhauser [EMAIL PROTECTED] 
wrote:



# [EMAIL PROTECTED] / 2006-07-06 22:41:27 -0600:

OK I know this is an odd question but I'm working on an app that will
rely  more and more on database driven functions, and while the app's
source is  in SVN, and I intend for the source of the SQL scripts to
also be there, I  was wondering...what are people doing for version
control and change  management on their custom (esp PL/pgSQL and say
PL/Perl) functions?


Well, people use a version control system.

Do you have any specific questions?


Yes, how exactly do you use it?  Since there's no way to directly control 
what's in the DB via a VCS, how do you verify that what is in the 
DB is also in the VCS, etc?  (I'm intentionally asking a bit of a 'dumb' 
question because I really want to find out exactly what others are doing).




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] Version/Change Management of functions?

2006-07-07 Thread Joshua D. Drake
On Friday 07 July 2006 13:08, Michael Loftis wrote:
 --On July 7, 2006 12:35:53 PM + Roman Neuhauser [EMAIL PROTECTED]

 wrote:
 # [EMAIL PROTECTED] / 2006-07-06 22:41:27 -0600:
  OK I know this is an odd question but I'm working on an app that will
  rely  more and more on database driven functions, and while the app's
  source is  in SVN, and I intend for the source of the SQL scripts to
  also be there, I  was wondering...what are people doing for version
  control and change  management on their custom (esp PL/pgSQL and say
  PL/Perl) functions?
 
  Well, people use a version control system.
 
  Do you have any specific questions?

 Yes, how exactly do you use it.  Since there's no way to directly control
 whats in the DB via a VCS, further, how do you verify that what is in the
 DB is also in the VCS, etc?  (I'm intentionally asking a bit of a 'dumb'
 question because I really want to find out exactly what others are doing).

I take text based schema dumps using the same file name each time and commit 
them so that subversion can tell me the differences.

Joshua D. Drake





 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org

-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] How to insert .xls files into database

2006-07-07 Thread Adrian Klaver
I guess the solution depends on what is a 'large amount of data'. The most 
time consuming part is going to be converting the single data elements at the 
top of each sheet into multiple elements. I would create columns for the data 
in the sheet. At the same time I would order the columns to match the 
database schema. Then it would be a matter of cut and paste to fill the columns 
with the data. The event id's could be renumbered using Excel's series 
generator to create a non repeating set of id's. If the amount of data was 
very large it might pay to create some macros to do the work. Once the data 
was filled in you would have a couple of choices. One, as mentioned by Ron 
would be to use OpenOffice v2 to dump the data into the database. The other 
would be to save the data as CSV and use the psql \copy command to move the 
data into the table.
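
The \copy step is then a one-liner from psql; a sketch, assuming the CSV was
saved with its columns in table order and a header row (table and file names
are made up):

\copy events from 'events.csv' with csv header
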
On Friday 07 July 2006 09:40 am, Parang Saraf wrote:
 Hey,

 I am using Postgresql 8.1.4 on windows. I have a large amount of data
 stored in .xls files which I want to insert into my database.

 The columns in .xls files are not exactly compatible with the database
 schema. For example the event_id in every .xls file starts with 1 while for
 my database event_id is the primary key. Also, there are some information
 like event_type, event_location that are particular to every .xls file and
 thus they have been mentioned only once in the .xls file but in the
 database there exists a separate column for them.

 For more clarification I am giving my database schema and attaching a
 sample .xls file.

 My database schema is as follows :
 {
   event_id int4 NOT NULL,
   buoy char(1) NOT NULL,
   deployment varchar(40),
   depth int4 NOT NULL,
   event_type varchar(64),
   model_info_id varchar(256),
   start_date float8 NOT NULL,
   start_date_sd float8,
   end_date float8 NOT NULL,
   end_date_sd float8,
   mean float8,
   variance float8,
   max float8,
   min float8,
   event varchar(20) NOT NULL,
   depth_type varchar(20) NOT NULL,
   buoy_location geometry,
   duration float8,
   Amplitude_sd float8,
 }

 .xls file is in the attachment. Now as you can see all the bold attributes
 are specified only once in the .xls files. And all the bold+italics one
 have to be manipulated a bit before storing. Even event_id in every .xls
 file starts with 1 but as this is a primary key I have to manipulate this
 also.

 I think if I can transform and manipulate each row into insert statements
 then I can insert the data into my database. Please guide me how to do
 this. Or if there is any another way of doing this.

 I am relatively new in this field so, please dont get offended if this
 problem is quite obvious.

 Thanks
 Parang Saraf
 [EMAIL PROTECTED]

-- 
Adrian Klaver   
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] How to optimize query that concatenates strings?

2006-07-07 Thread Stephane Bortzmeyer
On Fri, Jul 07, 2006 at 04:29:51AM -0700,
 badlydrawnbhoy [EMAIL PROTECTED] wrote 
 a message of 48 lines which said:

 I've got a database of URLs, and when inserting new data into it I
 want to make sure that there are no functionally equivalent URLs
 already present. For example, 'umist.ac.uk' is functionally the same
 as 'umist.ac.uk/'.

IMHO, your problem seems to be an instance of a very general class:
data that needs canonicalization. For instance, I work for a domain
name registry and domain names are case-insensitive. Do we store the
domain names blindly and then always use ILIKE or other
case-insensitive operators? No, we canonicalize domain names by
changing them to lower-case. That way:

* we do not have to think of using case-insensitive operators,

* indexes do work.

This is what I recommend here: decide on a canonical form and
canonicalize everything when it comes into the database (it is easy to
do it from a trigger).

If the original form is important, you can still store it in a column
intended for display, not for searching.

Here is a way to canonicalize, with a trigger. The function
canonicalize is left as an exercise (you can write it in PL/pgSQL,
C, Python, etc):

CREATE OR REPLACE FUNCTION force_canonicalization() RETURNS TRIGGER
   AS 'BEGIN
  NEW.url = canonicalize(NEW.url);
  RETURN NEW;
   END;'
 LANGUAGE PLPGSQL;

CREATE TRIGGER force_canonicalization
   BEFORE INSERT ON Things
   FOR EACH ROW
   EXECUTE PROCEDURE force_canonicalization();
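
One possible canonicalize(), just as a sketch; here the canonical form is
lower-cased with any trailing slash stripped, but adjust it to whatever you
decide is canonical:

CREATE OR REPLACE FUNCTION canonicalize(text) RETURNS text AS
   'SELECT rtrim(lower($1), ''/'')'
   LANGUAGE SQL IMMUTABLE;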


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] Bug? Changing where distinct occurs produces error?

2006-07-07 Thread Michael Loftis

OK I'm either insane or found a bug in 8.1.3

If you execute say:

SELECT DISTINCT(ua.user_id),pa.poll_id FROM user_answers ua, poll_answers 
pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';


Everything is fine, however if you run

SELECT pa.poll_id,DISTINCT(ua.user_id) FROM user_answers ua, poll_answers 
pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';


ERROR:  syntax error at or near DISTINCT at character 19
LINE 1: SELECT pa.poll_id,DISTINCT(ua.user_id) FROM user_answers ua,...
 ^
Further if I wrap it in a SUM (with a group by pa.poll_id) I get no error.

I'm a little confused as to what is wrong with my second example there?

--
Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds.
-- Samuel Butler

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Bug? Changing where distinct occurs produces error?

2006-07-07 Thread Stephan Szabo

On Fri, 7 Jul 2006, Michael Loftis wrote:

 OK I'm either insane or found a bug in 8.1.3

 If you execute say:

 SELECT DISTINCT(ua.user_id),pa.poll_id FROM user_answers ua, poll_answers
 pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';

 Everything is fine, however if you run

 SELECT pa.poll_id,DISTINCT(ua.user_id) FROM user_answers ua, poll_answers
 pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';

This statement is invalid. DISTINCT is a set quantifier and either comes
before the select list or as the first thing in a set function
specification.
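
For reference, both of these forms are accepted (using the tables from the
original query), and they mean different things:

-- DISTINCT applies to the whole select list: one row per distinct (user_id, poll_id) pair
SELECT DISTINCT ua.user_id, pa.poll_id
FROM user_answers ua, poll_answers pa
WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active = 'Y';

-- DISTINCT ON keeps the first row for each user_id, as determined by the ORDER BY
SELECT DISTINCT ON (ua.user_id) ua.user_id, pa.poll_id
FROM user_answers ua, poll_answers pa
WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active = 'Y'
ORDER BY ua.user_id, pa.poll_id;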

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Bug? Changing where distinct occurs produces error?

2006-07-07 Thread Richard Broersma Jr
 SELECT DISTINCT(ua.user_id),pa.poll_id FROM user_answers ua, poll_answers 
 pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';
 
 Everything is fine, however if you run
 
 SELECT pa.poll_id,DISTINCT(ua.user_id) FROM user_answers ua, poll_answers 
 pa WHERE pa.poll_answer_id = ua.poll_answer_id AND ua.active='Y';
 
 ERROR:  syntax error at or near DISTINCT at character 19
 LINE 1: SELECT pa.poll_id,DISTINCT(ua.user_id) FROM user_answers ua,...
   ^

Notice: http://www.postgresql.org/docs/8.1/interactive/sql-select.html

According to the syntax for SELECT, the DISTINCT | DISTINCT ON clause must come
first, before the column list.

So perhaps it is designed that way.

Regards,

Richard Broersma Jr.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


re: [GENERAL] How to insert .xls files into database

2006-07-07 Thread lanczos
 [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Klaver
 
 I guess the solution depends on what is a 'large amount of data'. The 
 most time consuming part is going to be converting the single data 
 elements at the top of each sheet into multiple elements. I would 
 create columns for the data in the sheet. At the same time I would 
 order the columns to match the database schema. Then it would a matter 
 of cut and paste to fill the columns with the data. The event id's 
 could be renumbered using Excel's series generator to create a non 
 repeating set of id's. If the amount of data was very large it might 
 pay to create some macros to do the work. Once the data was filled in 
 you would have a couple of choices. One, as mentioned by Ron would be 
 to use OpenOffice v2 to dump the data into the database. The other 
 would be to save the data as CSV and use the psql \copy command to 
 move the data into the table.
 On Friday 07 July 2006 09:40 am, Parang Saraf wrote:

Everything you described is familiar to me, except the OpenOffice v2 dump - 
could you explain this in more detail, please? I tried to do it many times, 
without success.

Thank You

Tomas

[del]

 -- 
 Adrian Klaver 
 [EMAIL PROTECTED]


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Bug? Changing where distinct occurs produces error?

2006-07-07 Thread Michael Loftis



--On July 7, 2006 3:22:01 PM -0700 Richard Broersma Jr 
[EMAIL PROTECTED] wrote:



Notice: http://www.postgresql.org/docs/8.1/interactive/sql-select.html

According to the syntax for SELECT, the DISTINCT | DISTINCT ON clause must come
first, before the column list.

So perhaps it is designed that way.



Ahhh I see, I keep thinking of DISTINCT as an aggregate function.  That's 
my mistake.


Thanks all!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] How to insert .xls files into database

2006-07-07 Thread John D. Burger
One option is to write a Python translator to create CSV files, or even 
an uploader to go directly from the Excel files to the database.  There 
is at least one module to read Excel files, in all their complexity:


  http://cheeseshop.python.org/pypi/xlrd/0.5.2

and a number of Postgres modules:

  http://wiki.python.org/moin/PostgreSQL

- John Burger
  MITRE


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] How to insert .xls files into database

2006-07-07 Thread Adrian Klaver
So far I have only got this to work with the Postgres SDBC driver-
http://dba.openoffice.org/drivers/postgresql/index.html
1) Open the Data Source (F4) window in the spreadsheet.
2) Make a connection to the database. I usually do this by opening a table. 
This is fairly important, otherwise when you do the import OO will try to 
establish a connection at the same time as it is importing the data and end up 
hanging.
3)Select the data in the spreadsheet you want to import. The column headings 
will become the field names. NOTE: You want the data to be as plain as 
possible. OpenOffice tends to auto-format the cell contents, i.e. changing 1/2 
to a typeset fraction and changing straight quotes to true (curly) quotes. This 
introduces formatting that doesn't translate well.
4)Drag the selected data to the Tables heading of the appropriate data source. 
At this point a dialog form will pop up.
5)Name the table. You will need to schema qualify i.e. public.test_table. Make 
sure you select Create a Primary Key. Without it, the table will be created but 
the data will not be imported. This is why I use the SDBC driver; the JDBC 
driver does not seem to support this step.
6) Select Next. Here you can choose the columns you wish to import.
7) Select Next. Here you can modify the data types of the columns.
8) Select Create. The table should be created and the data inserted.

On Friday 07 July 2006 03:41 pm, [EMAIL PROTECTED] wrote:
  [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Klaver
 
  I guess the solution depends on what is a 'large amount of data'. The
  most time consuming part is going to be converting the single data
  elements at the top of each sheet into multiple elements. I would
  create columns for the data in the sheet. At the same time I would
  order the columns to match the database schema. Then it would be a matter
  of cut and paste to fill the columns with the data. The event id's
  could be renumbered using Excel's series generator to create a non
  repeating set of id's. If the amount of data was very large it might
  pay to create some macros to do the work. Once the data was filled in
  you would have a couple of choices. One, as mentioned by Ron would be
  to use OpenOffice v2 to dump the data into the database. The other
  would be to save the data as CSV and use the psql \copy command to
  move the data into the table.
  On Friday 07 July 2006 09:40 am, Parang Saraf wrote:

 Everything you described is familiar to me, except the OpenOffice v2 dump -
 could you explain this in more detail, please? I tried to do it many times,
 without success.

 Thank You

 Tomas

 [del]

  --
  Adrian Klaver
  [EMAIL PROTECTED]

 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

-- 
Adrian Klaver   
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Postmaster is starting but shutting when trying to connect (Windows)

2006-07-07 Thread Magnus Hagander
 Hello,
 
 i have a PostgreSQL (8.1) installation for testing purposes 
 which was running fine for several months now (Windows XP). I 
 was working with it yesterday, and today after booting my 
 computer and restarting the service (I'm starting the service 
 manually, because I don't need the server running all the 
 time) postmaster comes up fine, no entries in the log file.
 
 But as soon as I try to connect to the server (does not 
 matter what type of client, psql, PgAdmin or JDBC) I get an 
 error message in the log file. Here is the full log file:

This looks exactly like the issues we've seen with broken antivirus or
personal firewall software. Make sure you don't have any such installed
(actually installed, not just enabled), and if you do, try to uninstall
them. If you don't, but had before, check for any remaining pieces.
There should be posts in the archives with details about it.

Again, note that just disabling things usually isn't enough; you need to
actually uninstall.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly