[GENERAL] Query optimisation

2008-04-06 Thread Naz Gassiep
The following query is taking a long time to execute, around 500ms. It needs to run in 
about 100ms or so in order to be acceptable. Can anyone spot any optimisations I could 
make to this query to bring the exec time down? Have I designed this query correctly?

Is joining to the same table every time like that the right thing to be doing?

I'm pretty sure I have indexes on the right fields etc. Thanks to anyone who can 
spare a minute or three to look at this.


Here's the query and the query plan:


 SELECT images.imageid,
 images_sites.siteid,
 images.title,
 images.description,
 albums.albumid,
 albums.title AS albumtitle,
 albums.private AS album_private,
 images.entered,
 images.taken,
 images.private,
 images.comments,
 images.showcomments,
 images.shownames,
 images.commentlimit,
 images.commentlimit_user,
 images.trashed,
 images.deleted,
 imageid_file.fileid AS imageid_fileid,
 imageid_file.filename AS imageid_filename,
 imageid_file.size AS imageid_size,
 imageid_file.md5 AS imageid_md5,
 imageid_file.sha1 AS imageid_sha1,
 size120_file.fileid AS size120_fileid,
 size120_file.filename AS size120_filename,
 size240_file.fileid AS size240_fileid,
 size240_file.filename AS size240_filename,
 size420_file.fileid AS size420_fileid,
 size420_file.filename AS size420_filename,
 size600_file.fileid AS size600_fileid,
 size600_file.filename AS size600_filename,
 size800_file.fileid AS size800_fileid,
 size800_file.filename AS size800_filename,
 size1024_file.fileid AS size1024_fileid,
 size1024_file.filename AS size1024_filename,
 size130sq_file.fileid AS size130sq_fileid,
 size130sq_file.filename AS size130sq_filename,
 size240sq_file.fileid AS size240sq_fileid,
 size240sq_file.filename AS size240sq_filename,
 size420sq_file.fileid AS size420sq_fileid,
 size420sq_file.filename AS size420sq_filename
FROM images
JOIN files imageid_file ON (images.imageid = imageid_file.fileid)
JOIN files size120_file ON (images.size120 = size120_file.fileid)
JOIN files size240_file ON (images.size240 = size240_file.fileid)
JOIN files size420_file ON (images.size420 = size420_file.fileid)
JOIN files size600_file ON (images.size600 = size600_file.fileid)
JOIN files size800_file ON (images.size800 = size800_file.fileid)
JOIN files size1024_file ON (images.size1024 = size1024_file.fileid)
JOIN files size130sq_file ON (images.size130sq = size130sq_file.fileid)
JOIN files size240sq_file ON (images.size240sq = size240sq_file.fileid)
JOIN files size420sq_file ON (images.size420sq = size420sq_file.fileid)
JOIN images_sites ON (images_sites.imageid = images.imageid)
 LEFT OUTER JOIN albums ON (images_sites.albumid = albums.albumid)
   WHERE images_sites.siteid = 1
 AND images_sites.albumid = 6
 AND (albums.private IS NULL OR albums.private = 5)
 AND images.private = 5
ORDER BY images.entered;



                                  QUERY PLAN
------------------------------------------------------------------------------
Sort  (cost=14045.35..14045.43 rows=34 width=404) (actual time=507.877..507.881 rows=11 loops=1)
  Sort Key: images.entered
  Sort Method:  quicksort  Memory: 22kB
  ->  Nested Loop  (cost=13084.62..14044.48 rows=34 width=404) (actual time=507.409..507.814 rows=11 loops=1)
        ->  Nested Loop  (cost=13084.62..13986.47 rows=34 width=395) (actual time=507.399..507.724 rows=11 loops=1)
              ->  Nested Loop  (cost=13084.62..13928.46 rows=34 width=386) (actual time=507.389..507.642 rows=11 loops=1)
   

Re: [GENERAL] Query optimisation

2008-04-06 Thread Naz Gassiep




As you can see, they all are the same table, just repeatedly joined
with aliases. The images table has several fields, each one referring
to a different sized version of the image. It then has to join against
the files table for each size to get the file that corresponds with
that image version.
- Naz.


Craig Ringer wrote:

  Naz Gassiep wrote:

  
  
JOIN files imageid_file ON (images.imageid = imageid_file.fileid)
JOIN files size120_file ON (images.size120 = size120_file.fileid)
JOIN files size240_file ON (images.size240 = size240_file.fileid)
JOIN files size420_file ON (images.size420 = size420_file.fileid)
JOIN files size600_file ON (images.size600 = size600_file.fileid)
JOIN files size800_file ON (images.size800 = size800_file.fileid)
JOIN files size1024_file ON (images.size1024 = size1024_file.fileid)
JOIN files size130sq_file ON (images.size130sq = size130sq_file.fileid)
JOIN files size240sq_file ON (images.size240sq = size240sq_file.fileid)
JOIN files size420sq_file ON (images.size420sq = size420sq_file.fileid)
JOIN images_sites ON (images_sites.imageid =
images.imageid)

  
  
That's a whole lot of joins.

Do all those tables have the same fields? Can you unify them into one
table with an additional field like "imagesize" and use an appropriate
WHERE clause when looking up the table?

--
Craig Ringer
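
A minimal sketch of that suggestion as I read it (table and column names are 
hypothetical): replace the per-size FK columns on images with one link table 
keyed by image and size, so the ten joins collapse into one.

-- One row per (image, size) pair instead of ten fileid columns on images
CREATE TABLE image_files (
    imageid   integer NOT NULL REFERENCES images,
    imagesize text    NOT NULL,  -- '120', '240', ..., '130sq', 'original'
    fileid    integer NOT NULL REFERENCES files,
    PRIMARY KEY (imageid, imagesize)
);

-- One join then fetches every size for an image; the app pivots the rows
SELECT imf.imagesize, f.fileid, f.filename
  FROM image_files imf
  JOIN files f ON (f.fileid = imf.fileid)
 WHERE imf.imageid = 1;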

  





[GENERAL] Serial Data Type

2008-04-02 Thread Naz Gassiep
I have just created a table using SELECT INTO, however the PK was 
supposed to be a serial. It is now an integer. To make it a serial I 
just create the seq and set the default to be the nextval() of that 
sequence, right? Is there anything else I need to do? It'll maintain the 
transactional safety of a serial-created default, right? I.e., it won't 
roll back seq values when a transaction aborts, will it?

Thanks,
- Naz.
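
A minimal sketch of that conversion, assuming the table is called foo with a PK 
column id (names are illustrative only):

-- create the sequence and start it past the existing ids
CREATE SEQUENCE foo_id_seq;
SELECT setval('foo_id_seq', (SELECT max(id) FROM foo));
-- make new rows draw from it, just as a real serial would
ALTER TABLE foo ALTER COLUMN id SET DEFAULT nextval('foo_id_seq');
-- optionally tie the sequence's lifecycle to the column (8.2+)
ALTER SEQUENCE foo_id_seq OWNED BY foo.id;

As for the rollback question: nextval() is never rolled back, whether the default 
came from SERIAL or was set up by hand, so the behaviour is the same either way.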



[GENERAL] Locale / Encoding mismatch

2008-03-30 Thread Naz Gassiep
I have just attempted to upgrade to 8.3.1 and I now get this error when 
trying to create a UTF8 DB:


[EMAIL PROTECTED]:~$ createdb twerl -E utf8
createdb: database creation failed: ERROR:  encoding UTF8 does not match 
server's locale en_AU

DETAIL:  The server's LC_CTYPE setting requires encoding LATIN1.

I have never bothered with the server's locale; all the locale handling 
in the app gets done within the DB itself. I guess I now have to dig 
into the murky world of system locales and how they affect my app 
ecosystem. I don't suppose there is an easy solution to this, and a short 
explanation of why it now occurs?




Re: [GENERAL] Locale / Encoding mismatch

2008-03-30 Thread Naz Gassiep



Short answer is: use en_AU.UTF-8 for your locale.  If it doesn't exist
you can create it using /etc/locale.gen (assuming you're running some
kind of linux)
  


I've just installed that locale on my system (Debian Sarge). However I'm 
still getting the error. I only set the locale for that user; could it 
be that the locale needs to be set to UTF8 system-wide? Here's the output:


[EMAIL PROTECTED]:~$ locale
LANG=en_AU.UTF8
LANGUAGE=en_AU.UTF8:en_US.UTF8:en_GB.UTF8:en.UTF8
LC_CTYPE=en_AU.UTF8
LC_NUMERIC=en_AU.UTF8
LC_TIME=en_AU.UTF8
LC_COLLATE=en_AU.UTF8
LC_MONETARY=en_AU.UTF8
LC_MESSAGES=en_AU.UTF8
LC_PAPER=en_AU.UTF8
LC_NAME=en_AU.UTF8
LC_ADDRESS=en_AU.UTF8
LC_TELEPHONE=en_AU.UTF8
LC_MEASUREMENT=en_AU.UTF8
LC_IDENTIFICATION=en_AU.UTF8
LC_ALL=
[EMAIL PROTECTED]:~$ createdb twerl -E utf8
createdb: database creation failed: ERROR:  encoding UTF8 does not match 
server's locale en_AU

DETAIL:  The server's LC_CTYPE setting requires encoding LATIN1.



Re: [GENERAL] Survey: renaming/removing script binaries (createdb, createuser...)

2008-03-26 Thread Naz Gassiep



1) What type of names do you prefer?
---
a) old notation - createdb, createuser ...
b) new one with pg_ prefix - pg_createdb, pg_creteuser ...
c) new one with pg prefix - pgcreatedb, pgcreateuser ...
d) remove them - psql is the solution
e) remove them - pgadmin is the solution

a)


2) How often do you use these tools?
---
a) every day (e.g. in my cron)
b) one per week
c) one time
d) never

b)


3) What name of initdb do you prefer?
---
a) initdb
b) pg_initdb
c) pg_init
d) pg_ctl -d dir init  (replace initdb with pg_ctl new functionality)
e) What is initdb? My start/stop script does it automatically.

a)


4) How do you perform VACUUM?
-
a) vacuumdb - shell command
b) VACUUM - SQL command
c) autovacuum
d) What is vacuum?

b)

We're not seriously thinking of changing these, are we? Once a command 
set has been in use for as long a time as the PG command set has, any 
benefit that may be derived by new users with an aversion to 
documentation reading is vastly offset by the confusion that would 
result among long-time users whose scripts, tools and mental 
processes all have the old names hardcoded in.


I can't imagine how there would be a nomenclature clash; if there is, 
then just take one of the tools out of the path, use symlinks or put 
calling scripts in the path instead. These are suboptimal solutions, 
granted, but *any* naming scheme we change to will be subject to the 
possibility of naming clashes with another package with a similar name, 
unless we make the binaries have long, verbose names. I don't know about 
you, but I don't fancy having to type "postgresqlclient dbname" to start 
a DB. I like "psql dbname".


So I ask again, we're not seriously thinking about this are we?
Regards,
- Naz.



[GENERAL] Read/Write restriction mechanism

2008-01-08 Thread Naz Gassiep

A tangentially PG related question:

In a PHP project I have several functions that I use for DB operations. 
I only want to allow one of them to write; all the others are for 
reading only. I was thinking that a way I can enforce this would be to 
check that the read-only ones only have queries where the first 
non-whitespace character is 'S'. This is not a security thing, user 
defined queries are totally disallowed; this is just a "so developers 
don't forget" measure.


Checking the first char like that seems awfully hackish to me, although 
I can't see any reason it wouldn't work. Does anyone have any better 
ideas? (Using DB-level perms is out, as it is the function usage I'm 
trying to control, not the connections.)
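
One database-side alternative, sketched on the assumption that each function wraps 
its statements in its own transaction (table name hypothetical): mark the 
transaction read-only, and any write fails regardless of what the query string 
looks like.

BEGIN;
SET TRANSACTION READ ONLY;
SELECT * FROM sometable;     -- allowed
UPDATE sometable SET x = 1;  -- rejected in a read-only transaction
COMMIT;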




Re: [GENERAL] Hash Indexes

2008-01-07 Thread Naz Gassiep



Why are hash indexes obviously best? In an ideal world with a good
implementation maybe, but postgresql b-trees are really quite good.
  
Because doing normal queries on a table where there are large text 
blocks is unlikely to be a good idea. E.g.,:


SELECT * FROM table WHERE textcol = 'a 4kb block of text';


You could always do something like:

CREATE INDEX foo ON table((md5(textcol)));

Then it will get used in queries like:
SELECT * FROM table WHERE md5(textcol) = md5('text');
  
That's exactly what I was considering doing, however there is always the 
chance of a hash collision. Yes, this is a very remote chance, however 
the ramifications of a collision under those circumstances are 
potentially catastrophic. Think of a user being delivered text that 
contains confidential and sensitive material as opposed to the latest 
memo about the cleaning of toilets.


I would assume that hash indexes have inbuilt mechanisms for collision 
checking before returning the row as a match. Am I correct in this 
assumption?


Best regards,
- Naz.
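
A belt-and-braces sketch of the md5-index approach (hypothetical table and column 
names): use the functional index for the lookup, but repeat the raw comparison so 
a collision can never return the wrong row.

CREATE INDEX docs_textcol_md5 ON docs ((md5(textcol)));

SELECT *
  FROM docs
 WHERE md5(textcol) = md5('a 4kb block of text')  -- narrowed via the index
   AND textcol = 'a 4kb block of text';           -- recheck guards against collisions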



[GENERAL] Hash Indexes

2008-01-04 Thread Naz Gassiep

Hi there,
   I am creating functionality where there are blocks of text that are 
being stored in the DB and that need to be searched for. No LIKE or 
pattern matching, just a plain old WHERE clause. Obviously, hash indexes 
would be best here, however I've been warned away from PG's hash 
implementation. Would it be better to manually create a column storing 
the hashes (maintained with a CHECK (md5(textcol) = hashcol) type 
constraint) and generate the hashes in the app?


- Naz.
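
A minimal sketch of that manual-hash-column idea (names hypothetical); the CHECK 
keeps the app-supplied hash honest, and a plain btree index on the hash column 
does the searching:

CREATE TABLE docs (
    docid   serial PRIMARY KEY,
    textcol text NOT NULL,
    hashcol text NOT NULL,
    CHECK (md5(textcol) = hashcol)
);
CREATE INDEX docs_hashcol_idx ON docs (hashcol);

-- the app computes the hash and searches on it:
SELECT * FROM docs WHERE hashcol = md5('block of text');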



[GENERAL] Query problem

2007-10-01 Thread Naz Gassiep
Aside from the messy nomenclature, is anyone able to spot why 
the "sum" column from the first query is not returning 7, as 
the second query suggests that it should? I know that this is 
probably simple, and that it's probably going to jump out at 
me the minute I hit Send, but if I don't hit Send, then I'll 
never see it, hehe.


twerl=# SELECT 'contactgroups',
contactgroups.siteid,
contactgroups.id,
contactgroups.name,
contactgroups.description,
SUM(contacts.id),
contactgroups.trashed
   FROM contactgroups
LEFT OUTER JOIN contacts ON (contactgroups.id = contacts.groupid)
  WHERE contactgroups.trashed IS NOT NULL
AND contactgroups.deleted IS NULL
   GROUP BY contactgroups.siteid,
contactgroups.id,
contactgroups.name,
contactgroups.description,
contactgroups.trashed;
   ?column?    | siteid | id |        name        | description | sum |            trashed
---------------+--------+----+--------------------+-------------+-----+-------------------------------
 contactgroups |      1 |  3 | Telechoice / Optus |             | 594 | 2007-10-01 20:08:51.449825+10
(1 row)

twerl=# select count(*) from contacts where groupid = 3;
 count
-------
     7
(1 row)


Thanks,
- Naz.
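
The likely culprit: SUM(contacts.id) adds up the id values of the seven joined 
rows (they happen to total 594), where a count of the rows was wanted. A sketch 
of the fix:

SELECT contactgroups.id,
       COUNT(contacts.id)   -- counts non-NULL joined rows: 7
  FROM contactgroups
  LEFT OUTER JOIN contacts ON (contactgroups.id = contacts.groupid)
 WHERE contactgroups.trashed IS NOT NULL
   AND contactgroups.deleted IS NULL
 GROUP BY contactgroups.id;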





[GENERAL] Etc/% timezones

2007-08-29 Thread Naz Gassiep
I just noticed that in the pg_timezone_names system table, the name and 
abbrev of the Etc/% timezones appear to be inverted with their 
utc_offset value.


I never noticed before, as I filter these zones out (among others) and 
do not use them in my app. I was just interested as to why the suggested 
abbrev field and utc_offset field are the opposite sign of each other. 
Is this a bug or is there a reason for this? I have included the output 
from the view with the relevant time zones below.


E.g., I am in Melbourne, Australia, which I have always known as GMT+10 
(ignoring DST). According to the below list, however, the timezone that 
has the matching utc_offset to mine is called Etc/GMT-10, which to me 
seems wrong. Were I presented with a list of the names, I'd select 
Etc/GMT+10 as my time zone, incorrectly telling the app that my 
timezone's utc_offset is -10:00:00.


I'm pretty certain that this is a bug. Can anyone confirm?

Regards,
- Naz.


select * from pg_timezone_names where name ilike 'etc/%' order by 
utc_offset, abbrev, name;

     name      | abbrev | utc_offset | is_dst
---------------+--------+------------+--------
 Etc/GMT+12    | GMT+12 | -12:00:00  | f
 Etc/GMT+11    | GMT+11 | -11:00:00  | f
 Etc/GMT+10    | GMT+10 | -10:00:00  | f
 Etc/GMT+9     | GMT+9  | -09:00:00  | f
 Etc/GMT+8     | GMT+8  | -08:00:00  | f
 Etc/GMT+7     | GMT+7  | -07:00:00  | f
 Etc/GMT+6     | GMT+6  | -06:00:00  | f
 Etc/GMT+5     | GMT+5  | -05:00:00  | f
 Etc/GMT+4     | GMT+4  | -04:00:00  | f
 Etc/GMT+3     | GMT+3  | -03:00:00  | f
 Etc/GMT+2     | GMT+2  | -02:00:00  | f
 Etc/GMT+1     | GMT+1  | -01:00:00  | f
 Etc/GMT       | GMT    | 00:00:00   | f
 Etc/GMT+0     | GMT    | 00:00:00   | f
 Etc/GMT-0     | GMT    | 00:00:00   | f
 Etc/GMT0      | GMT    | 00:00:00   | f
 Etc/Greenwich | GMT    | 00:00:00   | f
 Etc/UCT       | UCT    | 00:00:00   | f
 Etc/UTC       | UTC    | 00:00:00   | f
 Etc/Universal | UTC    | 00:00:00   | f
 Etc/Zulu      | UTC    | 00:00:00   | f
 Etc/GMT-1     | GMT-1  | 01:00:00   | f
 Etc/GMT-2     | GMT-2  | 02:00:00   | f
 Etc/GMT-3     | GMT-3  | 03:00:00   | f
 Etc/GMT-4     | GMT-4  | 04:00:00   | f
 Etc/GMT-5     | GMT-5  | 05:00:00   | f
 Etc/GMT-6     | GMT-6  | 06:00:00   | f
 Etc/GMT-7     | GMT-7  | 07:00:00   | f
 Etc/GMT-8     | GMT-8  | 08:00:00   | f
 Etc/GMT-9     | GMT-9  | 09:00:00   | f
 Etc/GMT-10    | GMT-10 | 10:00:00   | f
 Etc/GMT-11    | GMT-11 | 11:00:00   | f
 Etc/GMT-12    | GMT-12 | 12:00:00   | f
 Etc/GMT-13    | GMT-13 | 13:00:00   | f
 Etc/GMT-14    | GMT-14 | 14:00:00   | f
(35 rows)




Re: [GENERAL] Etc/% timezones

2007-08-29 Thread Naz Gassiep



I'm pretty certain that this is a bug. Can anyone confirm?



It is a bug -- in the SQL standard definition.  The meaning of the sign
is inverted w.r.t. the relevant POSIX (?) standard, AFAIU.
Unsurprisingly, we're following the SQL standard here.


Wow. Seriously, wow.
Good thing I'm filtering them out then, else confusion would ensue. Has 
anyone brought the stupidity of this to the attention of the SQL team? Is 
there any rationale behind this? I've been working with timezone stuff for 
the last few weeks and I'm really getting into it, so I'm just interested 
to know how it was concluded that this is a good idea.





[GENERAL] Accessing pg_timezone_names system view

2007-08-17 Thread Naz Gassiep
I was wondering if there is any reason that accessing the system view 
pg_timezone_names is extremely slow relative to other queries. The 
following query:


   SELECT * FROM pg_timezone_names;

Executes in between 29ms and 32ms on my server. It takes about the same 
when I put a


   WHERE name = 'some/timezone'

clause in it. To put this into perspective, on the pages that execute 
this, it accounts for something like 3/4 of my DB execution time.


Here's a screenshot to show you what I'm talking about:

   http://www.mrnaz.com/dbetime.gif

As you can see, the execution of that single fetch dwarfs all other 
processing loads. I've run this a few times, and the timings are always 
roughly the same. Is there a way for me to speed this up? Would I be 
better off loading these into a static table and executing from there? 
It seems kinda purpose defeating to do that though. Perhaps this has 
been addressed in 8.3 ? I eagerly await.


Regards,
- Naz
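
A minimal sketch of the static-table idea mentioned above (names hypothetical); 
the view walks the timezone database on every call, which is presumably where the 
~30ms goes, so materialising it once sidesteps that (refresh it whenever the 
server's tz data is updated):

CREATE TABLE timezone_names_cache AS
    SELECT * FROM pg_timezone_names;
CREATE UNIQUE INDEX timezone_names_cache_name ON timezone_names_cache (name);

SELECT * FROM timezone_names_cache WHERE name = 'Australia/Melbourne';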



[GENERAL] More Time Zone fun

2007-08-17 Thread Naz Gassiep
I'm making (slow) progress in my timezone system, and I just noticed 
this little behavioral nugget, which surely is a bug. In the system view 
pg_timezone_names is a few timezones that use leap seconds. An example 
which I tested is Asia/Riyadh87. When I attempt to SET TIME ZONE using 
this timezone, I get this:


ERROR:  time zone "Asia/Riyadh87" appears to use leap seconds
DETAIL:  PostgreSQL does not support leap seconds.


Surely it's a bug to have timezones in the list that result in errors 
being thrown? Also, is there a reason that PostgreSQL does not support 
leap seconds?

Regards,
- Naz



Re: [GENERAL] More Time Zone fun

2007-08-17 Thread Naz Gassiep
Just to further note an oddity: apparently PostgreSQL in fact *does* 
support leap seconds since 7.4. After scouring the manual for info on 
this, I discovered on this page:


http://www.postgresql.org/docs/8.2/static/release-7-4.html

Tom added support for 60 second values in the seconds component of 
TIME/TIMESTAMP/INTERVAL data types, which, as noted by the changelog, is 
needed for support of leap seconds. So this error very well may be a 
behavioral bug. Should I post this to -hackers or -bugs ?

- Naz.

Naz Gassiep wrote:
I'm making (slow) progress in my timezone system, and I just noticed 
this little behavioral nugget, which surely is a bug. In the system 
view pg_timezone_names is a few timezones that use leap seconds. An 
example which I tested is Asia/Riyadh87. When I attempt to SET TIME 
ZONE using this timezone, I get this:


ERROR:  time zone "Asia/Riyadh87" appears to use leap seconds
DETAIL:  PostgreSQL does not support leap seconds.


Surely it's a bug to have timezones in the list that result in errors 
being thrown? Also, is there a reason that PostgreSQL does not support 
leap seconds?

Regards,
- Naz



[GENERAL] User-Friendly TimeZone List

2007-08-15 Thread Naz Gassiep

Hi all,
   I am still, after quite some time, wrangling over the time zone 
system in my app. I have sorted out all the internal handling, however I 
am still uncertain as to what the best way to get the user to select 
their time zone is.


   I was thinking of having users just select their timezones from a 
list which was populated with the contents of the query:

select * from pg_timezone_names ;
which would look something like this.
http://mrnaz.com/tztest.html

This however is problematic for a number of reasons:

1. The timezone list there isn't exactly user friendly, there are many 
Etc/* timezones there, as well as others that would be potentially 
confusing for users who are trying to select the timezone they are in.
2. If a timezone is removed from the list for whatever reason, then the 
system will be left with users who have selected a timezone that is no 
longer a valid choice in the list.


   The ideal situation would be to maintain my own persistent list of 
timezones (the way Microsoft maintains their own user-friendly list that 
they use for Windows) that maps to the timezones embedded in the PG back 
end, but I haven't the resources to pull this off. Has anyone else 
worked on a mechanism to allow users to supply the timezone they are in, 
and if so, do you have any comments on how best to handle the apparently 
mutually exclusive problems of simplicity for users and accuracy in the 
back end?


   At the simple end of the scale, I can't just have users select from a 
list going from GMT-12 to GMT+12. At the complex end of the scale, I'd 
just give them the output from the list, and any that are deprecated would 
result in the user reverting to UTC until they set a new timezone.


Comments on this usability problem are appreciated,
- Naz.
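
On problem 2 above, a hedged sketch (hypothetical users table) for spotting stored 
zones that have dropped out of the list, so those users can be reverted to UTC as 
described:

SELECT u.userid, u.timezone
  FROM users u
  LEFT JOIN pg_timezone_names tz ON (tz.name = u.timezone)
 WHERE tz.name IS NULL;  -- stored zone is no longer a valid choice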



Re: [GENERAL] User-Friendly TimeZone List

2007-08-15 Thread Naz Gassiep






  Do the views in 8.2 pg_timezone_abbrevs and pg_timezone_names help at all?


They are where I am currently getting the authoritative list of
timezones. However this list does not seem to be quite appropriate to
expose users to directly. Read my original post, I've explained it a
little more there.
Thanks,
- Naz.




[GENERAL] TimestampTZ

2007-08-12 Thread Naz Gassiep
When entering data into a timestamptz field, if no timezone is added 
does it assume you've entered a UTC time, or the time at the timezone 
set in the session with SET TIMEZONE, or the local system time ?

- Naz
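
A quick illustration of the question; as I understand it, the answer is the 
session's TimeZone setting, not UTC or the server clock:

SET TIME ZONE 'UTC';
SELECT '2007-08-12 10:00'::timestamptz;  -- interpreted as 10:00 UTC
SET TIME ZONE 'Australia/Melbourne';
SELECT '2007-08-12 10:00'::timestamptz;  -- interpreted as 10:00 Melbourne time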



[GENERAL] Persistent connections in PHP

2007-08-12 Thread Naz Gassiep

Hi,
   Does the connection pooling feature of PHP cause the persistent 
connections to keep the properties between accesses? E.g., if a user 
takes a connection, sets a timezone to it using SET TIMEZONE, will the 
next user who happens to take this connection get it in that same state, 
or will it be reset to a blank or starting state as though it had been 
opened? Also, what about temp tables? Will they be present to the second 
user if the first user set some up?

- Naz.
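
A small sketch of the concern, with one defensive measure I would assume works 
(not verified against PHP's pooling specifically): session-level SETs stick to the 
raw connection unless cleared, so a reset at the start of each request is the safe 
habit.

-- request 1, on a pooled connection:
SET TIME ZONE 'Australia/Melbourne';  -- sticks to the connection, not the user

-- request 2, same pooled connection:
RESET ALL;                            -- clears SET-style session parameters
DROP TABLE IF EXISTS scratch;         -- temp tables are NOT cleared by RESET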



Re: [GENERAL] TimestampTZ

2007-08-12 Thread Naz Gassiep



As clearly stated in the documentation

http://www.postgresql.org/docs/8.2/interactive/datatype-datetime.html#DATATYPE-TIMEZONES 



Perhaps I'm thick, but I don't find that particular page to be clear on 
this at all.

- Naz.



[GENERAL] Removing a schema

2007-08-07 Thread Naz Gassiep
I'm trying to remove a schema and move all the tables to another schema. 
I've manually run alter table on every table to move them, however all 
the foreign keys still reference the old schema, and there are too many 
to do by hand.


Is there an easy way to update one of the system catalogs to do this? I 
want to change every reference of that schema to point to the new one, 
and then drop the schema, but not drop any data.


Thanks,
- Naz.
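
For anyone searching the archives, a hedged sketch of what I would try first (8.1 
and later; names hypothetical): ALTER TABLE ... SET SCHEMA moves a table together 
with its indexes and constraints, so the foreign keys should follow without 
touching the system catalogs by hand.

-- repeat per table, then drop the empty schema
ALTER TABLE oldschema.sometable SET SCHEMA newschema;
DROP SCHEMA oldschema;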



Re: [GENERAL] using Tsearch2 for chemical text

2007-07-25 Thread Naz Gassiep



I think you might need to write a custom lexer to divide the strings
into meaningful units.  If there are subsections of these names that
make sense to search for, then tsearch2 can certainly handle the
mechanics of that, but I doubt that the standard rules will divide
these names into lexemes usefully.


A custom lexer for tsearch2 that recognized chemistry-related lexical 
components (di-, tetra-, acetyl-, ethan-, -oic, -ane, -ene etc.) would 
increase *hugely* the out-of-the-box applicability of PostgreSQL to 
scientific applications. Perhaps such an effort could be coordinated 
with a physics-based lexer and a biology-related lexer, to perhaps provide 
a unified lexer that provided full scientific capabilities in the way 
that PostGIS provides unified geospatial capabilities.


I don't know how best to bring such an effort about, but I do know that 
if such a thing were created it would be a boon for PostgreSQL, giving 
it a very significant leg up in terms of functionality, not to mention 
the great positive impact that the wide, free availability of such a 
tool would have on the scientific research community.





Re: [GENERAL] why postgresql over other RDBMS

2007-07-18 Thread Naz Gassiep
Surely such a use case could, and more to the point *should*, be met 
using PITR?

Regards,
- Naz.

Alvaro Herrera wrote:

A.M. wrote:
  

On May 24, 2007, at 14:29 , Wiebe Cazemier wrote:



On Thursday 24 May 2007 17:30, Alexander Staubo wrote:

  

[2] Nobody else has this, I believe, except possibly Ingres and
NonStop SQL. This means you can do a begin transaction, then issue
create table, alter table, etc. ad nauseum, and in the mean time
concurrent transactions will just work. Beautiful for atomically
upgrading a production server. Oracle, of course, commits after each
DDL statements.

If this is such a rare feature, I'm very glad we chose postgresql.  
I use it all
the time, and wouldn't know what to do without it. We circumvented  
Ruby on
Rails' migrations, and just implemented them in SQL. Writing  
migrations is a
breeze this way, and you don't have to hassle with atomicity, or  
the pain when

you discover the migration doesn't work on the production server.
  
Indeed. Wouldn't it be a cool feature to persists transaction states  
across connections so that a new connection could get access to a sub- 
transaction state? That way, you could make your schema changes and  
test them with any number of test clients (which designate the state  
to connect with) and then you would commit when everything works.


Unfortunately, the postgresql architecture wouldn't lend itself well  
to this. Still, it seems like a basic extension of the notion of sub- 
transactions.



Hmm, doesn't this Just Work with two-phase commit?
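
A tiny illustration of the transactional DDL being discussed (objects hypothetical):

BEGIN;
CREATE TABLE upgrade_test (id integer);
ALTER TABLE upgrade_test ADD COLUMN note text;
-- concurrent transactions still see the old schema at this point
ROLLBACK;  -- or COMMIT to publish every change atomically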

  


[GENERAL] Changing DB Encodings

2007-07-07 Thread Naz Gassiep
Why have I been told that I need to re-initdb to change the char
encoding? The manual says I can just "createdb foodb -E utf8", so why
would I need to dump/initdb/create/restore? Can't I just dump/create/restore?

It'd save all the messing around with changing the data dirs etc.




Re: [GENERAL] Changing DB Encodings

2007-07-07 Thread Naz Gassiep
Tom Lane wrote:
 Are you clear on the difference between encoding and locale?
   

I confidently reply with "maybe".

 You can make new databases with whatever encoding you say, but the
 server's lc_collate and lc_ctype are frozen at initdb, and it will
 not work well to select an encoding that is incompatible with the
 locale setting.  In practice this means that you can only use multiple
 database encodings when you initdb'd in C locale; all other locale
 settings imply a particular encoding.

 Yes, this is pretty annoying.  No, it's not easy to fix.
   

OK, I understand now, I think. How do I check what locale is currently
set? I've successfully created a utf8 database; does that imply that,
because I was able to create a DB with a different encoding to the ones
all the others use (SQL_ASCII), my locale is set to C?

Regards,
- Naz.
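
The check itself is a one-liner; lc_collate and lc_ctype are the settings that 
matter here:

SHOW lc_collate;
SHOW lc_ctype;
-- or both at once:
SELECT name, setting FROM pg_settings WHERE name IN ('lc_collate', 'lc_ctype');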



Re: [GENERAL] Changing DB Encodings

2007-07-07 Thread Naz Gassiep
Tom Lane wrote:
 I've successfully created a utf8 database, does that imply that
 because I was able to create a DB with a different encoding to the ones
 all the others use (SQL_ASCII) that my locale is set to C ?
 

 No, that implies a lack of error checking.

Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an
encoding that differs from the selected locale?

lc_collate and lc_ctype both report C as the locale. Does that mean I can
safely bulk load a pg_dump into it?

When doing the dump, do I have to use --encoding=utf8 as a param?

Please excuse me if I sound ignorant of all of this, I'm currently
forced to address an issue that I don't yet feel I have a sufficient
grasp of.



Re: [GENERAL] Proposed Feature

2007-06-23 Thread Naz Gassiep




Yo,
 I'd have no problem with it being disabled by default. I ruthlessly
pare my systray down and turn on the hide inactive icons function as
well. But when I have PG running it is the sort of thing I'd want to be
able to see at a glance, the same as with Apache.
- Naz.

Magnus Hagander wrote:

  Naz Gassiep wrote:
  
  
Hey,
I'm sure that'd be greatly appreciated, most other major servers and
DBs have a similar feature, and that's what the systray is for, i.e.,
viewing major user-installed services.

  
  
Don't forget that the tray (or taskbar notification area as it's
supposed to be called) is one of the most abused areas of the windows
GUI. I've seen installs where it takes up half the screen. So it's not
*necessarily* a good thing - making it mandatory to have an easy way to
turn it off. And if we don't add any actual *functionality*, it should
absolutely not be enabled by default. Providing it as an option is never
wrong, though.

//Magnus






[GENERAL] Table Names

2007-06-23 Thread Naz Gassiep
Is there a limit on the length of table names?
Thanks,
- Naz.



[GENERAL] Proposed Feature

2007-06-22 Thread Naz Gassiep
I'm using PG on windows for the first time (as of about 6 minutes ago).
I was thinking that it would be great to have a system tray icon with a
running indicator, kind of like the way Apache2.x for windows has, or
even MSSQL. Perhaps the PG logo with a small white circle with a red
square or a green triangle in the same fashion.
Just a thought.
- Naz.



Re: [GENERAL] Proposed Feature

2007-06-22 Thread Naz Gassiep




Hi,
 It provides a very quick and easy status check (running or not) as
well as right click access to start/stop and any other basic admin
features that may be appropriate. I'd be happy with seeing whether it
was running, as well as start/stop accessible via one right click. It's
the same as Apache for Win32, MSSQL any many other server programs.
- Naz.

Bruce Momjian wrote:

  Naz Gassiep wrote:
  
  
I'm using PG on windows for the first time (as of about 6 minutes ago).
I was thinking that it would be great to have a system tray icon with a
running indicator, kind of like the way Apache2.x for windows has, or
even MSSQL. Perhaps the PG logo with a small white circle with a red
square or a green triangle in the same fashion.
Just a thought.

  
  
And what does the icon show or do?

  





Re: [GENERAL] Proposed Feature

2007-06-22 Thread Naz Gassiep
Hey,
I'm sure that'd be greatly appreciated, most other major servers and
DBs have a similar feature, and that's what the systray is for, i.e.,
viewing major user-installed services.
- Naz.

Tony Caduto wrote:
 Bruce Momjian wrote:
 Naz Gassiep wrote:
  
 I'm using PG on windows for the first time (as of about 6 minutes ago).
 I was thinking that it would be great to have a system tray icon with a
 running indicator, kind of like the way Apache2.x for windows has, or
 even MSSQL. Perhaps the PG logo with a small white circle with a red
 square or a green triangle in the same fashion.
 Just a thought.
 

 And what does the icon show or do?

   
 That would be pretty easy to do with Delphi.
 I could whip something up and donate it to the project with a BSD
 license.

 It could be a green Play arrow if the service is running and a red
 one if it is not, then have a few right click options
 to start/stop/restart the service.

 I was thinking the same thing awhile back, but forgot about it.

 Later,




[GENERAL] Aggregates

2007-06-21 Thread Naz Gassiep
Hi,
If I have a table with users and a table with messages, is it
possible to have a query that returns users.* as well as extra columns
with the number of messages they have posted and the date and time of
the last message? At the moment I am using a subquery to do this,
however it seems suboptimal. Is there a better way?
- Naz.
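
A minimal sketch of the join-against-an-aggregate alternative (column names 
hypothetical):

SELECT u.*, m.msg_count, m.last_posted
  FROM users u
  LEFT JOIN (SELECT userid,
                    count(*)    AS msg_count,
                    max(posted) AS last_posted
               FROM messages
              GROUP BY userid) m ON (m.userid = u.userid);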



[GENERAL] Surrogate VS natural keys

2007-06-20 Thread Naz Gassiep
OK so which is the correct way to do it?

E.g., Say I have a table with users, and a table with clubs, and a table
that links them. Each user can be in more than one club and each club
has more than one member. Standard M:M relationship. Which link table is
the right way to do it?

This:

CREATE TABLE users_clubs (
userid INTEGER NOT NULL REFERENCES users,
clubid INTEGER NOT NULL REFERENCES clubs,
PRIMARY KEY (userid, clubid)
);

Or this:

CREATE TABLE users_clubs (
id SERIAL PRIMARY KEY,
userid INTEGER NOT NULL REFERENCES users,
clubid INTEGER NOT NULL REFERENCES clubs
);

I've always favored natural keys (the first option) as it just seems to
me a more natural and semantic representation of the data, however I
often get app designers complaining about it being more complex or
something.

Comments?



[GENERAL] TimeZone List

2007-05-29 Thread Naz Gassiep
I've been trying to sort out the answer to this question for a while
now, I've received different answers from different places.

I'm looking for a definitive non-volatile list of timezones for use in a
web application. I can't use the OS's time zone list, as changing OSes
may cause some listed timezones to disappear or change name, causing
havoc for any entries in the DB referring to those zones.

Is relying on zone.tab to be a non-volatile list of timezones
appropriate for this purpose? Or is there an ISO listing of the various
time zones? I would prefer to use a list that matched the PostgreSQL
DB's internal timezone strings so they could easily be referenced
without any further work in the app.

Thanks
- Naz.



Re: [GENERAL] TimeZone List

2007-05-29 Thread Naz Gassiep
Alvaro Herrera wrote:
 alvherre=# select * from pg_timezone_names ;
         name        | abbrev | utc_offset | is_dst
 --------------------+--------+------------+--------
  Africa/Algiers     | CET    | 01:00:00   | f
  Africa/Luanda      | WAT    | 01:00:00   | f
  Africa/Porto-Novo  | WAT    | 01:00:00   | f
  Africa/Gaborone    | CAT    | 02:00:00   | f
  Africa/Ouagadougou | GMT    | 00:00:00   | f
  Africa/Bujumbura   | CAT    | 02:00:00   | f
  Africa/Douala      | WAT    | 01:00:00   | f
  Africa/Bangui      | WAT    | 01:00:00   | f
  Africa/Ndjamena    | WAT    | 01:00:00   | f
  Africa/Kinshasa    | WAT    | 01:00:00   | f
  Africa/Lubumbashi  | CAT    | 02:00:00   | f
  Africa/Brazzaville | WAT    | 01:00:00   | f
  Africa/Abidjan     | GMT    | 00:00:00   | f
  Africa/Djibouti    | EAT    | 03:00:00   | f
  Africa/Cairo       | EEST   | 03:00:00   | t
  Africa/Malabo      | WAT    | 01:00:00   | f
  Africa/Asmera      | EAT    | 03:00:00   | f
 ...

 alvherre=# select * from pg_timezone_abbrevs ;
  abbrev | utc_offset | is_dst
 --------+------------+--------
  ACSST  | 10:30:00   | t
  ACST   | -04:00:00  | t
  ACT    | -05:00:00  | f
  ADT    | -03:00:00  | t
  AESST  | 11:00:00   | t
  AEST   | 10:00:00   | f
  AFT    | 04:30:00   | f
  AKDT   | -08:00:00  | t
  AKST   | -09:00:00  | f
  ALMST  | 07:00:00   | t
 ...
   

Ok, that's kinda cool. But can I trust those names to not change from
version to version?

And how are conflicts in the abbreviations handled? (for example, EST is
a valid time zone for the US and Australia).

Also, what's with the
Etc/GMT+x timezones? I assume they are just standard integer offset
values? I'd probably filter them out in the app using WHERE
substring(name FROM 0 FOR 3) <> 'Etc'. Are there any caveats for doing this?

Thanks,
- Naz



Re: [GENERAL] TimeZone List

2007-05-29 Thread Naz Gassiep
Tom Lane wrote:
 Naz Gassiep [EMAIL PROTECTED] writes:
   
 Ok, that's kinda cool. But can I trust those names to not change from
 version to version?
 

 No, you can't.  The reason there is no nonvolatile list of timezones
 is that timezones are subject to the whims of politicians, who can and
 do change them at the drop of a hat.  Read the historical comments in
 the zic source files sometime...

As if war, famine, injustice and social inequality weren't enough,
politicians also have to screw up time zone handling in my app.
Damn politicians.
- Naz.



[GENERAL] Integrity on large sites

2007-05-22 Thread Naz Gassiep
I'm working in a project at the moment that is using MySQL, and people
keep making assertions like this one:

*Really* big sites don't ever have referential integrity. Or if the few
spots they do (like with financial transactions) it's implemented on the
application level (via, say, optimistic locking), never the database level.

This sounds to me like MySQLish. A large DB working with no RI would
give me nightmares. Is it really true that large sites turn RI off to
improve performance, or is that just a MySQL thing where it gets turned
off just because MySQL allows you to turn it off and improve
performance? Can you even turn RI off in PostgreSQL? Does Oracle, DB2 or
MSSQL allow you to turn it off? Am I just being naive in thinking that
everyone runs their DBs with RI in production?

- Naz
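
On the "can you even turn RI off in PostgreSQL" point, a hedged sketch (names 
hypothetical): there is no global switch, but FK checks are implemented as system 
triggers, so a superuser can disable them per table (8.1 and later):

ALTER TABLE orders DISABLE TRIGGER ALL;  -- also disables the internal FK triggers
-- bulk load here; nothing is re-checked retroactively on re-enable
ALTER TABLE orders ENABLE TRIGGER ALL;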



[GENERAL] In theory question

2007-05-09 Thread Naz Gassiep
This may be a question for -hackers, but I don't like disturbing them
unnecessarily.

I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?

I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?

Such a mechanism could be a) transparent to any and all APIs accessing
the back end thus instantly providing the benefits of caching to all
apps transparently, and b) assist with replication by providing a way
for a setup to have n serving daemons (effectively db caches) on
different machines accessing n replicated DBs. Such a setup would be
far easier to set up than n fully fledged DB servers, and would likely
scale better anyway.

Thoughts?



Re: [GENERAL] In theory question

2007-05-09 Thread Naz Gassiep
Hannes Dorbath wrote:
 I think this is close to what MySQL's query cache does. The question
 is if this should be the job of the DBMS and not another layer. At
 least the pgmemcache author and I think that it's better done outside
 the DBMS. See
 http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf for the idea.
This is exactly what I was asking about. So my theoretical idea has
already been implemented. Now if only *all* my ideas were done for me by
the time I came up with them :)



Re: [GENERAL] In theory question

2007-05-09 Thread Naz Gassiep
 
 This is exactly what I was asking about. So my theoretical idea has
 already been implemented. Now if only *all* my ideas were done for me by
 the time I came up with them :)

 Then you wouldn't be able to eventually patent them ;)
What an un-BSD licensish thing to say :P



Re: [GENERAL] In theory question

2007-05-09 Thread Naz Gassiep

 I have always found MySQL's query cache to be utterly useless.

 Think about it this way :

 It only works for tables that seldom change.
 It does not work for big tables (like the posts table of a forum)
 because the cache would have to be huge.

 So, the most frequent usage of MySQL's query cache is for dumb
 applications who use, for instance, PHP, store their configuration in
 MySQL, and reload it on each and every page with a SELECT * FROM
 configuration.

 In this case, you save the query time, but you don't save : the
 roundtrip between PHP and the database, extracting query results,
 building objects from them, time spent in ORMs, etc.

 A much better solution is to do your own caching, for instance
 using shared memory in the application server, and then you cache
 native language objects. You not only save the query time, but also
 all the time spent building those objects on every page load.
This was actually my original question. In my web app, I cache the
config, user preferences and other static material in session vars, and
only rerun the function that fetches them if the app ever changes them.

If you are clever about db fetches in this way and store as much stuff
in session vars (which is just like storing it in a memcached instance,
really), is there much or even any benefit in running memcached?



[GENERAL] Update violating constraint

2007-05-02 Thread Naz Gassiep
Hi,
I'm trying to do an update on a table that has a unique constraint
on the field. I need to update the table by setting field = field+1;
however, if this does not perform the updates on the table in a proper
order (from last to first) then the update will cause a violation of the
index *during* the update, even though the table would be consistent
after the update completes. So the update fails. How do I get around
this without removing the constraint?
- Naz.



Re: [GENERAL] Update violating constraint

2007-05-02 Thread Naz Gassiep
Michael Glaesemann wrote:

 On May 2, 2007, at 23:01 , Naz Gassiep wrote:

 I'm trying to do an update on a table that has a unique constraint
 on the field, I need to update the table by setting field = field+1
 however if this does not perform the updates on the table in a proper
 order (from last to first) then the update will cause a violation of the
 index *during* the update even though the table would be consistent
 after the update completes.

 If field's values are all positive, I generally will do it in two steps:

 update foo
 set field = -1 * (field + 1);
 update foo
 set field = -1 * field
 where field < 0;

 Another way to do it is to add and then remove a large offset:

 update foo
 set field = 100000 + (field + 1);
 update foo
 set field = field - 100000
 where field >= 100000;
Yes, in fact I actually use option one already in the handling of SQL
trees, so I'm annoyed with myself for not figuring that out. I don't
know why you'd ever use your second option, as it virtually
guarantees problems at a random point in your DB's growth.

Thanks muchly for that!



Re: [GENERAL] Storing blobs in PG DB

2007-04-05 Thread Naz Gassiep
This doesn't answer your question, but I thought I'd throw my opinion in 
anyway.


My personal view is that in general, binary files have no place in 
databases. Filesystems are for files, databases are for data. My design 
choice is to store the files in a filesystem and use the database to hold 
metadata as well as a pointer to the file.


If you *must* put files into the database, then you can do so, and PG 
will handle that many files of those sizes with ease. For all intents 
and purposes, PG can store an unlimited number of files. You're far more 
likely to run into walls in the form of limitations in your disk I/O 
system than limitations in what PG will handle.


- Naz.

Nikolay Moskvichev wrote:

Hi All!

Question is : How suitable PG for storing about 2 000 000 binary files 
0,5-2,0 Mb size each ? It is not planned the big number of clients or 
a plenty of updatings. Like photoalbum on local host.




Re: [GENERAL] Lifecycle of PostgreSQL releases

2007-03-24 Thread Naz Gassiep

Tom Lane wrote:

Naz Gassiep [EMAIL PROTECTED] writes:
  

Joshua D. Drake wrote:


Example discussion with customer:
  

...
Finally, in the absence of security concerns or performance issues (and 
I mean the we can't afford to buy better hardware type edge of the 
envelope type issues) there is zero *need* to upgrade.



This line of argument ignores the fact that newer versions often contain
fixes for data-loss-grade bugs.  Now admittedly that is usually an
argument for updating to x.y.z+1 rather than x.y+1, but I think it
destroys any reasoning on the basis of if it ain't broke.
Not when you consider that I did say "in the absence of security 
concerns". I consider the possibility that a bug can cause me to lose my 
data to be a security concern. If it's a cosmetic bug or something 
that otherwise does not affect a feature I use, then upgrading, as you 
say, is very much more of a "wait for x.y+1" matter than upgrading minor 
releases sometimes multiple times a month.


It must be remembered that human error can result in downtime, which can 
cost money. Therefore it's a "foo risk vs. bar risk" type balance. At least, 
that's how I see it.




[GENERAL] Bulk Data Entry

2007-03-21 Thread Naz Gassiep
This problem is to do with bulk loading of data. I use the following 
scripts to take data from a live DB and put it into a testing DB with 
the current version of the schema:


# SCRIPT 1
pg_dump blogbog -a -D -f blogbog_data.sql
dropdb blogbogtemp
createdb blogbogtemp
psql blogbogtemp -f /www/htdocs/mrnaz.com/sql_tables.sql
pg_dump blogbogtemp -D -f blogbog_tables.sql
cp blogbog_tables.sql blogbog_constraints.sql


I edit the blogbog_tables.sql file to remove the constraints and the 
blogbog_constraints.sql file to remove the tables. I then run the 
following script:



# SCRIPT 2
dropdb blogbogdev
createdb blogbogdev
psql blogbogdev -f ./blogbog_tables.sql > ./blogbog_tables_inserted.log
psql blogbogdev -f ./blogbog_data.sql > ./blogbog_data_inserted.log
psql blogbogdev -f ./blogbog_constraints.sql > ./blogbog_constraints_applied.log



Somewhere in the insertion of the data (4th line of script 2) there is a 
failure, as no data appears in the blogbogdev database. This is likely 
due to a column in the live data somewhere that has been deprecated from 
the schema, causing an insert failure on a table, causing failure on all 
tables referring to it, and so on cascading down the reference paths.


I really would prefer psql to halt on error instead of just continuing 
to plow right ahead, but IIRC there was a discussion about this and it 
was decided that continuing was the best behavior.


I have grepped the .log files that the script outputs for "ERROR" but 
there is none. How would one go about finding where the error in an SQL 
script is?

Bulk



Re: [GENERAL] Lifecycle of PostgreSQL releases

2007-03-21 Thread Naz Gassiep

Joshua D. Drake wrote:

Example discussion with customer:

Customer: CMD, should we update to 8.2.3
CMD: Is there something in 8.2.3 that will benefit you?
Customer: We don't know
CMD: Are you having problems with 8.1? (We try to push all customers to
at least 8.1)
Customer: No, it is just that 8.2 is the current release
CMD: True, but 8.3 is due out in the summer and 8.3 is a standard cycle
release
Customer: Oh... o.k. let's wait.
CMD: I think that is probably prudent.
  
That's how it is with me. I upgraded to 8.1 from 7.4 because there was 
nothing in 8.0 that I *needed* and performance was already more than 
sufficient on my ridiculous overkill hardware. I recently upgraded from 
8.1.x to 8.2.3 only because of the DST updates in Western Australia. I 
would not have otherwise. If it ain't broke, don't fix it.


Furthermore, upgrading is inherently risky. There is always the chance 
of human-error-induced downtime, and so doing it "just coz" is not a 
prudent policy.


Finally, in the absence of security concerns or performance issues (and 
I mean the "we can't afford to buy better hardware" edge-of-the-envelope 
type issues) there is zero *need* to upgrade. Sure, it may be 
better to use a new and shiny version, however I always favor a 
realistic and honest assessment of *needs* over *perceived needs*.


All that being said, the older the version you are running, the higher 
the weight that should be attributed to the "upgrading is a good idea 
just coz" argument. After a point, upgrading is just a good idea "just 
coz". I wouldn't recommend anyone continue to run 7.2.x merely because 
it was working for them.


Just my 2c (adjusted for inflation).




Re: [GENERAL] Postgres Database size

2007-03-19 Thread Naz Gassiep




I have been looking for such a function. Having Just upgraded to 8.2,
this function is a very welcome addition to my arsenal of tools.
Many thanks!
- Naz.

Reece Hart wrote:

  On Sun, 2007-03-18 at 00:40 +0530, Mageshwaran wrote:
  
  
how to find the size of a particular database in postgres...

  
  
The old way was to use du or similar. Recent versions (I believe >= 8.1,
but check the release notes to be sure) provide several useful functions
for this:
 pg_column_size
 pg_database_size
 pg_relation_size
 pg_size_pretty
 pg_tablespace_size
 pg_total_relation_size


For example:

[EMAIL PROTECTED] select datname,pg_size_pretty(pg_database_size(oid)) from
pg_database ;
           datname           | pg_size_pretty
-----------------------------+----------------
 postgres                    | 3905 kB
 csb                         | 113 GB
 template0                   | 3840 kB
 csb-dev                     | 124 GB
 csb-dev-snapshot-2007-03-08 | 123 GB
 csb_02_02_2007              | 121 GB
 template1                   | 3840 kB


  





[GENERAL] Design / Implementation problem

2007-03-18 Thread Naz Gassiep
This is possibly not a DB only problem, the solution may involve 
application logic as well. But PG users

are the smartest bunch I know. Ass kissing aside, here are the details:

*** The Scenario ***

We are running a customer loyalty program whereby customers earn points 
for purchasing products. Each
product has a value of points that are earned by purchasing it, and a 
value of points required to

redeem it.

In order to prevent customers from stockpiling points, we want to place 
an expiry date on points so
that unused points expire and are lost if they are not redeemed within a 
certain period of time. This
will be calculated on a FIFO basis, I.e., the oldest points will expire 
first.


We will assume the expiry period is 12 months.

*** The Problem ***

Ascertaining which points to expire is fairly conceptually simple. At 
any given point in time, the
points expired is simply the balance on hand at the start of the period, 
less redemptions in that
period. If the redemptions is less than the balance at open, not all 
points that were available on

that date were used, and the difference is the expiry.

This can be done periodically, say, at the start of every month. However 
there are a few problems

with doing it periodically

1. The runs are likely to be too large to be manageable. A DB with tens 
of thousands of customers
  and many hundreds of thousands or even millions of sales in the 
records tables will require several
  queries and some application calculation to compute. If it takes 2 
seconds to compute each balance
  of a 20,000 strong customer base, that's over 11 hours of heavy 
lifting in the DB, which will
  likely result in severely degraded performance during those hours. 
This problem can only get
  worse as time goes on, and hardware upgrade requirements just to 
accommodate a 12 hour window

  once a month is the sign of an app not designed to scale well.

2. Calculating the balance on the fly would be more effective, as it is 
unlikley that many customers
  will check their balance on a regular basis. It is likely that a 
small fraction of customers will
  check their balance in a given month, meaning that calculating it on 
the fly would both spread
  the load over time as well as reduce the total load, even if on the 
fly calculation results in

  significantly higher per-customer calculation time.

3. The app is a web app, and it would be preferable to contain business 
logic within the database
  itself or the current app codebase. Spreading application logic into 
an external mechanism such
  as cron or an external daemon would be undesirable unless there was 
no other way.


*** A Possible Solution ***

Calculating the balance on the fly can be done easily if it is done at 
the time the customer seeks
to redeem a voucher. Expired points are only relevant at these times, 
and so the expired points

would be calculated with an application function that did the following:

1. Get the balance as it was 12 months ago by getting total points 
earned less redemptions and expiries

  up to that date.
2. Subtract from it redemptions and expiries since then. The value 
obtained, if it is positive, is the

  value of points to expire.
3. Log the expiry entry, and then calculate the balance of points to the 
current date by subtracting

  total points redeemed and expired from total points earned.

This procedure has a few practical problems, however:
1. Customers, when they view their running total, will not be aware that
   some of the points included in it will have expired, as the expiry
   will only happen when the application attempts to log a redemption.
2. A customer may attempt to redeem a product they do not have enough
   points for, and be told at the last minute that they do not have
   enough points, leading to acrimony.

The solution is then to calculate it not only on redemptions, but also
whenever the customer attempts to view their balance. This will ensure
that expired points are never shown in the current balance of available
points. This, however, means the calculation may be done many times by
each customer in a single catalog browsing session.

*** The Question ***

Is there a way to design the DB schema, as well as the query, in such a
manner that calculating the point balance on the fly is not an unfeasibly
heavy calculation to be done at every page view?

This problem does not appear to be solved comprehensively by anyone. When
I log into my credit card company's web site to check my points, I get a
"please come back in an hour, and your points will be calculated" message
if I haven't logged in for over a week. So obviously they calculate the
balance and put it in a table that acts as a cached balance.

Emirates has a different solution: they do bi-annual runs, so points
expire every March and September for them.

Neither of these solutions appeals to me, and there must be A Better
Way(tm).
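
The cached-balance approach the credit card site apparently uses could
look something like the following (a sketch under assumed names, not a
worked design): one row per customer holding the last computed balance,
refreshed lazily whenever it is stale.

    -- Hypothetical cache table
    CREATE TABLE points_balance_cache (
        custid   integer PRIMARY KEY,
        balance  integer NOT NULL,
        computed timestamptz NOT NULL DEFAULT now()
    );

Page views read the cached row; redemptions, and any view where the row
is older than some threshold, recompute the balance and UPDATE it.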





[GENERAL] Query Assistance

2007-03-15 Thread Naz Gassiep

Is anyone able to tell me why, in the last column of the returned result
set, the value calculated is always 0?


QUERY:

 SELECT products.productid,
        products.cost,
        products.srp,
        CASE WHEN products.srp > 0 THEN (products.srp - products.cost)
             * 100 / products.srp ELSE 0 END AS margin,
        products.type,
        products.gstexempt,
        productpointvalues.earnvalue,
        productpointvalues.redeemvalue,
        productpointvalues.earnvalue / productpointvalues.redeemvalue
            AS redemptionmargin
   FROM categories, products
        LEFT OUTER JOIN productpointvalues USING (productid)
  WHERE products.active IS TRUE
    AND products.catid = categories.catid
    AND products.catid = 2
  ORDER BY products.name;



RESULT SET:

 productid | cost  |  srp   | margin | type | gstexempt | earnvalue | redeemvalue | redemptionmargin
-----------+-------+--------+--------+------+-----------+-----------+-------------+------------------
       716 |  8.60 |  10.00 |    14. | N    | f         |        50 |        1500 |                0
        15 | 87.00 | 100.00 |    13. | N    | f         |       500 |       10000 |                0
        13 | 26.10 |  30.00 |    13. | N    | f         |       150 |        3000 |                0
      1189 |  0.00 |  40.00 |   100. | N    | f         |       200 |        4000 |                0
        14 | 43.50 |  50.00 |    13. | N    | f         |       250 |        5000 |                0




Re: [GENERAL] Query Assistance

2007-03-15 Thread Naz Gassiep

Indeed.
Thanks for that! I keep getting bitten by that too hehe.
- Naz.

William Garrison wrote:
My guess is that integer division is to blame: 50 divided by 1500 = 
0.03 which rounds to zero.  You probably have to cast them to real 
before doing the division.
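
For instance (a sketch, not from the thread), a cast on either operand
is enough to make the division non-integral:

    -- integer / integer truncates toward zero:
    SELECT 50 / 1500;            -- yields 0
    -- cast one side and the fractional part is kept:
    SELECT 50::numeric / 1500;   -- yields 0.0333...
    -- applied to the query above:
    --   productpointvalues.earnvalue::numeric
    --       / productpointvalues.redeemvalue AS redemptionmargin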




[GENERAL] Query timing

2007-03-04 Thread Naz Gassiep
Using EXPLAIN ANALYZE I can get the execution time of a query. Is there 
a command I can use to get the execution time without the planning 
information? I just need to time lots of queries that have complex plans 
and it'd be easier if I didn't have pages and pages of planning info 
between tries. The queries themselves are one line each, but refer to 
views, which make the plans pretty complex. Being able to see several 
times in one terminal window would be easier.


Thanks,
- Naz.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Query timing

2007-03-04 Thread Naz Gassiep
That's not quite as fast as I would like; it throws in a few more steps
which slow down the development process. However, if there is no other
way, I will persevere with the method I have now.

Thanks,
- Naz.

Jorge Godoy wrote:

Naz Gassiep [EMAIL PROTECTED] writes:

Using EXPLAIN ANALYZE I can get the execution time of a query. Is there a
command I can use to get the execution time without the planning information?
I just need to time lots of queries that have complex plans and it'd be easier
if I didn't have pages and pages of planning info between tries. The queries
themselves are one line each, but refer to views, which make the plans pretty
complex. Being able to see several times in one terminal window would be
easier.

Write the queries in a file, run them with psql -f and grep for the desired
lines only.
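
A minimal sketch of that approach, assuming the EXPLAIN ANALYZE
statements are collected in a hypothetical queries.sql and an 8.x
server, where the plan's summary line is "Total runtime":

    psql -f queries.sql mydb | grep 'Total runtime'

Each query then contributes a single timing line to the terminal. An
alternative with much the same effect is psql's \timing toggle, which
prints a one-line "Time: ... ms" after each statement, though that
measures client round-trip time rather than bare execution time.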

  




Re: [GENERAL] Select retrieval slowdown after db drop/reload. Suggestions?

2007-02-28 Thread Naz Gassiep




You have to run ANALYZE; on your db after a drop/reload to regather
the stats. In the test db, just run ANALYZE; and then see how fast it
is. I'd guess that this is your issue.
Regards,
- Naz.

Andrew Edson wrote:

  I have a select statement, used in a Perl program, which is
supposed to find all records related to those in one table which have a
delete_dt field value of four years or older.

  This is the select statement:
SELECT t2.dist_id, t1.clnt_seq, t2.cntrct_seq, t2.cntrct_id,
t3.aunit_seq, t1.person_seq, t1.addr_seq, t3.addr_seq, t4.frst_nm || '
' || t4.lst_nm AS name, t5.addr_1, t6.acntdel_dt FROM t1, t2, t3, t4,
t5, t6 WHERE t1.clnt_seq = t2.clnt_seq AND t2.cntrct_seq =
t3.cntrct_seq AND t3.aunit_seq = t6.aunit_seq AND t1.person_seq =
t4.person_seq AND t3.addr_seq = t5.addr_seq AND t1.active_flg = 0 AND
t2.active_flg = 0 AND t3.active_flg = 0 AND t6.acntdel_dt < now() -
'4 years'::interval ORDER BY t2.cntrct_id ASC;
  
  I'm working on a test box at the moment; the db I am using was
made by dumping the production db and copying it over to the test box
to be loaded into a newly-created db there. It took a while for me to
get the original Perl program working, as I don't really understand
Perl, but after I did so, I dropped the db and reloaded it again off of
the original files, so I could try another test run and pay attention
to what's happening.
  
  On the original load of the test db, the query above had a run
time of roughly 3, 3.5 minutes before giving results. Considering the
size of the db it's searching through, I feel that's fairly reasonable,
especially since that's about what the production db does on the same
query. Now, after the drop/recreate, the test db is taking somewhat
longer to give back its results; just shy of 7 minutes, if I've done
the math correctly. (Timing results - Time: 417531.436 ms)
  
  I'm the only person working on this particular box at this
point. This problem did not start until I reloaded the db from the
original files. Obviously, I managed to miss something in the
drop/reload process, but I have no clue what. I'm running a vacuum
full analyze at the moment; if anyone has any other suggestions as to
what I could do to solve this (admittedly minor) problem, I would be
grateful to hear them.
  
  Thank you for your consideration.
  
   





[GENERAL] max_fsm_pages

2006-11-05 Thread Naz Gassiep

I just did a vacuum analyze and I got a message I've never seen before:

conwatchlive=# vacuum analyze;
NOTICE:  number of page slots needed (27056) exceeds max_fsm_pages (20000)
HINT:  Consider increasing the configuration parameter "max_fsm_pages"
to a value over 27056.

VACUUM
conwatchlive=#

What does this mean? I assume it has nothing to do with the Flying 
Spaghetti Monster.


More generally, I am a novice at administering a PostgreSQL database, is 
there a list of tips somewhere that I can read to improve performance?
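
Acting on that hint is a one-line change in postgresql.conf (a sketch,
assuming an 8.x server where the fsm parameters still exist; they were
removed in 8.4, and the setting only takes effect on a server restart):

    # postgresql.conf
    max_fsm_pages = 40000   # anything comfortably over the reported 27056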




[GENERAL] Ghost open transaction

2006-10-20 Thread Naz Gassiep
I was performing a bunch of INSERTs, inside a transaction, into a table
(users) that has a unique index on username. Partway through the
transaction, my internet connection dropped.


Once I had manhandled my DSL router back online, I went back into a
console to redo the inserts. I found that after I did the first insert,
it appeared to freeze. I thought that my net had dropped out again, but
I was able to Ctrl+C the command, roll back and try again, with the
same result. The previous connection is obviously still active, and it
blocks the new insert until the fate of the earlier insert with that
username is known, i.e., until the ghost connection rolls back or
commits.


How do I determine which connection is the ghost connection, and how do 
I tell PG to kill it? Also, is it an OS setting to determine the timeout 
on open but inactive connections, or a PG setting?


- Naz.
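
One way to find the ghost (a sketch, not from the thread; the catalog
column names are the 8.x ones, and superuser is assumed):

    -- the stale session shows up as <IDLE> in transaction
    SELECT procpid, usename, current_query, query_start
      FROM pg_stat_activity;

    -- cancel whatever it is doing; pg_terminate_backend only arrived
    -- in 8.4, so killing the backend itself means a kill <procpid>
    -- from the shell
    SELECT pg_cancel_backend(12345);  -- hypothetical pid from above

As for the timeout, the tcp_keepalives_* settings (available from 8.1)
control how quickly the server notices that a client has vanished.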



Re: [GENERAL] vista

2006-09-19 Thread Naz Gassiep



It's the folks who think that non-Windows-using developers should care
about Vista that bug me.  This is open-source code, people.  Scratch
your own itch.
  
The "scratch your own itch" line can only be pushed so far if it is
being said by a developer who works on a project that desires to be
taken seriously by professionals in industry. For minor features, yes,
perhaps it could be argued that the core team could ignore certain
issues and just wait for a patch. For something like Vista
compatibility, if you want to be taken seriously by anyone who uses
Windows (hands up anyone who knows a Windows user), "scratch your own
itch" is not really going to cut it, IMHO. I'm used to getting that line
when talking to two-developer obscure projects that have a userbase of
half a dozen, but for a project like PostgreSQL, the "they tell you to
do it yourself" brush is one we do NOT want to get tarred with.


If we don't have the resources to cope with a Vista port immediately,
then so be it. If it's low priority, so be it. However, let's not appear
to deride as unnecessary that which we cannot immediately provide a
solution to. That's small-time project mentality.


- Naz.



Re: [GENERAL] vista

2006-09-19 Thread Naz Gassiep




So what do you suggest? -core vote and order someone to do the work?
Postgresql.org isn't a business and doesn't employ any developer - we
only have the option of accepting patches from people/companies with
itches.
  
I don't suggest any change to any structures in place; it's a purely PR
point.


"That's important and we acknowledge the need."

Even in the absence of any progress on that item, a statement like this
sounds better to PHBs than:

"If you need it, submit a patch."

Regards,
- Naz.



Re: [GENERAL] vista

2006-09-19 Thread Naz Gassiep



So...

If you're not a PostgreSQL Win32 port dev, and *don't know* what they're
up to as far as Vista, why respond to the Q?  Or why respond "fix it
yourself" instead of "ask this guy" or "nobody here will know yet" or
"post your query on -ports or -hackers"?
Precisely. My point is not that people *should* care about Win32, or
that some coercive system should be put into place to force devs onto
RFEs, or even that we should commit to having something ready at all,
ever. It's a point of apparent attitude. Responding with a useless
answer is *worse* than simply ignoring the question. I am not a Win32
user (at least not on servers), but if I wanted to know whether Vista
compatibility was being worked on, getting "if you want it, go do it"
would be even less useful than an effort to convince me to run the DB on
a *nix back end. I'd rather have someone tell me how and why to migrate
to a better OS.

Frankly, I too could care less about PG on Vista.  Longhorn isn't due
until Vista SP1, so PG support has a long time to go before it's a real
concern.  But then I didn't try to answer the question.

Same here. It's a matter of the apparent attitude of the community to
outsiders or new users. Even with something like future Win32 support,
I'd rather see people being told "that issue is not important to our
project because x, y and z" than "go do it yourself".


Anyway, I seem to have kicked up a bit of a hornet's nest here so I'll 
shut up now.


- Naz.



[GENERAL] Changing schemas

2006-09-18 Thread Naz Gassiep
   Is there a way to change the schema that all objects are in? 
Essentially I want to move everything currently in the database into 
public rather than having the complex schemas that I have at the moment. 
They are unnecessary and the DB complexity is trivial, so using schema 
partitioning is more trouble than it is worth. Is there a way I can do 
this? I tried editing a dump, but that has proven to be a little 
difficult, as I am not a master of regex.


If I can do it from within the DB before I do the dump that'd be easier. 
Is there a way to do this from within the DB?


Thanks,
- Naz.
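
For what it's worth, a sketch (not from the thread): on 8.1 and later,
tables and several other object types can be moved without editing a
dump at all, e.g. for a hypothetical table app.users:

    ALTER TABLE app.users SET SCHEMA public;
    -- indexes follow the table; repeat for each table, sequence, etc.

The list of objects to move can be generated from pg_tables or
information_schema.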



[GENERAL] Atomicity?

2006-08-28 Thread Naz Gassiep
I am getting an error that I think I understand, but that I didn't think 
should happen.


Below is the output from psql that I am getting to trigger this error. 
If the violation of the constraint really is being caused WITHIN the 
query, doesn't that violate the principle of atomicity? I.e., operations 
and entities should be considered a single entire construct rather than 
a collection of smaller, discrete parts. Or do I have my understanding 
all wrong?


In any case, how do I get around this problem?

Regards,
- Naz.


conwatch=# \d replies;
                                Table "conwatch.replies"
  Column   |           Type           |                      Modifiers
-----------+--------------------------+-----------------------------------------------------
 replyid   | integer                  | not null default nextval('replies_replyid_seq'::regclass)
 postid    | integer                  | not null
 lft       | smallint                 | not null
 rgt       | smallint                 | not null
 poster    | integer                  | not null
 posted    | timestamp with time zone | not null default now()
 title     | character varying(100)   | not null
 body      | text                     |
 anonymous | boolean                  | not null default false
Indexes:
    "replies_pkey" PRIMARY KEY, btree (replyid)
    "replies_lft_postid" UNIQUE, btree (lft, postid)
    "replies_rgt_postid" UNIQUE, btree (rgt, postid)
    "replies_lft_index" btree (lft)
    "replies_rgt_index" btree (rgt)
Foreign-key constraints:
    "replies_poster_fkey" FOREIGN KEY (poster) REFERENCES users(userid)
    "replies_postid_fkey" FOREIGN KEY (postid) REFERENCES posts(postid)

conwatch=# SELECT replyid, postid, lft, rgt, title
           FROM replies WHERE postid = 18 ORDER BY lft;

 replyid | postid | lft | rgt |        title
---------+--------+-----+-----+----------------------
      24 |     18 |   1 |  14 | Invisible root post.
      25 |     18 |   2 |   7 | Re: Pronto
      26 |     18 |   3 |   6 | Re: Pronto
      27 |     18 |   4 |   5 | Re: Pronto
      29 |     18 |   8 |  13 | Re: Pronto
      31 |     18 |   9 |  12 | Re: Pronto
      32 |     18 |  10 |  11 | Re: Pronto
(7 rows)

conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt >= 11;
ERROR:  duplicate key violates unique constraint "replies_rgt_postid"
conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 14;
UPDATE 1
conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 13;
UPDATE 1
conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 12;
UPDATE 1
conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 11;
UPDATE 1
conwatch=#



Re: [GENERAL] Atomicity?

2006-08-28 Thread Naz Gassiep






Peter Eisentraut wrote:

Naz Gassiep wrote:

If the violation of the constraint really is being caused
WITHIN the query, doesn't that violate the principle of atomicity?
I.e., operations and entities should be considered a single entire
construct rather than a collection of smaller, discrete parts.

The principle of atomicity merely says that transactions are either
performed entirely or not at all.  If the transaction is not performed,
then there is no violation of atomicity.

conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt >= 11;
ERROR:  duplicate key violates unique constraint "replies_rgt_postid"

This is a well-known deficiency in PostgreSQL.  You will have to work
around it somehow (by changing the query, the schema, or the index).

Do we have an ETA on fixing it? Or is it a long-term outstanding issue
with no ETA as yet?
Thanks for the reply,
- Naz




Re: [GENERAL] Atomicity?

2006-08-28 Thread Naz Gassiep
No, the subsequent UPDATEs were just there to show you they worked... I
was only interested in the failed update, and why it failed. The DB was
consistent before the query, and it would have been after the query, so
I did not understand why the query failed, unless the query made the DB
inconsistent at some point DURING its execution. This seems odd to me,
as queries should not trigger errors like that if the DB is only out of
consistency DURING execution, as long as it is consistent before and
after.

Regards,
- Naz.

Joshua D. Drake wrote:

Naz Gassiep wrote:
I am getting an error that I think I understand, but that I didn't
think should happen.

Below is the output from psql that I am getting to trigger this
error. If the violation of the constraint really is being caused
WITHIN the query, doesn't that violate the principle of atomicity?
I.e., operations and entities should be considered a single entire
construct rather than a collection of smaller, discrete parts. Or do
I have my understanding all wrong?

In any case, how do I get around this problem?

If you do not specify the beginning of a transaction, all statements
are run within their own transaction, e.g. your example actually means:

begin;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt >= 11;
commit;

begin;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 14;
commit;

What you want is:

begin;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 14;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 13;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 12;
UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt = 11;
commit;

Joshua D. Drake






Re: [GENERAL] Atomicity?

2006-08-28 Thread Naz Gassiep
I would like more information on this deficiency and what causes it, so
I know when to anticipate it. This resulted in a rather nasty bug which
took me ages to track down. Is anyone able and willing to explain a
little here, or should I ask in -hackers?

Regards,
- Naz.

Michael Glaesemann wrote:

On Aug 29, 2006, at 4:46 , Peter Eisentraut wrote:

Naz Gassiep wrote:

conwatch=# UPDATE replies SET rgt = rgt + 2 WHERE postid = 18 AND rgt >= 11;
ERROR:  duplicate key violates unique constraint "replies_rgt_postid"

This is a well-known deficiency in PostgreSQL.  You will have to work
around it somehow (by changing the query, the schema, or the index).


One such workaround is:

BEGIN;

UPDATE replies
SET rgt = -1 * (rgt + 2)
WHERE postid = 18
AND rgt >= 11;

UPDATE replies
SET rgt = -1 * rgt
WHERE rgt < 0;

COMMIT;

Michael Glaesemann
grzm seespotcode net
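
The trick, presumably, is that the transient values are all negative, so
they can never collide with an existing positive rgt mid-statement. (As
an aside well past this thread's vintage: PostgreSQL 9.0 added
deferrable unique constraints, which sidestep the problem by deferring
the check to commit; a sketch, with a hypothetical constraint name:

    ALTER TABLE replies ADD CONSTRAINT replies_rgt_postid_deferred
        UNIQUE (rgt, postid) DEFERRABLE INITIALLY DEFERRED;
)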





