Re: [GENERAL] Transactional DDL
Harpreet Dhaliwal wrote: And this feature i.e. transactional DDL is not there in other major RDBMS like sql server, oracle etc? You've had about 50 answers to that question already I think. The answer is No. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Transactional DDL
On 08/15/07 00:05, Harpreet Dhaliwal wrote: > And this feature i.e. transactional DDL is not there in other major > RDBMS like sql server, oracle etc? Define "major". Does it mean "popular" or "used on very large systems"? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
Re: [GENERAL] Transactional DDL
Harpreet Dhaliwal wrote: > So you mean to say something like this as far as oracle is concerned: > > BEGIN > DDL 1 (commits right after its execution) > DDL 2 (commits right after its execution) > END > > That means there's no concept of putting DDL statements in a transaction > in oracle basically, right? Yes. //Magnus
Re: [GENERAL] language interface in postgresql
Trevor Talbot wrote: > On 8/14/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote: > >> Let me fine tune my question here. What I mean to say is the way we can >> write stored procedures in C, perl etc in Postgres specifying the language >> parameter at the end of stored procedure, compared to that, in SQL Server >> 2000 I've seen SP writing in pure SQL only. >> Can you write Stored Procedures in SQL Server 2000 or Oracle in different >> programing languages as well ? > > AFAIK SQL Server 2000 only has a C interface as the other option; CLR > hosting was added in SQL Server 2005. Because the CLR is a virtual > machine that runs compiled bytecode, and compilers for all of the > available languages are not necessarily available at runtime, it > doesn't make sense to specify such code in source form. The process > is more like creating a function in C in PostgreSQL (compile and load > a shared library). Details here, if you're curious: > http://msdn2.microsoft.com/en-us/library/ms345136.aspx > > I don't know what Oracle supports. I believe Oracle supports Java in much the same way MSSQL supports .NET, give or take. And IIRC the method is that you build a DLL on your client and upload it to the server, so no, the source is not specified in the CREATE PROCEDURE call. //Magnus
Re: [GENERAL] Transactional DDL
So you mean to say something like this as far as oracle is concerned: BEGIN DDL 1 (commits right after its execution) DDL 2 (commits right after its execution) END That means there's no concept of putting DDL statements in a transaction in oracle basically, right? Thanks, ~Harpreet On 8/15/07, Scott Marlowe <[EMAIL PROTECTED]> wrote: > > On 8/14/07, Harpreet Dhaliwal <[EMAIL PROTECTED]> wrote: > > So you mean to say DDL statements can't be put in one single transaction > in > > Oracle ? > > You can put them in, but then they will cause the previous DML to be > silently committed >
Re: [GENERAL] Transactional DDL
On 8/14/07, Harpreet Dhaliwal <[EMAIL PROTECTED]> wrote: > So you mean to say DDL statements can't be put in one single transaction in > Oracle ? You can put them in, but then they will cause the previous DML to be silently committed
Re: [GENERAL] language interface in postgresql
"Scott Marlowe" <[EMAIL PROTECTED]> writes: > On 8/14/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> ... I think we probably are unique in being so >> aggressively agnostic about what the function language is. That's >> not necessarily all good, as it's driven us to invent curiosities >> like dollar-quoting to avoid having to mesh lexical details of the >> function language and the outer SQL language. > Well, I for one LOVE $$ quoting in the newer versions of pgsql. > Having statements with value in the, was downright > annoying in 7.4... You bet, but the reason why it was like that was we insisted on the function body being a string literal in the eyes of the outer CREATE FUNCTION command. Anytime the function's language had the same ideas about string quoting and escaping rules as the outer SQL language does, you were in for some recursively bad experiences. Dollar-quoting is a cute technical solution to that, but you can't deny that it's simpler if you just restrict the function language to be SQL-ish so that CREATE FUNCTION can parse it without any interesting quoting rules. So sayeth Oracle and the SQL standards committee, anyway. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Transactional DDL
And this feature i.e. transactional DDL is not there in other major RDBMS like sql server, oracle etc? thanks ~Harpreet On 8/15/07, Tom Lane <[EMAIL PROTECTED]> wrote: > > "Harpreet Dhaliwal" <[EMAIL PROTECTED]> writes: > > I read a few lines about SP compilation in postgres > > > http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci1179016,00.html > > > Is this what the Transactional DDL feature of postgresql talks about ? > > I'd say it's one very small aspect of what's involved in that. > > Updates of stored procedures are a relatively trivial matter, because a > procedure is defined by just a single catalog entry (one row in > pg_proc). So either you see the new version or you see the old version, > not much to talk about. The DDL updates that are really interesting > ... at least from an implementor's standpoint ... are the ones that > involve coordinated changes to multiple catalog entries and some > underlying filesystem files as well. In other words, ALTER TABLE. > There are not that many other systems that can choose to commit or roll > back an arbitrary collection of ALTER TABLE commands. > > This doesn't come for free of course. What it mostly costs you in > Postgres-land is transient disk space requirements, since we have to > store both the "before" and "after" states until commit/rollback. > >regards, tom lane >
Re: [GENERAL] Transactional DDL
"Harpreet Dhaliwal" <[EMAIL PROTECTED]> writes: > I read a few lines about SP compilation in postgres > http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci1179016,00.html > Is this what the Transactional DDL feature of postgresql talks about ? I'd say it's one very small aspect of what's involved in that. Updates of stored procedures are a relatively trivial matter, because a procedure is defined by just a single catalog entry (one row in pg_proc). So either you see the new version or you see the old version, not much to talk about. The DDL updates that are really interesting ... at least from an implementor's standpoint ... are the ones that involve coordinated changes to multiple catalog entries and some underlying filesystem files as well. In other words, ALTER TABLE. There are not that many other systems that can choose to commit or roll back an arbitrary collection of ALTER TABLE commands. This doesn't come for free of course. What it mostly costs you in Postgres-land is transient disk space requirements, since we have to store both the "before" and "after" states until commit/rollback. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Transactional DDL
So you mean to say DDL statements can't be put in one single transaction in Oracle ? On 8/15/07, Scott Marlowe <[EMAIL PROTECTED]> wrote: > > On 8/14/07, Harpreet Dhaliwal <[EMAIL PROTECTED]> wrote: > > > > Hi, > > I read a few lines about SP compilation in postgres > > > > > http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci1179016,00.html > > > > 1. stored procedure compilation is transactional. > > "You can recompile a stored procedure on a live system, and only > > transactions starting after that compilation will see the changes," he > said. > > "Transactions in process can complete with the old version. Oracle just > > blocks on the busy procedure." > > > > Is this what the Transactional DDL feature of postgresql talks about ? > > That's just one of the DDLs that postgresql can handle in a > transaction. Basically, create / drop database and create / drop > tablespace aren't transactable. Anything else is fair game. Note > that wrapping alter table or reindex or truncate in a long running > transaction will likely lock the table for an unacceptable period of > time. But, putting a migration script that includes DDL and DML > together and wrapping it in begin; commit; pairs means that either it > all goes or none does, and the locks on alter table etc are only held > for the period it takes the migration script to run. > > Oracle's lack of transactable DDL means you HAVE to take your system > down and have rollback scripts ready to go should your migration fail. > Having worked with both databases, I can honestly say this is one of > the areas PostgreSQL seriously beats Oracle in terms of usefulness. >
Re: [GENERAL] language interface in postgresql
On 8/14/07, Tom Lane <[EMAIL PROTECTED]> wrote: > "Jasbinder Singh Bali" <[EMAIL PROTECTED]> writes: > > Let me fine tune my question here. What I mean to say is the way we can > > write stored procedures in C, perl etc in Postgres specifying the language > > parameter at the end of stored procedure, compared to that, in SQL Server > > 2000 I've seen SP writing in pure SQL only. > > Ah. I thought you were talking about client interface libraries in > different languages, which are surely a dime a dozen. As far as > server-side functions go, we might be unique in offering so many > languages to work in. I think we probably are unique in being so > aggressively agnostic about what the function language is. That's > not necessarily all good, as it's driven us to invent curiosities > like dollar-quoting to avoid having to mesh lexical details of the > function language and the outer SQL language. Well, I for one LOVE $$ quoting in the newer versions of pgsql. Having to escape every quote inside statements was downright annoying in 7.4... I think the real issue with all the pl languages is the variations in quality of implementations out there. And some don't always make it to the next release, or if they do they lag behind. I believe plsh was quite late to show up for 8.0 back in the day... If you're relying on a particular pl{lang} you kinda want to check to see if it's still working on the latest pg version before upgrading.
Re: [GENERAL] Transactional DDL
On 8/14/07, Harpreet Dhaliwal <[EMAIL PROTECTED]> wrote: > > Hi, > I read a few lines about SP compilation in postgres > > http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci1179016,00.html > > 1. stored procedure compilation is transactional. > "You can recompile a stored procedure on a live system, and only > transactions starting after that compilation will see the changes," he said. > "Transactions in process can complete with the old version. Oracle just > blocks on the busy procedure." > > Is this what the Transactional DDL feature of postgresql talks about ? That's just one of the DDLs that postgresql can handle in a transaction. Basically, create / drop database and create / drop tablespace aren't transactable. Anything else is fair game. Note that wrapping alter table or reindex or truncate in a long running transaction will likely lock the table for an unacceptable period of time. But, putting a migration script that includes DDL and DML together and wrapping it in begin; commit; pairs means that either it all goes or none does, and the locks on alter table etc are only held for the period it takes the migration script to run. Oracle's lack of transactable DDL means you HAVE to take your system down and have rollback scripts ready to go should your migration fail. Having worked with both databases, I can honestly say this is one of the areas PostgreSQL seriously beats Oracle in terms of usefulness.
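[A minimal sketch of the migration pattern described above; table, column, and index names are invented for illustration:]

```sql
BEGIN;

-- DDL and DML together: if any statement fails, everything rolls back
-- and the schema is left untouched.
ALTER TABLE customers ADD COLUMN region text;
UPDATE customers SET region = 'unknown';
CREATE INDEX customers_region_idx ON customers (region);

COMMIT;  -- the ALTER TABLE lock is held only until this point
```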
Re: [GENERAL] language interface in postgresql
On 8/14/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote: > Let me fine tune my question here. What I mean to say is the way we can > write stored procedures in C, perl etc in Postgres specifying the language > parameter at the end of stored procedure, compared to that, in SQL Server > 2000 I've seen SP writing in pure SQL only. > Can you write Stored Procedures in SQL Server 2000 or Oracle in different > programing languages as well ? AFAIK SQL Server 2000 only has a C interface as the other option; CLR hosting was added in SQL Server 2005. Because the CLR is a virtual machine that runs compiled bytecode, and compilers for all of the available languages are not necessarily available at runtime, it doesn't make sense to specify such code in source form. The process is more like creating a function in C in PostgreSQL (compile and load a shared library). Details here, if you're curious: http://msdn2.microsoft.com/en-us/library/ms345136.aspx I don't know what Oracle supports. I'm certainly not familiar with all of the databases out there, but I can't think of any offhand that support different languages in the same form as PostgreSQL. Aside from VM platforms like the CLR, I can't think of any with the same range either. Most databases seem to pick a specific language or two in script-style form, and possibly support binary extensions (written in C or some other native language) for work beyond that.
Re: [GENERAL] language interface in postgresql
"Jasbinder Singh Bali" <[EMAIL PROTECTED]> writes: > Let me fine tune my question here. What I mean to say is the way we can > write stored procedures in C, perl etc in Postgres specifying the language > parameter at the end of stored procedure, compared to that, in SQL Server > 2000 I've seen SP writing in pure SQL only. Ah. I thought you were talking about client interface libraries in different languages, which are surely a dime a dozen. As far as server-side functions go, we might be unique in offering so many languages to work in. I think we probably are unique in being so aggressively agnostic about what the function language is. That's not necessarily all good, as it's driven us to invent curiosities like dollar-quoting to avoid having to mesh lexical details of the function language and the outer SQL language. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Blobs in Postgresql
> So turning to Postgresql, can I get any recommendations, suggestions and > tips on blob handling in the database? The image sizes will be pretty > variable, from a few kilobytes to several hundred megabytes, so I need > something that will handle the various file sizes, hopefully transparently. > PostgreSQL's BLOB implementation works well. We used it without any problems with images from 20K-30M. Regards Pavel Stehule
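[For the archives: the two usual options are bytea columns and large objects. A sketch; the table names and file path are invented for illustration:]

```sql
-- Option 1: bytea -- the image is an ordinary column value.  Simple,
-- but the whole value passes through memory at once, so it suits the
-- smaller end of the size range.
CREATE TABLE images (
    id   serial PRIMARY KEY,
    name text,
    data bytea
);

-- Option 2: large objects -- read and written in chunks, which is
-- friendlier to files in the hundreds of megabytes.  lo_import()
-- runs on the server and returns the OID of the new large object.
CREATE TABLE image_refs (
    id      serial PRIMARY KEY,
    name    text,
    img_oid oid
);
INSERT INTO image_refs (name, img_oid)
    VALUES ('scan-001', lo_import('/tmp/scan-001.png'));
```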
Re: [GENERAL] Custom functions for default values for columns on insert
"Lim Berger" <[EMAIL PROTECTED]> writes: > create table test (id serial primary key, nowd timestamp without time > zone, processed_id varchar(10)); > create or replace rule test_ins as on insert to test > DO UPDATE test > SET processed_id = MYFUNCTION(NEW.id) > WHERE id = NEW.id > ; I think you are getting burnt by the fact that a rule is a macro and therefore subject to multiple-evaluation-of-arguments hazards. In particular, the reference to NEW.id probably results in an extra evaluation (or two?) of nextval() on the serial sequence. Even if this worked, it'd be horrendously inefficient, because of having to apply the full machinery of UPDATE to fix up the row. Instead you should use a BEFORE INSERT trigger to apply the change to the NEW record before it ever gets stored. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] language interface in postgresql
Hi, Let me fine tune my question here. What I mean to say is the way we can write stored procedures in C, perl etc in Postgres specifying the language parameter at the end of stored procedure, compared to that, in SQL Server 2000 I've seen SP writing in pure SQL only. Can you write Stored Procedures in SQL Server 2000 or Oracle in different programing languages as well ? Thanks ~Jas On 8/15/07, Trevor Talbot <[EMAIL PROTECTED]> wrote: > > On 8/14/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote: > > > I was wondering if any other database has language interface for > different > > programing languages such as C , perl , python etc the way postgresql > has. > > Assuming you mean stored procedures, Microsoft's SQL Server hosts the > CLR, which means pretty much any language capable of targeting .NET > will work. The mechanisms behind this are quite involved, to > guarantee different levels of security and reliability; it's rather > interesting to dig into. > > It is just one platform though, so might not be quite what you had in mind > :) >
Re: [GENERAL] language interface in postgresql
On 8/14/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote: > I was wondering if any other database has language interface for different > programing languages such as C , perl , python etc the way postgresql has. Assuming you mean stored procedures, Microsoft's SQL Server hosts the CLR, which means pretty much any language capable of targeting .NET will work. The mechanisms behind this are quite involved, to guarantee different levels of security and reliability; it's rather interesting to dig into. It is just one platform though, so might not be quite what you had in mind :)
[GENERAL] Custom functions for default values for columns on insert
The "default" values of a column during table definition do not accept values generated by passing another column's value through my own function. So I try to do this with a rule as follows. The name of my function in this example is MYFUNCTION. drop table test cascade; create table test (id serial primary key, nowd timestamp without time zone, processed_id varchar(10)); create or replace rule test_ins as on insert to test DO UPDATE test SET processed_id = MYFUNCTION(NEW.id) WHERE id = NEW.id ; insert into test (nowd) values (current_timestamp); insert into test (nowd) values (now()); select * from test; This results in the "processed_id" column coming up blank. What am I doing wrong? How can I make sure that upon insert of a row, the value of one column ("id in my example) is used to immediately generate the value of another column ("process_id" in my example). ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] Transactional DDL
Hi, I read a few lines about SP compilation in postgres http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci1179016,00.html

1. stored procedure compilation is transactional.
"You can recompile a stored procedure on a live system, and only transactions starting after that compilation will see the changes," he said. "Transactions in process can complete with the old version. Oracle just blocks on the busy procedure."

Is this what the Transactional DDL feature of postgresql talks about ? Thanks ~Harpreet
[GENERAL] language interface in postgresql
Hi, I was wondering if any other database has language interface for different programing languages such as C , perl , python etc the way postgresql has. Thanks, Jas
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 8/14/07, Phoenix Kiula <[EMAIL PROTECTED]> wrote: > Thank you for this detailed explanation Alban. But I want to include > FK constraints on a table2 on a column in the referenced table1 where > column values are not unique. > > I just want row data to be consistent for the same ID. Yes, this is > repetitive and enough to rile DB purists, but it has its uses (for > performance in certain reporting queries). Then you need to define a lookup table, and have both of your tables reference it by foreign key. You can create an update trigger on one of the child tables to put a row into the lookup table if it doesn't exist. If I'm not clear, let me know. > 1. Should I explore views for this? I am very skeptical about them > coming from MySQL as the performance of MySQL views is horrendous. > Besides, if they are updated everytime, there's little use for a view > in the first place, I may as well simply query the table -- or is this > wrong? The UPDATE only locks and commits to the table, and then the > view gets auto updated? In pgsql, views are actually empty tables that are defined by a SQL statement and fired by rules when you select from the view. I.e. create view abc as select * from xyz becomes an empty table abc which has a rule for selects that runs select * from xyz when you access it. The performance of select * from abc will be almost exactly the same as select * from xyz, except for some very small overhead from the rules engine. The real uses for views are to allow you to reduce query complexity in the client. Suppose you have a query that joins and / or unions a dozen tables with really complex join logic. You can just wrap it in a view, and when you select from the view, postgresql will execute the real query behind it as though you passed it in. > 2. Or, I could do this with triggers, and now I realize also with > "rules" (CREATE RULE). Which are faster, rules or triggers, are they > similar in speed? Basically I want the rule/trigger to cascade the > update to table1.col1 and table1.col2 to similar columns in table2. I think you're talking about updatable views, which you can build with postgresql. Rules can let you do this pretty easily. > I just wanted to know what the usual thoughts on this are. Are views > updated as soon as its underlying table(s) are updated? Yep, because views are just enclosed queries. Note that you CAN do materialized views with pgsql. Once you've gotten familiar with regular postgresql stuff, look up materialized views for postgresql again on Google. They're not that hard really, but most of the time you really don't need them. > Would appreciate any thoughts on performance of views. PGSQL seems to > treat views just like tables, so I wonder if there's any performance > gain! Nope, but no great loss either.
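[The "wrap complex joins in a view" point above, as a sketch; the table names are invented for illustration:]

```sql
-- The complex join logic lives in the view definition...
CREATE VIEW order_summary AS
    SELECT o.id, c.name, sum(i.amount) AS total
      FROM orders o
      JOIN customers c   ON c.id = o.customer_id
      JOIN order_items i ON i.order_id = o.id
     GROUP BY o.id, c.name;

-- ...and clients query it like a table.  Each SELECT re-executes the
-- underlying query, so results always reflect the current base tables.
SELECT * FROM order_summary WHERE total > 100;
```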
Re: [GENERAL] query help
On 8/14/07, Kirk Wythers <[EMAIL PROTECTED]> wrote: > > I need some help with rewriting a query. I have a fairly complicated query > (for me anyway) that dumps daily climate data, filling in missing data with > monthly averages (one line per day). > > I want to output monthly averages (one line per month). I am having a hard > time wrapping my head around this. Particularly how to deal with the doy > column (day of year). I have tried several approaches and my forehead is > starting to get my keyboard bloody. I think this came up on IRC today, so perhaps this is only for the archives' sake, but you want to do something like this. Assuming you have a table as follows:

CREATE TABLE climate_data (
    measurement_time timestamp,
    measurement_value integer);

...and you insert data into it regularly, you can get the average measurement over a period of time with date_trunc(), which will truncate a date or timestamp value to match whatever precision you specify. For example, see the following:

eggyknap=# select date_trunc('month', now());
       date_trunc
------------------------
 2007-08-01 00:00:00-06
(1 row)

Note: the -06 at the end means I'm in mountain time. So if you want to get the average measurement over a month's time, you need to do something like this:

SELECT DATE_TRUNC('MONTH', measurement_time), AVG(measurement_value)
FROM climate_data
GROUP BY DATE_TRUNC('MONTH', measurement_time);

This will chop all the measurement_time values down to the month the measurement was taken in, put all measurements in groups based on the resulting value, and take the average measurement_value from each group. - Josh
[GENERAL] Blobs in Postgresql
Hi all- I am evaluating databases for use in a large project that will hold image data as blobs. I know, everybody says to just store pointers to files on the disk...can't do it here...the images are of a confidential nature and access to the database (and resulting client app) will be highly restricted. The underlying platform will likely be Linux though Solaris x86-64 has been suggested as well. I did some tests with MySQL and found the results very sub-par...the standard blob field only holds 64k (they have three types of blobs for whatever reason) and the real problem is that my uploads and downloads have failed because of packet size issues...this can be solved somewhat with server settings, but I get the impression that blobs are barely supported. So turning to Postgresql, can I get any recommendations, suggestions and tips on blob handling in the database? The image sizes will be pretty variable, from a few kilobytes to several hundred megabytes, so I need something that will handle the various file sizes, hopefully transparently. Thanks for any info, Ron
Re: [GENERAL] "Out of memory" errors..
> If this is only a PostgreSQL database server, don't limit the postgres user. > Don't tweak these limits unless you know exactly what you are doing. Unfortunately, it is not. It has other applications, including Apache and so on. I tried not setting the ulimits at all, but it seems to be required for the system (by other requirements). So I would like to know optimal mappings between ulimits and postgres. > PS: "maintenance_work_mem" is completely unrelated to "max user processes" > or "open files", it's related to the allowed memory size. > Sorry, but this was suggested in this thread earlier. So how should I make sure that a vacuum analyze on slightly large tables is allowed without running out of memory? Would "shared_buffers" in the conf be relevant? I doubt it.
Re: [GENERAL] Trigger not working as expected, first row gets a null value
Thanks, triggers are new turf for me so nothing is quite obvious. I thought the after insert change is snuck in after the rec is created and via some built in behavior allows mods to the rec without triggering a cascade. But the reality makes sense and of course I can adjust my work to handle it in a different manner. I really appreciate you postgres pros taking the time to respond to the various newb questions that get posed here. Tom Lane-2 wrote: > > novnov <[EMAIL PROTECTED]> writes: >> The function is meant to find a value via the internal sql, and then >> update >> a column in the same tabled that invoked the trigger with that value. Is >> that a non-workable scenario? IE, an after insert trigger can't modify a >> column in the same table that was updated? > > Um, if you're trying to do something by modifying the NEW record, that's > correct --- the insertion was already done and you can't change it > retroactively. You could issue a whole new INSERT or UPDATE command, > but beware of creating an infinite loop of trigger firings. It'd be > pretty inefficient anyway since it'd double the work involved for any > change. If you want to modify the row that's going into the table, far > better to do it in a before trigger. > > The design rule of thumb is "check or modify the new row in a BEFORE > trigger, propagate the info to other tables in an AFTER trigger". > > regards, tom lane
Re: [GENERAL] Insert or Replace or \copy (bulkload)
Ow Mun Heng <[EMAIL PROTECTED]> writes: > Ps : Is it this list's norm to have the OP/sender in the "to" list and > mailing list on the "CC" list? Yes. If you don't like that you can try including a "Reply-To: " header in what you send to the list; or perhaps better, I think there's a way to tell the list bot to not send you copies of messages you're cc'd on. But the old-timers are all accustomed to this behavior. The PG lists are big enough that turnaround time is O(minutes) even when everything is going well, and sometimes it isn't --- so direct cc's to the people involved in a specific thread help keep the conversation moving. Also, some of us find a backup copy sent via the list to be a Good Thing. I for one have a reputation of running spam filters that eat pets and small children ... so if you want to be sure to get through to me, don't forget to cc: the list. regards, tom lane
Re: [GENERAL] Compound Indexes
"Phoenix Kiula" <[EMAIL PROTECTED]> writes: > Thank you so much! My only concern, probably a hangover from the MySQL > world, is that if I have 5 of 6 indices, what would that do to INSERT > and UPDATE performance if all these indices have to be updated? Is > updating individual indices faster than one large compound index? Probably slower, but the compound index is enough more specialized that it's not as useful as the individual indexes. (I suspect that that compound index in MySQL didn't do you nearly as much good as you thought, either.) You should read http://www.postgresql.org/docs/8.2/static/indexes-bitmap-scans.html and nearby sections of the manual. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Insert or Replace or \copy (bulkload)
On Tue, 2007-08-14 at 10:16 -0500, Scott Marlowe wrote: > On 8/14/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote: > > > > In MySql, I was using mysqlimport --replace which essentially provided > > the means to load data into the DB, while at the same time, would > > provide the necessary logic to replace the entire row if there was a > > duplicate instead of dying. > > Anyway, I found a workaround, but, to me, even though it provides a > > means to an end, it still looks like it'll end up as a maintenance > > nightmare each time a table has any additional columns added. > > > > Solution is taken from this site: > > > > http://www.pointwriter.com/blog/index.php?/archives/6-REPLACE-in-PostgreSQL.html > > Example code snipped for brevity > > > Can anyone tell me if this won't turn out to be a maintenance nightmare? > > So, the pertinent question is, is there a better mousetrap available? > > I don't see why it would be a maintenance nightmare. Looks pretty > much like you just create it and go. Once it's in place it should > just work. There are other ways to skin this particular cat, but that > one seems as good as any. That would be true only if I didn't have to (remember to) alter the rule each time a new column is added. At the rate of things, it might be quite an often procedure. (unless of course, I script it, which is an idea by itself) Ps : Is it this list's norm to have the OP/sender in the "to" list and mailing list on the "CC" list?
Re: [GENERAL] Trigger not working as expected, first row gets a null value
novnov <[EMAIL PROTECTED]> writes: > The function is meant to find a value via the internal sql, and then update > a column in the same tabled that invoked the trigger with that value. Is > that a non-workable scenario? IE, an after insert trigger can't modify a > column in the same table that was updated? Um, if you're trying to do something by modifying the NEW record, that's correct --- the insertion was already done and you can't change it retroactively. You could issue a whole new INSERT or UPDATE command, but beware of creating an infinite loop of trigger firings. It'd be pretty inefficient anyway since it'd double the work involved for any change. If you want to modify the row that's going into the table, far better to do it in a before trigger. The design rule of thumb is "check or modify the new row in a BEFORE trigger, propagate the info to other tables in an AFTER trigger". regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
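[Editor's illustration] A minimal sketch of the rule of thumb Tom describes — modify the new row in a BEFORE trigger by assigning into NEW and returning it (table, column, and lookup names are hypothetical):

```sql
CREATE OR REPLACE FUNCTION fill_derived() RETURNS trigger AS $$
BEGIN
    -- Look up the value and place it into the row being inserted;
    -- returning the modified NEW makes the change take effect.
    SELECT l.some_value INTO NEW.derived_col
      FROM lookup l
     WHERE l.id = NEW.lookup_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER fill_derived_trg
    BEFORE INSERT ON mytable
    FOR EACH ROW EXECUTE PROCEDURE fill_derived();
```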
Re: [GENERAL] MVCC cons
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/14/07 20:36, Merlin Moncure wrote: [snip] > > PostgreSQL wins in terms of better concurrency (especially in long > transactions or transactions that touch a lot of records), cheap > rollbacks, and all the advantages of a sophisticated locking engine > (transactional ddl for example). Although MVCC is not a /sine qua non/ of transactional ddl. - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGwl4GS9HxQb37XmcRAhm5AJ9pSZR08Vj5aOYtMnQdQaPsjtPMOQCghP+v k81CxBZyH/42wHjPeT0Azvc= =WRv0 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] more select-for-update questions
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Ed L. wrote: >> I ran a simple select-for-update test on 8.1.2 and was curious as >> to why the semantics are what they are. Specifically, when you >> have multiple select-for-update-limit queries on the same rows, >> why are rows selected by the blocked query before knowing if >> some of those rows will be removed/eliminated by a preceding >> select-for-update-limit? > This is how it is just because of "historical reasons", i.e. it's a > known misfeature that no one has bothered to fix. Not so much not bothered, as that the backwards-compatibility issues seem a bit scary. If we change this behavior we could easily break more apps than we help. The implementation reason why it's like that is that FOR UPDATE filtering is handled in the top-level executor code (execMain.c) while LIMIT is a plan node type. To change it we'd need to make the FOR UPDATE filter into a plan node type that we could put underneath LIMIT instead of atop it. I occasionally think about doing that as a means for supporting FOR UPDATE in sub-SELECTs, but the real issue with that whole idea is that we don't promise a darn thing about how many times a join input relation will be read or how far it will be read or in what order. So the semantic effect of FOR UPDATE in a sub-SELECT, in terms of exactly which rows will get locked, seems impossible to state precisely. Or to put it more plainly: right now, we lock only rows that we are about to return to the client. So "which rows get locked" is exactly as well-defined as the query as a whole is. As soon as we push the locking further down into the plan, there's a bunch of unspecified implementation behaviors that will affect which rows get locked, and it's very likely that some will get locked that have nothing to do with any row that's returned to the client. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
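[Editor's illustration] The behavior under discussion is easiest to see with two concurrent sessions (table and values are hypothetical):

```sql
-- Session A locks five rows and holds its transaction open:
BEGIN;
SELECT * FROM jobs WHERE state = 'pending' ORDER BY id LIMIT 5 FOR UPDATE;

-- Session B runs the same query and blocks. Because LIMIT is applied
-- before the FOR UPDATE filtering Tom describes, B's five rows were
-- chosen before knowing whether A would consume them; after A commits
-- changes to those rows, B may return fewer than five rows rather
-- than the "next" five.
BEGIN;
SELECT * FROM jobs WHERE state = 'pending' ORDER BY id LIMIT 5 FOR UPDATE;
```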
Re: [GENERAL] Compound Indexes
On 15/08/07, Chris Browne <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] ("Phoenix Kiula") writes: > > I have a table with ten columns. My queries basically use one column as > > the first WHERE condition, so an index on that column is certain. But > > the columns after that one vary depending on end-user's choice (this > > is a reporting application) and so does the sorting order. > > > > In MySQL world, I had sort_buffer in the config file, and I made a > > compound index with the columns most often used in these types of > > queries. So my index looked like: > > > > INDEX idx_trades(id, t_id, c_id, s_id, t_brief, created_on); > > > > This has six columns in it. While reading the pgsql documentation, I > > gather that anything beyond three columns offers diminishing benefits. > > > > My queries will look like these: > > > >SELECT * from trades where id = 9 > >and c_id = > >ORDER by s_id; > > > >SELECT * from trades where id = 9 > >and s_id = 0 > >ORDER by created_on desc ; > > > >SELECT * from trades where id = 9 > >and s_id = 0 > >and t_brief ~* 'more|than|one|word' > >ORDER by created_on desc ; > > > > So my question: how does PGSQL optimize its sorts? If I were to index > > merely the columns that are most commonly used in the reporting WHERE > > clause, would that be ok? Some of these columns may be "TEXT" type -- > > how should I include these in the index (in MySQL, I included only the > > first 100 words in the index).
> > If you have only these three sorts of queries, then I would speculate > that the following indices *might* be useful: > > create index idx1 on trades (id); > create index idx2 on trades (c_id); > create index idx3 on trades (s_id); > create index idx4 on trades (created_on); > create index idx5 on trades (created_on) where t_brief ~* 'more|than|one|word'; > create index idx6 on trades (id, s_id) where t_brief ~* 'more|than|one|word'; > > (I'm assuming with idx5 and idx6 that you were actually searching for > 'more|than|one|word'; if what is searched for can vary, then idx5/idx6 > are worthless.) > > You could try adding them all, and check out which of them are > actually used by the query planner. And eventually drop out the > irrelevant ones. > > PostgreSQL has a rather sophisticated query planner (pretty much > "rocket science," compared to MySQL), and it is even possible that it > would use multiple of those indices simultaneously for some of the > queries. Which indexes, if any, it will use will vary from query to > query based on the parameters in the query. Thank you so much! My only concern, probably a hangover from the MySQL world, is that if I have 5 or 6 indices, what would that do to INSERT and UPDATE performance if all these indices have to be updated? Is updating individual indices faster than one large compound index? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
"madhtr" <[EMAIL PROTECTED]> writes: > From: "Tom Lane" <[EMAIL PROTECTED]> >> ... although it takes a good long while (several seconds) because of the >> "sleep(1)" in the interaction with the postmaster. > Sleep(1) should be only 1/1000 of a second. I do that so I don't hammer the > processor with my while loop when i am not using a select(). Ah. I was interpreting it in Unix terms, where sleep() measures in seconds. With a wait of a few msec it might not be too intolerable. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] MVCC cons
On 8/14/07, Kenneth Downs <[EMAIL PROTECTED]> wrote: > RPK wrote: > > I want to know whether MVCC has cons also. Is it heavy on resources? How > > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. > > > > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and > update) appears on my systems to be almost exactly linear to row size. > Inserting 1000 rows into a table with row size 100 characters takes > twice as long as inserting 1000 rows into a table with row size 50 > characters. You were half right. Inserts in PostgreSQL perform similar to other databases (or at least, use similar mechanisms). It's the updates that suffer, because this translates to delete + insert essentially. Databases that use simple locking strategies can simply update the record in place. PostgreSQL wins in terms of better concurrency (especially in long transactions or transactions that touch a lot of records), cheap rollbacks, and all the advantages of a sophisticated locking engine (transactional ddl for example). merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
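[Editor's illustration] The update-as-delete-plus-insert behavior described above can be observed directly via the system column ctid, which identifies a row version's physical location:

```sql
CREATE TABLE t (id int PRIMARY KEY, x int);
INSERT INTO t VALUES (1, 0);

SELECT ctid FROM t WHERE id = 1;  -- some location, e.g. (0,1)

UPDATE t SET x = 1 WHERE id = 1;

SELECT ctid FROM t WHERE id = 1;  -- a different ctid: the UPDATE wrote
                                  -- a new row version rather than
                                  -- changing the old one in place.
-- The old version remains on disk until VACUUM reclaims it.
```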
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
- Original Message - From: "Tom Lane" <[EMAIL PROTECTED]> To: "madhtr" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, August 14, 2007 18:50 Subject: Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll "madhtr" <[EMAIL PROTECTED]> writes: clearly I am going to have to study this documentation more carefully ... So ... for simplicity's sake, If I just do the following, how do I get back "database does not exist" ? [ shrug... ] Your program works perfectly for me: $ ./testit PQerrorMessage(lpcn) returns: FATAL: database "nonexistantdb" does not exist PQstatus(lpcn) returns 1 $ ... although it takes a good long while (several seconds) because of the "sleep(1)" in the interaction with the postmaster. hmm ... TY, must be my version or something like you state further down. Sleep(1) should be only 1/1000 of a second. I do that so I don't hammer the processor with my while loop when i am not using a select(). Maybe your problem is not with the program, but with the postmaster you're trying to connect to? Does psql work? yep, good thought. psql command line works fine, it's synchronous though. source: http://www.postgresql.org/docs/7.3/static/libpq-connect.html Another line of thought, given the reading-between-the-lines conclusion that you are trying to use PG 7.3 libraries on Windows, is that there was something broken in the async-connect code back then on that platform. If you really are trying to do that, do yourself a favor and move to 8.0 or later. Nobody's going to be very interested in fixing 7.3. (I did try your program with 7.3 on Unix, though, and it seemed fine except the error message was spelled a bit differently.) Ty, I'll check that ... :) I was also under the assumption that I would not need to perform my own selects on the underlying socket, and that whatever I got back would be either a null pointer, a successful connection pointer, or a broken connection pointer with an error indication.
You don't *have* to perform selects on the underlying socket, but if you are not multiplexing this activity with some other I/O, I have to wonder why you are bothering with an asynchronous connect at all. What you've got at the moment is a poorly-implemented equivalent of PQconnectdb(). yep, the little simplified program is fairly pointless... but in RL, i will pass a pointer to a cancel flag ...

    void connect(bool* lpcancel){
        while (!*lpcancel && trying2connect){
            // yadda yadda yadda
        };
    };

so that the user can click the "connect" button, start a thread, and then click the "cancel" button instead of giving my app the three finger salute if they grow impatient, heh;) In any case, I very much appreciate your help and time, I'll let you know what I figure out. I bet you're right about the version. Hopefully I can contribute something back to the list at some point. Again, sry for the sloppy code at the beginning :) madhtr ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 8/14/07, Phoenix Kiula <[EMAIL PROTECTED]> wrote: > > *And* you can define compound foreign key constraints, > Thank you for this detailed explanation Alban. But I want to include > FK constraints on a table2 on a column in the referenced table1 where > column values are not unique. > > I just want row data to be consistent for the same ID. Yes, this is > repetitive and enough to rile DB purists, but it has its uses (for > performance in certain reporting queries). I'm not sure I understand this. If the set of values you want to reference is not unique, what are the semantics of this reference? What should happen if one of those sets gets deleted, for instance? Perhaps you mean table1 has columns "id" and "col2", and the "col2" values are not unique, but "id" and "col2" together are? In that case you can simply put a UNIQUE constraint on them together, and use both columns in a single foreign key constraint from table2. > 1. Should I explore views for this? I am very skeptical about them > coming from MySQL as the performance of MySQL views is horrendous. > Besides, if they are updated everytime, there's little use for a view > in the first place, I may as well simply query the table -- or is this > wrong? The UPDATE only locks and commits to the table, and then the > view gets auto updated? A view is simply an alternate presentation of data in one or more tables. It's a persistent query, and the performance is effectively the same as the query itself. (More on "materialized views" below.) In regard to locking, it sounds like you're thinking in MYISAM terms; PostgreSQL uses MVCC instead of locking. Two UPDATEs attempted at the same time may cause one to wait for the other (if both touch the same rows), but during this time all readers (e.g. SELECT) will continue to run without waiting. If you were running into concurrent performance issues with MYISAM due to locking, PostgreSQL might surprise you. > 2. 
Or, I could do this with triggers, and now I realize also with > "rules" (CREATE RULE). Which are faster, rules or triggers, are they > similar in speed? Basically I want the rule/trigger to cascade the > update to table1.col1 and table1.col2 to similar columns in table2. They are simply different mechanisms. Rules rewrite a query during the parsing phase, and then run the eventual query tree as if you'd entered it yourself. Triggers are procedural actions in response to events. Normally you pick one based on the semantics of what you want to do. If you can use foreign keys as above, ON UPDATE CASCADE will probably do what you want without having to do anything else. > I will surely be exploring views, and reading more of this: > http://www.postgresql.org/docs/8.2/interactive/rules-views.html , but > I just wanted to know what the usual thoughts on this are. Are views > updated as soon as its underlying table(s) are updated? Can I control > the duration or timing of their update? I searched for "materialized > views", after having seen that word on the performance list, but most > of the search results and the discussions on that forum are beyond my > comprehension!! Yeah, you're confusing standard views with "materialized views". A standard view is basically a persistent query, and does not store any data itself. PostgreSQL implements them with rules. When you SELECT against a view, the query rewriter combines your conditions with the stored SELECT used to create the view, and runs the final query against the table(s) you created the view from, just as if you'd entered it yourself. If the original query against the table(s) is complex, a view makes a very nice way to simplify things for applications. If you want to create an "updateable view", where applications can INSERT/UPDATE/DELETE against the view itself, you need to add rules to it to translate operations on the presented view columns into operations on the table(s) behind them. 
A "materialized view" is basically a view that stores copies of its data. This is useful in situations where you need some kind of cache, possibly because the query behind the view takes a long time to complete. PostgreSQL does not have built in support for materialized views, but rules and triggers can be used to create them. The discussions you encountered revolve around the finer points of doing that. > Would appreciate any thoughts on performance of views. PGSQL seems to > treat views just like tables, so I wonder if there's any performance > gain! In general, a view is performance neutral: it's just a mechanism for simplifying presentation of data. It also has uses for security, since you can create a view and grant roles access to it while still denying them access to the underlying tables. This could be used to hide a particular table column, for instance. It doesn't sound like this is of any use for your application though. ---(end of broadcast)--- TIP 9: In versions below 8.0, the
Re: [GENERAL] Trigger not working as expected, first row gets a null value
I have been able to spend some more time working on this and have made a bit of progress. I had the trigger set to run before insert; and the sql inside the trigger needed to ref the new record in order to supply a value. I think that's why the first try always applied a null value. So I've changed the trigger to go after insert. Now, all of my trigger function variables are populated properly but the record is never updated. The function is meant to find a value via the internal sql, and then update a column in the same table that invoked the trigger with that value. Is that a non-workable scenario? IE, an after insert trigger can't modify a column in the same table that was updated? Martijn van Oosterhout wrote: > > On Wed, Aug 08, 2007 at 03:20:00PM -0700, novnov wrote: >> >> I know that would be best but it'd be a major undertaking to try to repro >> this situation. I was hoping for some hints, 'it sounds like xyz', >> because >> I'm pretty sure I'm just tripping over a commonplace issue. > > It doesn't have to be reproducible, but the definition of the tables > involves + the code of the trigger would help. I read your description > three times and still couldn't quite work out exactly what the problem > was or what you were expecting to happen... > > Have a nice day, > -- > Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ >> From each according to his ability. To each according to his ability to >> litigate. > > > -- View this message in context: http://www.nabble.com/Trigger-not-working-as-expected%2C-first-row-gets-a-null-value-tf4238812.html#a12154927 Sent from the PostgreSQL - general mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] more select-for-update questions
Ed L. wrote: > I ran a simple select-for-update test on 8.1.2 and was curious as > to why the semantics are what they are. Specifically, when you > have multiple select-for-update-limit queries on the same rows, > why are rows selected by the blocked query before knowing if > some of those rows will be removed/eliminated by a preceding > select-for-update-limit? This is how it is just because of "historical reasons", i.e. it's a known misfeature that no one has bothered to fix. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
"madhtr" <[EMAIL PROTECTED]> writes: > cleary I am going to have to study this documentation more carefully ... So > ... for simplicity's sake, If I just do the following, how do I get back > "database does not exist" ? [ shrug... ] Your program works perfectly for me: $ ./testit PQerrorMessage(lpcn) returns: FATAL: database "nonexistantdb" does not exist PQstatus(lpcn) returns 1 $ ... although it takes a good long while (several seconds) because of the "sleep(1)" in the interaction with the postmaster. Maybe your problem is not with the program, but with the postmaster you're trying to connect to? Does psql work? > source: http://www.postgresql.org/docs/7.3/static/libpq-connect.html Another line of thought, given the reading-between-the-lines conclusion that you are trying to use PG 7.3 libraries on Windows, is that there was something broken in the async-connect code back then on that platform. If you really are trying to do that, do yourself a favor and move to 8.0 or later. Nobody's going to be very interested in fixing 7.3. (I did try your program with 7.3 on Unix, though, and it seemed fine except the error message was spelled a bit differently.) > I was also under the assumption that I would not need to perform my own > selects on the underlying socket, and that whatever I got back would be > either a null pointer, a successful connection pointer, or a broken > connection pointer with an error indication. You don't *have* to perform selects on the underlying socket, but if you are not multiplexing this activity with some other I/O, I have to wonder why you are bothering with an asynchronous connect at all. What you've got at the moment is a poorly-implemented equivalent of PQconnectdb(). regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
I did make an error on the zero assumption, ty :) However, the reason PGRES_POLLING_FAILED and PGRES_POLLING_OK both break the loop is because of this: "If this call returns PGRES_POLLING_FAILED, the connection procedure has failed. If this call returns PGRES_POLLING_OK, the connection has been successfully made." source: http://www.postgresql.org/docs/7.3/static/libpq-connect.html

I was also under the assumption that I would not need to perform my own selects on the underlying socket, and that whatever I got back would be either a null pointer, a successful connection pointer, or a broken connection pointer with an error indication.

Clearly I am going to have to study this documentation more carefully ... So ... for simplicity's sake, if I just do the following, how do I get back "database does not exist"?

    #include <windows.h>
    #include <stdio.h>
    #include <libpq-fe.h>

    int main(int na, char** sa){
        char* host     = "localhost";
        unsigned short port = 5432;
        char* dbname   = "nonexistantdb";
        char* user     = "user";
        char* password = "pass";
        int e = 0;
        PGconn* lpcn = 0;
        bool keepon = true;
        char cs[1024];
        sprintf(cs, "host=%s port=%u dbname=%s user=%s password=%s",
                host, port, dbname, user, password);
        if (lpcn = PQconnectStart(cs)){
            while (keepon){
                switch(e = PQconnectPoll(lpcn)){
                    case PGRES_POLLING_FAILED:
                    case PGRES_POLLING_OK:
                        keepon = false;
                        break;
                };
                Sleep(1);
            };
            printf("PQerrorMessage(lpcn) returns:\n\n%s\n\nPQstatus(lpcn) returns %d\n",
                   PQerrorMessage(lpcn), PQstatus(lpcn));
            PQfinish(lpcn);
        } else
            printf("I am assuming we are out of memory ...\n");
        return e;
    };

- Original Message - From: "Tom Lane" <[EMAIL PROTECTED]> To: "madhtr" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, August 14, 2007 15:53 Subject: Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll "madhtr" <[EMAIL PROTECTED]> writes: ... here's the source ... can u tell me whats wrong? Well, your usage of "pge" seems fairly broken, in particular the random (and wrong) assumptions about which values are or are not zero.
AFAICT this code doesn't really distinguish between PGRES_POLLING_FAILED and PGRES_POLLING_OK. And if it does return failure, there's no way for the caller to know which enum type the failure code belongs to. You didn't show us the code that is actually reporting the error, but I wonder whether it isn't equally confused about how to determine what the error is. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] MVCC cons
"Kenneth Downs" <[EMAIL PROTECTED]> writes: > RPK wrote: >> I want to know whether MVCC has cons also. Is it heavy on resources? How >> PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Well the fundamental con of MVCC versus serializing everything using locks is that you have to keep more data around which takes space and i/o resources. > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and > update) > appears on my systems to be almost exactly linear to row size. Inserting 1000 > rows into a table with row size 100 characters takes twice as long as > inserting > 1000 rows into a table with row size 50 characters. Well, that's going to be true (assuming i/o is your limiting factor at all) regardless of your architecture. There's no magic here, more bytes take more blocks of space which take more time to write or read. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] MVCC cons
> On 08/14/07 14:34, Kenneth Downs wrote: > > Tom Lane wrote: > >> Kenneth Downs <[EMAIL PROTECTED]> writes: > >> > >>> Speaking as an end-user, I can give only one I've ever seen, which is > >>> performance. Because of MVCC, Postgres's write performance (insert > >>> and update) appears on my systems to be almost exactly linear to row > >>> size. Inserting 1000 rows into a table with row size 100 characters > >>> takes twice as long as inserting 1000 rows into a table with row size > >>> 50 characters. > >>> > >> > >> Not sure why you'd think that's specific to MVCC. It sounds like purely > >> an issue of disk write bandwidth. > >> > >> regards, tom lane > >> > > > > I did not see this in MS SQL Server. > > It is only logical that it will take 2x as long to insert 2x as much > data. > > Maybe SQL Server is compressing out white space? Or (shudder) > heavily caching writes?

There's no SQL Server magic. It doesn't compress whitespace or cache writes in any scary way. Doubling the row width does double the insert time. On SQL Server 2000 sp4:

Setup via:

    create database test_db
    use test_db
    create table t50 ( f1 char(50) )
    create table t100 ( f1 char(100) )

Test 1:

    declare @start datetime
    select @start = getdate()
    begin transaction
    insert into t50 ( f1 ) values ( '01234567890123456789012345678901234567890123456789' );
    -- Repeat above insert 1000 times
    commit transaction
    select datediff(ms, @start, getdate())

Test 2:

    declare @start datetime
    select @start = getdate()
    begin transaction
    insert into t100 ( f1 ) values ( '0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789' );
    -- Repeat above insert 1000 times
    commit transaction
    select datediff(ms, @start, getdate())

On my system, test 1 averages around 16ms over 100 tests. Test 2 averages around 33ms over 100 tests. I would wager my week's coffee change that the same outcome holds on SQL 2005 sp2. Paul ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Persistent connections in PHP
Pconnects are absolutely necessary if we use tsearch2, because it initializes its dictionaries on a first query in a session. It's a very heavy process (500 ms and more). So, if we do not use pconnect, we waste about 500 ms on each DB connection. Too much pain. Or, of course, pconnect may be replaced with pgbouncer. It's even better. On 8/13/07, Scott Marlowe <[EMAIL PROTECTED]> wrote: > > On 8/13/07, Josh Trutwin <[EMAIL PROTECTED]> wrote: > > On Mon, 13 Aug 2007 09:44:26 -0500 > > Erik Jones <[EMAIL PROTECTED]> wrote: > > > > > I'll agree with Scott on this one. (Not that I can recall > > > specifically ever disagreeing with him before...). Unless you > > > know all of the potential caveats associated with php's persisent > > > postgres connections and have a use case that fits them, don't use > > > them. If you need something to pool connections, look at pgpool. > > > > Could elaborate a little on the problems with using php's persistent > > connections? > > > > Personally I use ADODB php abstraction library (adodb.sf.net) for my > > database stuff and I think there's a way to enable persistent > > connections though I just use the default connection. > > > > I've heard before that php's persistent connections are to be > > avoided, was just curious as to why though? > > OK, there are a few things that gather together to make php's > persistant connections a problem. > > 1: Each apache / php process maintains its own connections, not > sharing with others. So it's NOT connection pooling, but people tend > to think it is. > 2: Each unique connection creates another persistent connection for > an apache/php child process. If you routinely connect to multiple > servers / databases or as > 1 user, then each one of those > combinations that is unique makes another persistent connection. > 3: There's no facility in PHP to clean an old connection out and make > sure it's in some kind of consistent state when you get it. 
It's in > exactly the same state it was when the previous php script finished > with it. Half completed transactions, partial sql statements, > sequence functions like currval() may have values that don't apply to > you. > 4: pg_close can't close a persistent connection. Once it's open, it > stays open until the child process is harvested. > 5: Apache, by default, is configured for 150 child processes. > Postgresql, and many other databases for that matter, are configured > for 100 or less. Even if apache only opens one connection to one > database with one user account, it will eventually try to open the > 101st connection to postgresql and fail. So, the default > configuration of apache / postgresql for number of connections is > unsafe for pconnect. > 6: The reason for connection pooling is primarily twofold. One is > to allow very fast connections to your database when doing lots of > small things where connection time will cost too much. The other is > to prevent your database from having lots of stale / idle connections > that cause it to waste memory and to be slower since each backend > needs to communicate with every other backend some amount of data > sometimes. pconnect takes care of the first problem, but exacerbates the > second. > > P.s. don't think I'm dogging PHP, cause I'm not. I use it all the > time, and it's really great for simple small scripts that need to be > done NOW and need to be lightweight. I even use pconnect a bit. But > my machine is set for 50 or fewer apache children and 150 postgresql > connects, and I only use pconnect on small, lightweight things that > need to zoom. Everything else gets regular old connect. > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster >
Re: [GENERAL] Creating a row during a select
Try to read about CREATE RULE in the documentation. On 8/14/07, Michal Paluchowski <[EMAIL PROTECTED]> wrote: > > > Hi, > > is there a way to have PostgreSQL insert a new row into a table during > a SELECT query if no row is found by that query? > > > -- > Best regards, > Michal mailto:[EMAIL PROTECTED] > > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq >
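[Editor's illustration] One caveat: rules fire on INSERT/UPDATE/DELETE (SELECT rules exist only to implement views), so a plain SELECT on a table cannot itself insert a row. A common alternative is to wrap the lookup in a function that inserts on a miss; a sketch with hypothetical names:

```sql
CREATE TABLE items (id serial PRIMARY KEY, key text UNIQUE);

CREATE FUNCTION get_or_create(p_key text) RETURNS integer AS $$
DECLARE
    v_id integer;
BEGIN
    SELECT id INTO v_id FROM items WHERE key = p_key;
    IF NOT FOUND THEN
        -- Not concurrency-safe as written: two sessions racing on the
        -- same key can collide on the UNIQUE constraint and need a retry.
        INSERT INTO items (key) VALUES (p_key);
        SELECT id INTO v_id FROM items WHERE key = p_key;
    END IF;
    RETURN v_id;
END;
$$ LANGUAGE plpgsql;

-- SELECT get_or_create('foo');  inserts on the first call, reuses after
```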
Re: [GENERAL] Compound Indexes
[EMAIL PROTECTED] ("Phoenix Kiula") writes: > I have a table with ten columns. My queries basically one column as > the first WHERE condition, so an index on that column is certain. But > the columns after that one vary depending on end-user's choice (this > is a reporting application) and so does the sorting order. > > In MySQL world, I had sort_buffer in the config file, and I made a > compound index with the columns most often used in these types of > queries. So my index looked like: > > INDEX idx_trades(id, t_id, c_id, s_id, t_brief, created_on); > > This has five columns in it. While reading the pgsql documentation, I > gather than anything beyond three columns offers diminishing benefits. > > My queries will look like these: > >SELECT * from trades where id = 9 >and c_id = >ORDER by s_id; > >SELECT * from trades where id = 9 >and s_id = 0 >ORDER by created_on desc ; > >SELECT * from trades where id = 9 >and s_id = 0 >and t_brief ~* 'more|than|one|word' >ORDER by created_on desc ; > > So my question: how does PGSQL optimize its sorts? If I were to index > merely the columns that are most commonly used in the reporting WHERE > clause, would that be ok? Some ofthese columns may be "TEXT" type -- > how should I include these in the index (in MySQL, I included only the > first 100 words in the index). If you have only these three sorts of queries, then I would speculate that the following indices *might* be useful: create idx1 on trades (id); create idx2 on trades (c_id); create idx3 on trades (s_id); create idx4 on trades (created_on); create idx5 on trades (created_on) where t_brief ~* 'more|than|one|word'; create idx6 on trades (id, s_id) where t_brief ~* 'more|than|one|word'; (I'm assuming with idx5 and idx6 that you were actually searching for 'more|than|one|word'; if what is searched for can vary, then idx5/idx6 are worthless.) You could try adding them all, and check out which of them are actually used by the query planner. 
And eventually drop the irrelevant ones.

PostgreSQL has a rather sophisticated query planner (pretty much "rocket science," compared to MySQL), and it is even possible that it would use multiple of those indices simultaneously for some of the queries. Which indexes, if any, it will use will vary from query to query based on the parameters in the query.

You can determine the query plan by prefixing the query with the keyword "EXPLAIN." Supposing you add the above six indexes, you could get query plans by running the following:

EXPLAIN SELECT * from trades where id = 9
and c_id = 
ORDER by s_id;

EXPLAIN SELECT * from trades where id = 9
and s_id = 0
ORDER by created_on desc ;

EXPLAIN SELECT * from trades where id = 9
and s_id = 0
and t_brief ~* 'more|than|one|word'
ORDER by created_on desc ;

You may want to post the output to the list; learning to read query planner output is a bit of an art, and you won't necessarily make the right sense of the results on day #1...
-- 
output = ("cbbrowne" "@" "cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
Consciousness - that annoying time between naps.
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] MVCC cons
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/14/07 14:34, Kenneth Downs wrote: > Tom Lane wrote: >> Kenneth Downs <[EMAIL PROTECTED]> writes: >> >>> Speaking as an end-user, I can give only one I've ever seen, which is >>> performance. Because of MVCC, Postgres's write performance (insert >>> and update) appears on my systems to be almost exactly linear to row >>> size. Inserting 1000 rows into a table with row size 100 characters >>> takes twice as long as inserting 1000 rows into a table with row size >>> 50 characters. >>> >> >> Not sure why you'd think that's specific to MVCC. It sounds like purely >> an issue of disk write bandwidth. >> >> regards, tom lane >> > > I did not see this in MS SQL Server. It is only logical that it will take 2x as long to insert 2x as much data. Maybe SQL Server is compressing out white space? Or (shudder) heavily caching writes? - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGwhXTS9HxQb37XmcRAmdTAJ4rpK60hNtcvT82gCD4RG4EPtcC2wCeNR/C poURsgchjku2UC0y476KOfM= =KVNY -END PGP SIGNATURE- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
"madhtr" <[EMAIL PROTECTED]> writes: > ... here's the source ... can u tell me whats wrong? Well, your usage of "pge" seems fairly broken, in particular the random (and wrong) assumptions about which values are or are not zero. AFAICT this code doesn't really distinguish between PGRES_POLLING_FAILED and PGRES_POLLING_OK. And if it does return failure, there's no way for the caller to know which enum type the failure code belongs to. You didn't show us the code that is actually reporting the error, but I wonder whether it isn't equally confused about how to determine what the error is. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] PgAdmin .sql default handler
On 14.08.2007, at 17:13, brian wrote: Select a .sql file, right-click (ctrl-click) and choose "Get Info" (or just select the file and hit CMD-I). Expand the "Open With ..." tab, choose "Other" and select the app you want to handle these from your Applications directory. brian i am not such a noob :) what i mean is that every time i restart PgAdmin it again grabs the default application handling right for .sql files Krsto Kaiv http://kaiv.wordpress.com (PostgreSQL blog)
[GENERAL] query help
I need some help with rewriting a query. I have a fairly complicated query (for me anyway) that dumps daily climate data, filling in missing data with monthly averages (one line per day). I want to output monthly averages (one line per month). I am having a hard time wrapping my head around this. Particularly how to deal with the doy column (day of year). I have tried several approaches and my forehead is starting to get my keyboard bloody. Thanks in advance for any suggestions. Here is the daily query:

SELECT CASE
         WHEN w.station_id = site_near.station_id THEN w.obs_id
         ELSE s.obs_id
       END AS obs_id,
       site_near.station_id,
       site_near.longname,
       w.year,
       w.doy,
       --replace missing values (-999) with the monthly average
       CASE w.tmax WHEN -999 THEN avgtmax.avg ELSE w.tmax END,
       CASE w.tmin WHEN -999 THEN avgtmin.avg ELSE w.tmin END,
       CASE s.par WHEN -999 THEN avgpar.avg ELSE s.par END,
       CASE w.precip WHEN -999 THEN avgprecip.avg ELSE w.precip END
FROM site_near
     INNER JOIN solar s
        ON (site_near.ref_solar_station_id = s.station_id
            AND site_near.obs_year = s.year)
     INNER JOIN weather w
        ON (site_near.ref_weather_station_id = w.station_id
            AND site_near.obs_year = w.year
            AND s.date = w.date)
     INNER JOIN (SELECT month, round(avg(tmax)::numeric, 2) AS avg
                 FROM weather
                 WHERE tmax != -999
                 GROUP BY month) AS avgtmax
        ON (w.month = avgtmax.month)
     INNER JOIN (SELECT month, round(avg(tmin)::numeric, 2) AS avg
                 FROM weather
                 WHERE tmin != -999
                 GROUP BY month) AS avgtmin
        ON (w.month = avgtmin.month)
     INNER JOIN (SELECT month, round(avg(par)::numeric, 2) AS avg
                 FROM solar
                 WHERE par != -999
                 GROUP BY month) AS avgpar
        ON (s.month = avgpar.month)
     INNER JOIN (SELECT month, round(avg(precip)::numeric, 2) AS avg
                 FROM weather
                 WHERE precip != -999
                 GROUP BY month) AS avgprecip
        ON (w.month = avgprecip.month)
--select station to output climate data by id number
WHERE w.station_id = 219101
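Since the goal is one line per month, one possible direction is to aggregate the daily tables directly, using NULLIF so that avg() skips the -999 sentinel values. A rough, untested sketch reusing the column names from the query above (solar.par would join in the same way as in the daily query, and doy simply drops out of a monthly roll-up):

```sql
SELECT w.year,
       w.month,
       round(avg(NULLIF(w.tmax,   -999))::numeric, 2) AS avg_tmax,
       round(avg(NULLIF(w.tmin,   -999))::numeric, 2) AS avg_tmin,
       round(avg(NULLIF(w.precip, -999))::numeric, 2) AS avg_precip
FROM weather w
WHERE w.station_id = 219101
GROUP BY w.year, w.month
ORDER BY w.year, w.month;
```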
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
ok thanx:) ... here's the source ... can u tell me whats wrong? (the purpose of this function is to allow for thread safety on the connection, and allow for cancellation if the connection takes too long)

BTW ...
- coninfo is "host=localhost port=5432 dbname=testdb user=localuser password=localpassword"
- I am using VC++ and compiling a windows executable ...

PGconn* PQconnectStartCS(const char* coninfo, LPCRITICAL_SECTION lpcs, bool* lpcancel, int* lppge)
{
    int& pge = *lppge;
    bool& cancel = *lpcancel;
    bool keepon = true;
    PGconn* pr = 0;

    pge = 0;
    EnterCriticalSection(lpcs);
    pr = PQconnectStart(coninfo);
    while (!killthread(&cancel) && keepon) {
        switch (pge = PQconnectPoll(pr)) {
        case PGRES_POLLING_FAILED:
            keepon = false;
            break;
        case PGRES_POLLING_OK:
            pge = 0;
            keepon = false;
            break;
        default:
            break;
        }
        if (keepon) Sleep(1);
    }
    LeaveCriticalSection(lpcs);

    if (!pge && pr) {
        switch (pge = PQstatus(pr)) {
        case CONNECTION_OK:
            pge = 0;
            break;
        }
    }
    return pr;
}

- Original Message -
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "madhtr" <[EMAIL PROTECTED]>
Cc:
Sent: Tuesday, August 14, 2007 14:36
Subject: Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll

"madhtr" <[EMAIL PROTECTED]> writes:
> when i intentionally try to connect asynchronously to a database that does
> not exist, i get "server closed the connection unexpectedly"

There's something wrong with your code then.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.
Don't forget to CC: the list.

Andrew Edson wrote:
I apologize; You are correct in that I mistyped my original structure. Here is the information for the correct explain and explain analyze statements.

attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';

No need for the simple explain - explain analyse includes all the information.

attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
                                 QUERY PLAN

 Bitmap Heap Scan on ptrans  (cost=1223.86..149853.23 rows=85741 width=21) (actual time=2302.363..70321.838 rows=6701655 loops=1)
   ->  Bitmap Index Scan on ptrans_cid_trandt_idx  (cost=0.00..1223.86 rows=85741 width=0) (actual time=2269.064..2269.064 rows=204855 loops=1)
 Total runtime: 89854.843 ms

Well, it's taking 90 seconds to return 6.7 million rows. Depending on your system and memory settings, that might not be unreasonable. It *is* getting the estimate of returned rows wrong (it thinks 85,741 will match), which is hugely out of line. Is there something odd with this table/column, or haven't you analysed recently? How many unique values does rcrd_cd have, and how many rows does the table have?

I don't know that you'll get this down to sub-second responses though, not if you're trying to return 6 million rows from an even larger table.

-- 
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] Moving to postgresql and some ignorant questions
"Phoenix Kiula" <[EMAIL PROTECTED]> writes: > There are some cases where I would like to bunch queries into a > transaction purely for speed purposes, but they're not interdependent > for integrity. E.g., How do you know you need to do this for speed if you haven't run it yet? I would suggest you build your application around the application needs first, then later look at how to optimize it. Remember the two rules of optimization: 1) Don't 2) (for experts only) Don't yet The only case where you should consider batching together transactions like that is if you're processing a batch data load of some kind. In that case you have a large volume of updates and they're all single-threaded. But usually in that case you want to abort the whole load if you have a problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
"madhtr" <[EMAIL PROTECTED]> writes: > when i intentioally try to connect asynchronously to a database that does > not exist, i get > "server closed the connection unexpectedly" There's something wrong with your code then. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] MVCC cons
Tom Lane wrote: Kenneth Downs <[EMAIL PROTECTED]> writes: Speaking as an end-user, I can give only one I've ever seen, which is performance. Because of MVCC, Postgres's write performance (insert and update) appears on my systems to be almost exactly linear to row size. Inserting 1000 rows into a table with row size 100 characters takes twice as long as inserting 1000 rows into a table with row size 50 characters. Not sure why you'd think that's specific to MVCC. It sounds like purely an issue of disk write bandwidth. regards, tom lane I did not see this in MS SQL Server. -- Kenneth Downs Secure Data Software, Inc. www.secdat.comwww.andromeda-project.org 631-689-7200 Fax: 631-689-0527 cell: 631-379-0010
[GENERAL] Compound Indexes
I have a table with ten columns. My queries basically use one column as the first WHERE condition, so an index on that column is certain. But the columns after that one vary depending on the end-user's choice (this is a reporting application) and so does the sorting order.

In the MySQL world, I had sort_buffer in the config file, and I made a compound index with the columns most often used in these types of queries. So my index looked like:

INDEX idx_trades(id, t_id, c_id, s_id, t_brief, created_on);

This has six columns in it. While reading the pgsql documentation, I gather that anything beyond three columns offers diminishing benefits.

My queries will look like these:

SELECT * from trades where id = 9
and c_id = 
ORDER by s_id;

SELECT * from trades where id = 9
and s_id = 0
ORDER by created_on desc ;

SELECT * from trades where id = 9
and s_id = 0
and t_brief ~* 'more|than|one|word'
ORDER by created_on desc ;

So my question: how does PGSQL optimize its sorts? If I were to index merely the columns that are most commonly used in the reporting WHERE clause, would that be ok? Some of these columns may be "TEXT" type -- how should I include these in the index (in MySQL, I included only the first 100 words in the index).

TIA!

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [GENERAL] MVCC cons
Kenneth Downs <[EMAIL PROTECTED]> writes: > Speaking as an end-user, I can give only one I've ever seen, which is > performance. Because of MVCC, Postgres's write performance (insert and > update) appears on my systems to be almost exactly linear to row size. > Inserting 1000 rows into a table with row size 100 characters takes > twice as long as inserting 1000 rows into a table with row size 50 > characters. Not sure why you'd think that's specific to MVCC. It sounds like purely an issue of disk write bandwidth. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] MVCC cons
On 8/14/07, RPK <[EMAIL PROTECTED]> wrote: > > I want to know whether MVCC has cons also. Is it heavy on resources? How > PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Of course it does. There ain't no such thing as a free lunch, after all. PostgreSQL's mvcc implementation means that a row that gets updated a lot may have many dead rows in the database, and if you don't run vacuum often enough, or have enough space allocated in your free space map, your tables can become bloated. In a worst case scenario, a highly updated table may get so big that normal vacuuming cannot salvage it and you would have to either reindex or perform a vacuum full on it. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
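The maintenance cycle described above can be sketched as follows (table name hypothetical):

```sql
VACUUM ANALYZE trades;   -- routine: reclaims dead row versions, updates planner stats
VACUUM FULL trades;      -- last resort for a badly bloated table; takes an exclusive lock
REINDEX TABLE trades;    -- rebuilds indexes that have bloated along with the table
```

Routine VACUUM keeps the free space map populated so new row versions reuse dead space; the heavier forms are only needed once bloat has gotten out of hand.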
[GENERAL] pqlib in c++: PQconnectStart PQconnectPoll
when i intentionally try to connect asynchronously to a database that does not exist, i get "server closed the connection unexpectedly"

My intention is to create the database if it does not exist ... Is there any way to retrieve the actual error so i can know when to create the database?

thanx:)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Moving to postgresql and some ignorant questions
> You could do this with savepoints which are a kind of sub-transaction inside a
> "bigger" transaction.
>
> e.g.:
> BEGIN TRANSACTION;
>
> SAVEPOINT sp1;
> UPDATE1;
> IF (failed) rollback to savepoint sp1;
>
> SAVEPOINT sp2;
> UPDATE2;
> IF (failed) rollback to savepoint sp2;

Thanks Thomas, this is a great feature even if I am not looking for it right now, I bet I can use it at some point!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [GENERAL] MVCC cons
RPK wrote: I want to know whether MVCC has cons also. Is it heavy on resources? How PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. Speaking as an end-user, I can give only one I've ever seen, which is performance. Because of MVCC, Postgres's write performance (insert and update) appears on my systems to be almost exactly linear to row size. Inserting 1000 rows into a table with row size 100 characters takes twice as long as inserting 1000 rows into a table with row size 50 characters. This tends to be more of an issue for me because my system materializes derived information in columns, so my tables are more fat to begin with, and so this hits me harder. -- Kenneth Downs Secure Data Software, Inc. www.secdat.comwww.andromeda-project.org 631-689-7200 Fax: 631-689-0527 cell: 631-379-0010 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 8/14/07, Bill Moran <[EMAIL PROTECTED]> wrote: > > But the rule is, if any query within the transaction errors, then all > queries > within the transaction are rolled back. This is the default behaviour, but with psql and ON_ERROR_ROLLBACK parameter the behaviour can be changed. See http://www.postgresql.org/docs/8.2/interactive/app-psql.html Regards MP
Re: [GENERAL] Moving to postgresql and some ignorant questions
Phoenix Kiula wrote on 14.08.2007 19:46:

There are some cases where I would like to bunch queries into a transaction purely for speed purposes, but they're not interdependent for integrity. E.g.,

BEGIN TRANSACTION;
UPDATE1;
UPDATE2;
UPDATE3;
COMMIT;

If UPDATE2 fails because it, say, violates a foreign key constraint, then so be it. I want UPDATE3 to go ahead. Is this not possible, or is there an option I can use to do these kind of independent-query transactions?

You could do this with savepoints, which are a kind of sub-transaction inside a "bigger" transaction.

e.g.:
BEGIN TRANSACTION;

SAVEPOINT sp1;
UPDATE1;
IF (failed) rollback to savepoint sp1;

SAVEPOINT sp2;
UPDATE2;
IF (failed) rollback to savepoint sp2;

COMMIT;

Details here: http://www.postgresql.org/docs/8.2/static/sql-savepoint.html

But I doubt that this would be faster than doing a transaction per update.

Thomas

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org/
[GENERAL] MVCC cons
I want to know whether MVCC has cons also. Is it heavy on resources? How PGSQL MVCC relates with SQL Server 2005 new Snapshot Isolation. -- View this message in context: http://www.nabble.com/MVCC-cons-tf4268841.html#a12149505 Sent from the PostgreSQL - general mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] Creating a row during a select
Hi, is there a way to have PostgreSQL insert a new row into a table during a SELECT query if no row is found by that query? -- Best regards, Michal mailto:[EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Moving to postgresql and some ignorant questions
In response to "Phoenix Kiula" <[EMAIL PROTECTED]>: > > If you issue a BEGIN then nothing gets committed until you issue a COMMIT. > > If > > anything happens in the meantime then everything you've done since the BEGIN > > disappears. > > There are some cases where I would like to bunch queries into a > transaction purely for speed purposes, but they're not interdependent > for integrity. E.g., > > BEGIN TRANSACTION; > UPDATE1; > UPDATE2; > UPDATE3; > COMMIT; > > If UPDATE2 fails because it, say, violates a foreign key constraint, > then so be it. I want UPDATE3 to go ahead. Is this not possible, or is > there an option I can use to do these kind of independent-query > transactions? That's not possible, by design. However, your application could keep track of which queries have run, and if one fails, start the transaction over without the failing query. But the rule is, if any query within the transaction errors, then all queries within the transaction are rolled back. -- Bill Moran http://www.potentialtech.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Moving to postgresql and some ignorant questions
> If you issue a BEGIN then nothing gets committed until you issue a COMMIT. If > anything happens in the meantime then everything you've done since the BEGIN > disappears. > There are some cases where I would like to bunch queries into a transaction purely for speed purposes, but they're not interdependent for integrity. E.g., BEGIN TRANSACTION; UPDATE1; UPDATE2; UPDATE3; COMMIT; If UPDATE2 fails because it, say, violates a foreign key constraint, then so be it. I want UPDATE3 to go ahead. Is this not possible, or is there an option I can use to do these kind of independent-query transactions? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.
Andrew Edson wrote: The following is a copy of my index creation statement, index name, and explain and explain analyze output on the statement I was trying to run. Would someone please help me figure out what I'm doing wrong here? > attest=# create index ptrans_cid_trandt_idx on ptrans(cntrct_id, tran_dt) where rcrd_cd = '0A'; attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = 0; attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = 0; The index has a where clause that doesn't match your query. It wouldn't use the index anyway - you're not filtering or sorting on it. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
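To illustrate the mismatch described above: a partial index can only be considered when the query's WHERE clause implies the index's predicate, e.g.:

```sql
-- Matches the index predicate (rcrd_cd = '0A'), so the planner may use it:
SELECT cntrct_id, tran_dt FROM ptrans WHERE rcrd_cd = '0A';

-- Compares against the integer 0 instead, so the partial index cannot apply
-- and the planner falls back to a sequential scan:
SELECT cntrct_id, tran_dt FROM ptrans WHERE rcrd_cd = 0;
```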
Re: [GENERAL] Moving to postgresql and some ignorant questions
> You're confusing CHECK constraints and FOREIGN KEY constraints. They're > different things ;) > > CHECK constraints verify that data in a certain column matches a certain > condition. I'm not sure they can reference columns in other tables, > unless you wrap those checks in stored procedures maybe... > > For example: > CREATE TABLE test ( > age int NOT NULL CHECK (age > 0) > ); > > Next to that, you can define DOMAINs - basically your own customized > data types that can follow _your_ rules. Admittedly I have never done > that yet, but it's supposed to be one of the key features of the > relational model (I've seen claims that you're actually not supposed to > use the base types, but define domains for all your data types). > > *And* you can define compound foreign key constraints, > for example: > > CREATE TABLE employee ( > employee_id serial NOT NULL, > company_id integer NOT NULL, > name text NOT NULL, > CONSTRAINT employee_pk > PRIMARY KEY (employee_id, company_id) > ); > > CREATE TABLE division ( > employee_id integer NOT NULL, > company_id integer NOT NULL, > name text NOT NULL, > CONSTRAINT division_fk > FOREIGN KEY (employee_id, company_id) > REFERENCES employee > ON DELETE SET NULL > ON UPDATE CASCADE > ); > > > Also a nice trick, when performing DDL statements (CREATE TABLE and > friends), you can wrap them in a transaction and commit (or rollback) if > you like the end result (or not). I believe the only exception to that > rule is CREATE DATABASE. Thank you for this detailed explanation Alban. But I want to include FK constraints on a table2 on a column in the referenced table1 where column values are not unique. I just want row data to be consistent for the same ID. Yes, this is repetitive and enough to rile DB purists, but it has its uses (for performance in certain reporting queries). Related Questions: 1. Should I explore views for this? I am very skeptical about them coming from MySQL as the performance of MySQL views is horrendous. 
Besides, if they are updated every time, there's little use for a view in the first place; I may as well simply query the table -- or is this wrong? The UPDATE only locks and commits to the table, and then the view gets auto-updated?

2. Or, I could do this with triggers, and now I realize also with "rules" (CREATE RULE). Which are faster, rules or triggers, or are they similar in speed? Basically I want the rule/trigger to cascade an update of table1.col1 and table1.col2 to similar columns in table2.

I will surely be exploring views, and reading more of this: http://www.postgresql.org/docs/8.2/interactive/rules-views.html , but I just wanted to know what the usual thoughts on this are. Are views updated as soon as their underlying table(s) are updated? Can I control the duration or timing of their update? I searched for "materialized views", after having seen that word on the performance list, but most of the search results and the discussions on that forum are beyond my comprehension!!

Would appreciate any thoughts on performance of views. PGSQL seems to treat views just like tables, so I wonder if there's any performance gain!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
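For reference, a plain view in PostgreSQL is just a stored query that is rewritten into each SELECT that uses it, so it always reflects the current table data and there is no refresh step to schedule. A minimal sketch (names hypothetical, echoing the table1 example above):

```sql
CREATE VIEW table1_view AS
    SELECT col1, col2
    FROM table1;

-- Selecting from the view re-runs the stored query against table1,
-- so any committed UPDATE is visible immediately:
SELECT * FROM table1_view;
```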
[GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.
I've been working on a db project intended to perform modifications to one db based on the data stored in another one. Specifically, I'm supposed to check a pair of tables based on two criteria; an id field, and a timestamp. This will be cross-referenced with the second database; the timestamps will be used to update timestamps on anything which resides in the table of interest there.

I was running the sequence through perl; with 76 records in the test copy of the second database, I was getting a four, five minute run. Not really bad, but I'm expecting the number of records in the second db to hit several thousand in production, so I thought I'd see if I could speed things up by adding an index on what I'm searching for. After about an hour of run time, I killed the program and started looking into things.

The following is a copy of my index creation statement, index name, and explain and explain analyze output on the statement I was trying to run. Would someone please help me figure out what I'm doing wrong here?

> attest=# create index ptrans_cid_trandt_idx on ptrans(cntrct_id, tran_dt) where rcrd_cd = '0A';
>
> "ptrans_cid_trandt_idx" btree (cntrct_id, tran_dt) WHERE rcrd_cd = '0A'::bpchar
>
> attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = 0;
>                           QUERY PLAN
>
>  Seq Scan on ptrans  (cost=0.00..426034.67 rows=82443 width=21)
>    Filter: ((rcrd_cd)::text = '0'::text)
> (2 rows)
>
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = 0;
>                           QUERY PLAN
> --
>  Seq Scan on ptrans  (cost=0.00..426034.67 rows=82443 width=21) (actual time=60585.740..60585.740 rows=0 loops=1)
>    Filter: ((rcrd_cd)::text = '0'::text)
>  Total runtime: 60585.797 ms
> (3 rows)
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 8/14/07, Alban Hertroys <[EMAIL PROTECTED]> wrote: > *And* you can define compound foreign key constraints, > for example: > > CREATE TABLE employee ( > employee_id serial NOT NULL, > company_id integer NOT NULL, > name text NOT NULL, > CONSTRAINT employee_pk > PRIMARY KEY (employee_id, company_id) > ); > > CREATE TABLE division ( > employee_id integer NOT NULL, > company_id integer NOT NULL, > name text NOT NULL, > CONSTRAINT division_fk > FOREIGN KEY (employee_id, company_id) > REFERENCES employee > ON DELETE SET NULL > ON UPDATE CASCADE > ); You can also have multiple foreign keys to different tables, and to non-primary keys, as long as they are pointing to columns with a unique constraint on them. > Also a nice trick, when performing DDL statements (CREATE TABLE and > friends), you can wrap them in a transaction and commit (or rollback) if > you like the end result (or not). I believe the only exception to that > rule is CREATE DATABASE. One of my all time favorite features of pgsql. create tablespace is also non-transactable. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Moving to postgresql and some ignorant questions
Phoenix Kiula wrote:
> Agree. That's what I am trying to do. Include as many UPDATEs etc into
> the same TRANSACTION block, but my worry is when I read about
> autocommit and how it is enabled by default in postgresql 8.
> Specifying explicit BEGIN and COMMIT blocks should only commit when I
> want the DB to, or will each UPDATE in the middle of this block get
> executed?

The database doesn't have autocommit, AFAIK. Some of the clients have, but it only applies if you don't put your SQL statements between BEGIN; and COMMIT; (or ROLLBACK;) statements. I never really use anything but psql for a client, so I can't really say how other clients (pgadmin fe.) handle this.

>> Incidentally, most data integrity checks are handled with CHECK constraints
>> and FOREIGN KEY constraints rather than manual triggers. They're both easier
>> and cheaper.
>
> The problem with simple CHECK constraints is that they can only
> reference the primary key in another table. What if I want more than
> one column to be the same as the referenced table, but do not want to
> have a compound primary key in the referenced table? From reading and
> re-reading the manual, I don't think FKs allow for this. Only primary
> key references are supported.

You're confusing CHECK constraints and FOREIGN KEY constraints. They're different things ;)

CHECK constraints verify that data in a certain column matches a certain condition. I'm not sure they can reference columns in other tables, unless you wrap those checks in stored procedures maybe...

For example:

CREATE TABLE test (
    age int NOT NULL CHECK (age > 0)
);

Next to that, you can define DOMAINs - basically your own customized data types that can follow _your_ rules. Admittedly I have never done that yet, but it's supposed to be one of the key features of the relational model (I've seen claims that you're actually not supposed to use the base types, but define domains for all your data types).
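A minimal sketch of the DOMAIN idea mentioned above (names hypothetical):

```sql
-- The CHECK travels with the type, so every column declared as
-- positive_int enforces it without repeating the constraint.
CREATE DOMAIN positive_int AS integer
    CHECK (VALUE > 0);

CREATE TABLE test2 (
    age positive_int NOT NULL   -- rejects zero and negative values
);
```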
*And* you can define compound foreign key constraints, for example:

CREATE TABLE employee (
    employee_id serial NOT NULL,
    company_id integer NOT NULL,
    name text NOT NULL,
    CONSTRAINT employee_pk
        PRIMARY KEY (employee_id, company_id)
);

CREATE TABLE division (
    employee_id integer NOT NULL,
    company_id integer NOT NULL,
    name text NOT NULL,
    CONSTRAINT division_fk
        FOREIGN KEY (employee_id, company_id)
        REFERENCES employee
        ON DELETE SET NULL
        ON UPDATE CASCADE
);

Also a nice trick: when performing DDL statements (CREATE TABLE and friends), you can wrap them in a transaction and commit (or rollback) if you like the end result (or not). I believe the only exception to that rule is CREATE DATABASE.

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Moving to postgresql and some ignorant questions
"Phoenix Kiula" <[EMAIL PROTECTED]> writes: > If I am reading this right, does this mean it is probably better to > leave fsync as "fsync=off" on production machines? Only if you can reconstruct your data from other sources in the case of a server crash or power failure. I wouldn't recommend it. > Also, is COMMIT automatic for my queries? In some minor testing I did (not > scientific I did at all) some queries through Perl did not update the > database at all. I had "fsync=off" in my conf file, and there was no COMMIT > etc in my SQL, just plain SQL. So I am a bit confused. What's a good > starting point? psql runs in autocommit mode by default. If you want multiple queries in a transaction you have to issue a BEGIN statement. Drivers may do various things by default. >> Grouping more work into a single transaction makes the delay for the fsync at >> COMMIT time less of a problem. > > > Agree. That's what I am trying to do. Include as many UPDATEs etc into > the same TRANSACTION block, but my worry is when I read about > autocommit and how it is enabled by default in postgresql 8. > Specifying explicit BEGIN and COMMIT blocks should only commit when I > want the DB to, or will each UPDATE in the middle of this block get > executed? > > Sorry if this is a naive question. I am reading up as much as I can. If you issue a BEGIN then nothing gets committed until you issue a COMMIT. If anything happens in the meantime then everything you've done since the BEGIN disappears. For batch work like loading then it makes sense to handle a 100-1000 records per transaction. But for most purposes you want to group things together based on what you want to happen if an error occurs. Group together into a single transaction precisely the changes that you want to be committed together or rolled back together. Don't structure your program around the performance issues. For the remaining questions I would say you need to experiment. Perhaps others will have more ideas. 
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
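The transaction semantics described above can be sketched in plain SQL; the table and values here are hypothetical, not from the thread:

```sql
-- Nothing below is visible to other sessions (or durable on disk)
-- until the COMMIT; an error before COMMIT aborts the whole block.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;   -- or ROLLBACK; to discard both updates together
```

Grouping a few hundred such statements between one BEGIN/COMMIT pair amortizes the single fsync that happens at COMMIT time, which is the point Gregory makes above.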
Re: [GENERAL] Moving to postgresql and some ignorant questions
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 08/14/07 10:09, Phoenix Kiula wrote: > On 14/08/07, Gregory Stark <[EMAIL PROTECTED]> wrote: >> "Phoenix Kiula" <[EMAIL PROTECTED]> writes: [snip] >>> >>> QUESTION1: >>> Is this it? Or am I missing something in terms of execution? We use >>> Perl on our site but may be gradually switching to PHP at some point. >>> Will the above plan of execution be ok? >> A transaction is a bunch of queries which you want to all get committed or >> aborted together. The expensive step in Postgres is the COMMIT. Postgres does >> an fsync which forces the data to go to disk (if you're using good hardware, >> and don't have fsync=off). That takes from 5-15ms depending on how much data >> and how fast your drives are. > > > > > If I am reading this right, does this mean it is probably better to > leave fsync as "fsync=off" on production machines? Also, is COMMIT > automatic for my queries? In some minor testing I did (not scientific > I did at all) some queries through Perl did not update the database at > all. I had "fsync=off" in my conf file, and there was no COMMIT etc in > my SQL, just plain SQL. So I am a bit confused. What's a good starting > point? > [snip] > > So, again, in the conf file, is this what you recommend: > > fsync=off I seriously doubt that's what he means. "fsync=off" *is* faster than "fsync=on", but leaves your data at risk in case of a hardware crash. Turning it off during initial data load is quite common, though. Just remember to turn it back on!!! - -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFGwczGS9HxQb37XmcRAhUvAJ9LsQPvd6tQDp+/Mzh3jl8oPs4mHQCffjev 2uCJa3x0/NzUQBVmaJMcVR4= =kVMU -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Moving to postgresql and some ignorant questions
Thank you AM. Very useful note, much appreciate the info you shared. About COPY, I have two simple questions:

1. Is there anything like an ALTER DATABASE command? I would like to change the character set without having to recreate the DATABASE again!

2. Also, when I do a mysqldump I seem to be ending up with "\r" in my lines, or so the COPY command tells me:

ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY links, line 962974

But afaik, there is nothing in my data in MySQL that contains the new line characters! I have checked and rechecked in the DB to find such characters. I cannot open that text file to check because it contains a lot of data (from about 7 million records). So I tried to do a "grep '\r' FILENAME" but that doesn't help either because of course there is a newline character, that is what is used to separate a line! Is there anything else I can do to make sure mysqldump data comes through? Perhaps separate fields by TABS and lines by a specific character such as "`"? TIA for any thoughts. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
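One workaround along the lines the poster suggests (tab-separated fields) can be sketched as follows; the file and table names are placeholders, and stripping the carriage returns outside the database first is an assumption, not something the thread confirms:

```sql
-- Assumed workflow: dump from MySQL with tab-separated fields, strip the
-- stray \r characters outside the database (e.g. tr -d '\r' < dump.txt
-- > clean.txt), then load with an explicit tab delimiter.
-- Note: recent psql versions accept E'\t'; older ones may need a literal tab.
\copy links FROM 'clean.txt' WITH DELIMITER E'\t'
```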
Re: [GENERAL] Using PITR Backup and Recovery
On Tue, 2007-08-14 at 10:52 -0400, [EMAIL PROTECTED] wrote: > We recently moved to PITR backup and recovery solution as defined in > the documentation. Our basic setup executes the backup start command, > and then takes a snapshot of the filesystem and backups wal files. > However, we have database files ( not wal files ) that change while > the system is in backup mode. This happens during every backup. Is > this normal? It's OK if they change; that's the whole point of hot backup - you can just continue working while we take the backup. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Moving to postgresql and some ignorant questions
On Tue, Aug 14, 2007 at 11:09:36PM +0800, Phoenix Kiula wrote: > If I am reading this right, does this mean it is probably better to > leave fsync as "fsync=off" on production machines? No, you want "fsync=on" on any machine which holds data you care about. And you want hardware which doesn't lie to you so that "fsync is finished" really means the data is on-disk. Else PostgreSQL cannot ensure ACID compliance. > Specifying explicit BEGIN and COMMIT blocks should only commit when I > want the DB to, yes > or will each UPDATE in the middle of this block get > executed? It will get executed but the effects will only become publicly visible after COMMIT (assuming no errors in the queries; after an error you could issue COMMITs all day long and still see no change in the database from the transaction in which the error occurred) > So, again, in the conf file, is this what you recommend: > > fsync=off No. > max_connections=100 Yes, depending on your usage patterns. > The problem with simple CHECK constraints is that they can only > reference the primary key in another table. Not so. Or you need to explain what you mean by "simple CHECK constraints". > I am happy to do this, but I don't see an ALTER DATABASE command. ALTER DATABASE is there, of course, but it doesn't help you. > I would really like not to have to execute the CREATE DATABASE command > again! You'll have to, unfortunately, I fear. When you go about it, take the opportunity to make sure the locale and encoding settings of initdb are compatible with a UTF8 database. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
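Karsten's advice to recreate the database with the right encoding can be sketched like this; the database name is a placeholder, and the available encodings depend on how initdb was run:

```sql
-- Create a fresh database with UTF8 encoding; dump the old database with
-- pg_dump first, then restore the dump into the new one.
CREATE DATABASE mydb_utf8 ENCODING 'UTF8';
```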
Re: [GENERAL] Insert or Replace or \copy (bulkload)
On 8/14/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote: > I'm seeing an obstacle in my aim to migrate from mysql to PG mainly from > the manner in which PG handles duplicate entries either from primary > keys or unique entries. > > Data is taken from perl DBI into (right now) CSV based files to be used > via psql's \copy command to insert into the table. > > In MySql, I was using mysqlimport --replace which essentially provided > the means to load data into the DB, while at the same time, would > provide the necessary logic to replace the entire row if there was a > duplicate instead of dying. > > Under PG, I've yet to found something similar to this functionality and > searches via google has uncovered that this is one thing which _still_ > has not found its way into PG. (anyone knows why? Standards?) > > Anyway, I found a workaround, but, to me, even though it provides a > means to an end, it still looks like it'll end up as a maintenance > nightmare each time a table has any additional columns added. > > Solution is taken from this site: > > http://www.pointwriter.com/blog/index.php?/archives/6-REPLACE-in-PostgreSQL.html Example code snipped for brevity > Can anyone tell me if this won't turn out to be a maintenance nightmare? > So, the pertinent question is, is there a better mousetrap available? I don't see why it would be a maintenance nightmare. Looks pretty much like you just create it and go. Once it's in place it should just work. There are other ways to skin this particular cat, but that one seems as good as any. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 14/08/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > "Phoenix Kiula" <[EMAIL PROTECTED]> writes: > > > Through other threads I have learned that multiple inserts or updates > > can be sped up with: > > > > [QUOTE] > > - BEGIN TRANSACTION; > > - INSERT OR UPDATE queries, ideally PREPAREd; > > - COMMIT; > > [/QUOTE] > > > > QUESTION1: > > Is this it? Or am I missing something in terms of execution? We use > > Perl on our site but may be gradually switching to PHP at some point. > > Will the above plan of execution be ok? > > A transaction is a bunch of queries which you want to all get committed or > aborted together. The expensive step in Postgres is the COMMIT. Postgres does > an fsync which forces the data to go to disk (if you're using good hardware, > and don't have fsync=off). That takes from 5-15ms depending on how much data > and how fast your drives are. If I am reading this right, does this mean it is probably better to leave fsync as "fsync=off" on production machines? Also, is COMMIT automatic for my queries? In some minor testing I did (not scientific at all) some queries through Perl did not update the database at all. I had "fsync=off" in my conf file, and there was no COMMIT etc in my SQL, just plain SQL. So I am a bit confused. What's a good starting point? > Grouping more work into a single transaction makes the delay for the fsync at > COMMIT time less of a problem. Agree. That's what I am trying to do. Include as many UPDATEs etc into the same TRANSACTION block, but my worry is when I read about autocommit and how it is enabled by default in postgresql 8. Specifying explicit BEGIN and COMMIT blocks should only commit when I want the DB to, or will each UPDATE in the middle of this block get executed? Sorry if this is a naive question. I am reading up as much as I can.
> Also having more connections (but not too many, > more than a few per processor is probably not helping, more than 100 and it's > probably slowing you down) also means it's less important since another > process can do some of its work while you're waiting for the fsync. So, again, in the conf file, is this what you recommend: fsync=off max_connections=100 ? > Yes, it's a pain. Running vacuum frequently will be necessary. You may also > have to raise your fsm settings to allow Postgres to remember more free space > between vacuums. Thank you for your comments about autovacuum. I have these FSM and memory type settings in my conf (picked off the internet :))

--
max_fsm_relations = 1500
max_fsm_pages = 8
shared_buffers = 21000
effective_cache_size = 21000
sort_mem = 16348
work_mem = 16348
vacuum_mem = 16348
temp_buffers = 4096
authentication_timeout = 10s
ssl = off
--

Do these sound right? > Incidentally, most data integrity checks are handled with CHECK constraints > and FOREIGN KEY constraints rather than manual triggers. They're both easier > and cheaper. The problem with simple CHECK constraints is that they can only reference the primary key in another table. What if I want more than one column to be the same as the referenced table, but do not want to have a compound primary key in the referenced table? From reading and re-reading the manual, I don't think FKs allow for this. Only primary key references are supported. > Sorry, this is one of the main deficiencies in Postgres. You will probably > have to convert your data to utf8 across the board and hopefully you'll find a > collation which satisfies all your needs. I am happy to do this, but I don't see an ALTER DATABASE command. I would really like not to have to execute the CREATE DATABASE command again! Can I make the entire DB utf8 using some command now? Have not been able to find it. All manual and google stuff seems to point to the CREATE DB command only. Many thanks!
---(end of broadcast)--- TIP 6: explain analyze is your friend
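On the multi-column reference question above: a FOREIGN KEY can in fact reference any set of columns covered by a UNIQUE constraint, not only a primary key (which is presumably what Karsten's "Not so" elsewhere in the thread refers to). A minimal sketch with hypothetical tables:

```sql
CREATE TABLE parents (
    a integer,
    b integer,
    UNIQUE (a, b)   -- referenced columns only need a unique constraint
);

CREATE TABLE children (
    a integer,
    b integer,
    FOREIGN KEY (a, b) REFERENCES parents (a, b)
);
```

So the referenced table does not need a compound primary key, only a compound UNIQUE constraint on the referenced columns.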
Re: [GENERAL] Using PITR Backup and Recovery
In response to "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>: > We recently moved to PITR backup and recovery solution as defined in the > documentation. Our basic setup executes the backup start command, and then > takes a snapshot of the filesystem and backups wal files. However, we have > database files ( not wal files ) that change while the system is in backup > mode. This happens during every backup. Is this normal? Maybe. Not entirely sure I understand your meaning, but ... My understanding is that pg_start_backup()'s purpose is to mark the database so it knows what the last complete transaction was before it was started. This doesn't prevent PostgreSQL from making changes to DB files, it just ensures that in the event of a restore, PG knows where to start as far as the data files and the WAL log are concerned. I'm curious as to why files would be changing if you made a filesystem snapshot, but that wouldn't be a problem with PostgreSQL, it would be a problem with the filesystem code. Or I could be misunderstanding what you mean. In any event, if database activity is occurring while the backup is running, PostgreSQL's data files will continue to change, but the archive of WAL logs will allow the system to recover from inconsistent changes during the recovery phase. I don't know if anyone's done extensive testing to know just how reliable PITR is, but it's worked every time for me. One caveat: you can't recover PITR data from an amd64 system to an i386 system :D PostgreSQL's data (and possibly the WAL logs as well) is architecture dependent. This can be a royal pain if you don't know about it and you have a mix of architectures. -- Bill Moran http://www.potentialtech.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
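The base-backup procedure under discussion looks roughly like this; the label is a placeholder, and the filesystem copy step depends on the local setup (see the PITR chapter of the manual for the authoritative sequence):

```sql
SELECT pg_start_backup('nightly');  -- checkpoint and mark the backup start
-- ... take the filesystem snapshot / copy the data directory here;
--     data files may keep changing, which is expected and harmless ...
SELECT pg_stop_backup();            -- finish; the final WAL segment is archived
```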
[GENERAL] Using PITR Backup and Recovery
We recently moved to PITR backup and recovery solution as defined in the documentation. Our basic setup executes the backup start command, and then takes a snapshot of the filesystem and backups wal files. However, we have database files ( not wal files ) that change while the system is in backup mode. This happens during every backup. Is this normal? Any insight appreciated. -bill
Re: [GENERAL] Moving to postgresql and some ignorant questions
"Phoenix Kiula" <[EMAIL PROTECTED]> writes: > Though other threads I have learned that multiple inserts or updates > can be sped up with: > > [QUOTE] > - BEGIN TRANSACTION; > - INSERT OR UPDATE queries, ideally PREPAREd; > - COMMIT; > [/QUOTE] > > QUESTION1: > Is this it? Or am I missing something in terms of execution? We use > Perl on our site but may be gradually switching to PHP at some point. > Will the above plan of execution be ok? A transaction is a bunch of queries which you want to all get committed or aborted together. The expensive step in Postgres is the COMMIT. Postgres does an fsync which forces the data to go to disk (if you're using good hardware, and don't have fsync=off). That takes from 5-15ms depending on how much data and how fast your drives are. Grouping more work into a single transaction makes the delay for the fsync at COMMIT time less of a problem. Also having more connections (but not too many, more than a few per processor is probably not helping, more than 100 and it's probably slowing you down) also means it's less important since another process can do some of its work while you're waiting for the fsync. > My queries are all optimized and indexed well. But the defragmentation > resulting from UPDATEs can be a pain. I wonder if PGSQL's autovacuum > will help. As for autovacuum we have every intention of leaving it on. > Will the following settings be alright? Yes, it's a pain. Running vacuum frequently will be necessary. You may also have to raise your fsm settings to allow Postgres to remember more free space between vacuums. > [QUOTE] > autovacuum = on > vacuum_cost_delay = 30 > stats_start_collector = on > stats_row_level = on > autovacuum_vacuum_threshold = 100 > autovacuum_analyze_threshold = 50 > [/QUOTE] > > I am hoping that the frequent vacuum thresholds will help, but: > > QUESTION 2: > Are these settings too aggressive? While an autovacuum is running in > the background, will it lock tables or anything? 
Can the tables still > be operational, and the autovacuum will automatically resume from > whatever point it was at? I am worried about how autovacuum will > perform when VACUUM doesn't lock tables. It's designed to operate without interfering. It does still take up i/o bandwidth which affects performance. The autovacuum_cost_delay above tells it to wait 30ms every few pages to try to avoid slowing down production. You'll have to judge based on experience whether it's taking too long with that time. You may be better off starting with 10ms or 20ms instead. I don't think the threshold parameters will be relevant to you. You should look at autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor. I may be wrong though, someone more familiar with autovacuum in 8.2 might have to speak up. > QUESTION 3. > Data integrity checks in MySQL world were very patchy, relying on CGI > stuff to make sure, which does not always work. We are trying our best > to get rid of them. With postgresql, I realize we can have triggers as > well as foreign keys. But searching through old threads on this group > suggests that triggers also present performance problems. On tables > that are infrequently updated, can I write triggers without worrying > about performance? Or, how can I make sure their performance is as > best as it can be, i.e., which of the config vars is responsible for > that? Triggers are quite efficient in Postgres but they still cost something. Nobody will be able to give you a blanket statement that you can do anything without testing it. But you're probably better off having them and then considering removing them later if you have a problem. Incidentally, most data integrity checks are handled with CHECK constraints and FOREIGN KEY constraints rather than manual triggers. They're both easier and cheaper. > QUESTION 4: > Character sets: In MySQL we had utf-8 as our default character set. 
> Yet, on some tables we had the collation of latin1_general_ci, and > only on specific columns we had "utf8_general_ci" (where non-english > text needed to be stored). How should I do this in pgsql? When I do a > mysqldump of these tables, and then COPY them back into pgsql, I > always see the error "ERROR: invalid byte sequence for encoding > "UTF8": 0xa7". So I do a > > \encoding latin1 > > And then my COPY import works. But this is not what I want. How can I > set up one of my columns in this table to be utf-8, and the rest to be > latin? Then I would like to import with "\encoding utf8". Can this be > somehow done? Sorry, this is one of the main deficiencies in Postgres. You will probably have to convert your data to utf8 across the board and hopefully you'll find a collation which satisfies all your needs. You can't switch encoding or collation on the fly. You could look at the CONVERT function which might help, but I'm not sure exactly what you would have to do to solve your immediate problem. If you really need multiple collations in a single database t
Re: [GENERAL] Downloading PostgreSQL source code version 7.1 through CVS
"Kuriakose, Cinu Cheriyamoozhiyil" <[EMAIL PROTECTED]> writes: > 2) cvs -z3 -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot > co -P pgsql > This will fetch the code, but the code that is fetched is the latest > code. You need "-r RELx_y_STABLE" to check out the branch for release x.y. > Can anyone please tell me what command I have to use to download the > source code For PostgreSQL-7.1. You do realize that 7.1 has not been maintained since 2001? I can hardly imagine a reason to be interested in it now. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Moving to postgresql and some ignorant questions
I have been a long time user of mysql. Switching to Postgresql because the true features included in 5.1 (as of this moment) are nothing to write home about. The InnoDB stuff is highly advocated but it has its own set of issues, and when one looks at things like backup/restore etc, it is clearly targeted at expensive installs with full DBAs and such, which we cannot afford. So I have been reading up feverishly on PGSQL and it surely is a more serious DB, which is good, but also a bit steep in its learning curve. I have some pretty silly questions to ask below. Just to confirm that I manage my switch as smoothly as possible! By way of information, we have new double AMD Opterons with 3GB of RAM. The postgres that has been installed is 8.2.3. But our DB is not as large as some of the discussions on pgsql-general. No table is more than 10 million records or likely to exceed that anytime soon. But I have some heavy simultaneous user connections much like any web application for a busy website. In particular I have a table that needs very high availability: it has about 10,000 INSERTS a day, about 500,000 SELECTS a day (with or without joins), but most importantly about 1 million UPDATEs. (It is the UPDATE that is bothering the MYSQL engine of "MYISAM" type with frequent data corruption). Through other threads I have learned that multiple inserts or updates can be sped up with:

[QUOTE]
- BEGIN TRANSACTION;
- INSERT OR UPDATE queries, ideally PREPAREd;
- COMMIT;
[/QUOTE]

QUESTION1: Is this it? Or am I missing something in terms of execution? We use Perl on our site but may be gradually switching to PHP at some point. Will the above plan of execution be ok? My queries are all optimized and indexed well. But the defragmentation resulting from UPDATEs can be a pain. I wonder if PGSQL's autovacuum will help. As for autovacuum we have every intention of leaving it on. Will the following settings be alright?
[QUOTE]
autovacuum = on
vacuum_cost_delay = 30
stats_start_collector = on
stats_row_level = on
autovacuum_vacuum_threshold = 100
autovacuum_analyze_threshold = 50
[/QUOTE]

I am hoping that the frequent vacuum thresholds will help, but: QUESTION 2: Are these settings too aggressive? While an autovacuum is running in the background, will it lock tables or anything? Can the tables still be operational, and the autovacuum will automatically resume from whatever point it was at? I am worried about how autovacuum will perform when VACUUM doesn't lock tables. QUESTION 3. Data integrity checks in MySQL world were very patchy, relying on CGI stuff to make sure, which does not always work. We are trying our best to get rid of them. With postgresql, I realize we can have triggers as well as foreign keys. But searching through old threads on this group suggests that triggers also present performance problems. On tables that are infrequently updated, can I write triggers without worrying about performance? Or, how can I make sure their performance is as best as it can be, i.e., which of the config vars is responsible for that? QUESTION 4: Character sets: In MySQL we had utf-8 as our default character set. Yet, on some tables we had the collation of latin1_general_ci, and only on specific columns we had "utf8_general_ci" (where non-english text needed to be stored). How should I do this in pgsql? When I do a mysqldump of these tables, and then COPY them back into pgsql, I always see the error "ERROR: invalid byte sequence for encoding "UTF8": 0xa7". So I do a \encoding latin1 And then my COPY import works. But this is not what I want. How can I set up one of my columns in this table to be utf-8, and the rest to be latin? Then I would like to import with "\encoding utf8". Can this be somehow done? Sorry for this long post, but as exciting as this switch is, it is also daunting because I feel like I am moving into serious databases territory and I don't want to goof up.
I have read up a lot and am continuing to, but it would be great if someone can shed some light on the above to begin with. TIA! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] PgAdmin .sql default handler
Kristo Kaiv wrote: How on earth can i turn off PgAdmin grabbing default program binding for .sql files on OS X? Select a .sql file, right-click (ctrl-click) and choose "Get Info" (or just select the file and hit CMD-I). Expand the "Open With ..." tab, choose "Other" and select the app you want to handle these from your Applications directory. brian ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] "Out of memory" errors..
Hi, > Now if I want a "maintenance_work_mem" of 64M for Postgresql, what > should the "max user processes" setting be in my ulimit, or the "open > files" setting etc? Is there a Postgresql help or doc page I can read > to see how these values map? I'd like to be more educated in how I > test to tweak these OS level values! If this is only a PostgreSQL database server, don't limit the postgres user. Don't tweak these limits unless you know exactly what you are doing. - Sander PS: "maintenance_work_mem" is completely unrelated to "max user processes" or "open files", it's related to the allowed memory size. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] "Out of memory" errors..
On 8/14/07, Sander Steffann <[EMAIL PROTECTED]> wrote: > Hi Lim, > > >> It might also be in /etc/security/limits.conf. > > > > Thanks. I see these two lines in that file: > > > > postgres soft nofile 8192 > > postgres hard nofile 8192 > > > > How should I change these values? I am not sure how this reflects the > > "ulimit" options. > > Those are limits to the allowed number of open files (ulimit -n). I think > 8192 should be enough for PostgreSQL. The problem you had was related to > other settings, so if only the "nofile" setting is changed your strange > ulimits do not come from here :-) I have finally figured out how to increase the ulimit for postgres user. My new ulimit values are:

--
core file size          (blocks, -c)  0
data seg size           (kbytes, -d)  unlimited
file size               (blocks, -f)  unlimited
pending signals                 (-i)  1024
max locked memory       (kbytes, -l)  32
max memory size         (kbytes, -m)  unlimited
open files                      (-n)  4096
pipe size            (512 bytes, -p)  8
POSIX message queues     (bytes, -q)  819200
stack size              (kbytes, -s)  8192
cpu time               (seconds, -t)  unlimited
max user processes              (-u)  14335
virtual memory          (kbytes, -v)  unlimited
file locks                      (-x)  unlimited
--

Now if I want a "maintenance_work_mem" of 64M for Postgresql, what should the "max user processes" setting be in my ulimit, or the "open files" setting etc? Is there a Postgresql help or doc page I can read to see how these values map? I'd like to be more educated in how I test to tweak these OS level values! Thanks. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Persistent connections in PHP
On Mon, 13 Aug 2007 11:30:37 -0500 "Scott Marlowe" <[EMAIL PROTECTED]> wrote: > Oh, one other thing that contributes to the problem Thanks for the replies - all of this was very useful info. Josh ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Postgresql INSERT speed (how to improve performance)?
On Aug 13, 2007, at 11:54 PM, Lim Berger wrote: I am testing through PHP microtime function. The query is administered through pg_query() function of PHP. I know there could be some latency coming in from the PHP's PG functions' overhead, but not such an order of magnitude different from "mysqli_query"! (I hope -- because this is quite a common real-world situation I would think). You might also try using prepared statements for your inserts (see PHP's pg_prepare and pg_execute functions). This will also save time you are likely using to escape the inserted strings. John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL ---(end of broadcast)--- TIP 6: explain analyze is your friend
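The same idea is also available at the SQL level, independent of PHP's pg_prepare/pg_execute; a sketch with a hypothetical table and values:

```sql
-- Plan the INSERT once, then execute it repeatedly with different values.
PREPARE ins_row (bigint, text) AS
    INSERT INTO stats_cache (id, notes) VALUES ($1, $2);

EXECUTE ins_row (1, 'first row');
EXECUTE ins_row (2, 'second row');  -- re-uses the already-planned statement
```

Prepared statements live for the duration of the session, so repeated inserts in a loop avoid re-parsing and re-planning each statement.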
[GENERAL] Downloading PostgreSQL source code version 7.1 through CVS
Hi all, Can anyone please tell me how to download the PostgreSQL-7.1 source code through CVS, i use the following set of commands to get the source code of postgreSQL. 1) cvs -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot login This command will do an initial login to the PostgreSQL repository. 2) cvs -z3 -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot co -P pgsql This will fetch the code, but the code that is fetched is the latest code. Can anyone please tell me what command I have to use to download the source code For PostgreSQL-7.1. Thanks in advance Regards Cinu Kuriakose ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Insert or Replace or \copy (bulkload)
I'm seeing an obstacle in my aim to migrate from mysql to PG mainly from the manner in which PG handles duplicate entries either from primary keys or unique entries. Data is taken from perl DBI into (right now) CSV based files to be used via psql's \copy command to insert into the table. In MySql, I was using mysqlimport --replace which essentially provided the means to load data into the DB, while at the same time, would provide the necessary logic to replace the entire row if there was a duplicate instead of dying. Under PG, I've yet to find something similar to this functionality and searches via google have uncovered that this is one thing which _still_ has not found its way into PG. (anyone know why? Standards?) Anyway, I found a workaround, but, to me, even though it provides a means to an end, it still looks like it'll end up as a maintenance nightmare each time a table has any additional columns added. Solution is taken from this site: http://www.pointwriter.com/blog/index.php?/archives/6-REPLACE-in-PostgreSQL.html [..snip..] For the sake of this example we'll use a very simple table called 'map' with two fields: 'key' and 'value'. Not surprisingly, key is the primary key. Let's further assume that every insert into this table should actually be a replace. Then all we need is this simple statement and we're done: CREATE RULE "replace_map" AS ON INSERT TO "map_example" WHERE EXISTS(SELECT 1 FROM map_example WHERE key=NEW.key) DO INSTEAD (UPDATE map_example SET value=NEW.value WHERE key=NEW.key) [...snip...]
Populate with some test data
============================
XMMS=> insert into map_example(key,value,value2) values (1,1,1);
INSERT 0 1
XMMS=> insert into map_example(key,value,value2) values (2,2,2);
INSERT 0 1
XMMS=> select * from map_example ;
 key | value | value2
-----+-------+--------
   1 |     1 |      1
   2 |     2 |      2
(2 rows)

Try to insert some duplicate data
=================================
XMMS=> insert into map_example(key,value,value2) values (2,20,20);
INSERT 0 0
XMMS=> select * from map_example ;
 key | value | value2
-----+-------+--------
   1 |     1 |      1
   2 |    20 |      2   <= NOT updated
(2 rows)

Add the new rule once we added the new columns
==============================================
XMMS=> CREATE OR REPLACE RULE "replace_map" AS ON INSERT TO "map_example"
       WHERE EXISTS(SELECT 1 FROM map_example WHERE key=NEW.key)
       DO INSTEAD (UPDATE map_example SET value=NEW.value, value2=NEW.value2 WHERE key=NEW.key);
XMMS=> insert into map_example(key,value,value2) values (2,20,20);
INSERT 0 0
XMMS=> select * from map_example ;
 key | value | value2
-----+-------+--------
   1 |     1 |      1
   2 |    20 |     20
(2 rows)

Can anyone tell me if this won't turn out to be a maintenance nightmare? So, the pertinent question is, is there a better mousetrap available? ---(end of broadcast)--- TIP 6: explain analyze is your friend
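One better-known alternative to the rule approach is a PL/pgSQL function that tries an UPDATE first and falls back to INSERT, adapted here from the example in the PostgreSQL documentation. It still lists the columns explicitly, so whether it is less of a maintenance burden is debatable, but it handles concurrent inserts more robustly than the rule:

```sql
CREATE FUNCTION merge_map(k integer, v integer) RETURNS void AS $$
BEGIN
    LOOP
        -- first try to update an existing row
        UPDATE map_example SET value = v WHERE key = k;
        IF found THEN
            RETURN;
        END IF;
        -- no row there: try to insert; a concurrent insert may beat us,
        -- in which case we loop back and retry the UPDATE
        BEGIN
            INSERT INTO map_example (key, value) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Each bulk-loaded row would then be fed through SELECT merge_map(key, value) instead of a plain INSERT.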
[GENERAL] PgAdmin .sql default handler
How on earth can i turn off PgAdmin grabbing default program binding for .sql files on OS X? Kristo Kaiv http://kaiv.wordpress.com (PostgreSQL blog)
Re: [GENERAL] Postgresql INSERT speed (how to improve performance)?
"Lim Berger" <[EMAIL PROTECTED]> writes:

> Here's the table definition:
>
>                   Table "public.cachedstats"
>         Column         |         Type          |          Modifiers
> -----------------------+-----------------------+------------------------------
>  id                    | bigint                | not null
>  prof_name             | character varying(20) | not null
>  notes                 | text                  | not null
>  inform_prof_on_change | character(1)          | not null default 'N'::bpchar
>
> Indexes:
>     "cachedstats_pkey" PRIMARY KEY, btree (id)
>     "idx_cachedstats_unique_prof_name" UNIQUE, btree (alias)

What's "alias"? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend