Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict
Hi John,

On Wed, 2012-03-14 at 21:54 -0700, John R Pierce wrote:
> So I have a CentOS 6.2 x86_64 system that was running postgres 9.0 from
> Devrim's yum repo...

It is now a community repo ;)

> I installed 9.1 and was going to do a parallel upgrade, but the entry for
> 9.0 in /etc/ld.so.conf.d/postgresql-9.0-libs.conf was causing
> /usr/pgsql-9.1/bin/psql to load the wrong libpq,
> /usr/pgsql-9.0/lib/libpq.so.5, which was triggering the error...
>
>     psql: invalid connection option client_encoding

I have created a ticket for that, but have not been able to close it yet. You can follow the progress here: http://wiki.pgrpms.org/ticket/77

Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict
On 03/14/12 11:40 PM, Devrim GÜNDÜZ wrote:
> I have created a ticket for that, but could not close yet. You can see the
> progress from here: http://wiki.pgrpms.org/ticket/77

I don't see anything but the ticket itself? Anyways, I'm quite sure this is the problem...

    # cat /etc/ld.so.conf.d/postgresql-9.0-libs.conf
    /usr/pgsql-9.0/lib/
    # cat /etc/ld.so.conf.d/postgresql-9.1-libs.conf
    /usr/pgsql-9.1/lib/

I'm pretty sure that removing the ld.so.conf.d files will fix this. I'm setting up a test in a spare VM now.

k, stock centos 6.2 with 9.0.7...

    $ rpm -qa | grep postgres
    postgresql90-libs-9.0.7-1PGDG.rhel6.x86_64
    postgresql90-server-9.0.7-1PGDG.rhel6.x86_64
    postgresql90-9.0.7-1PGDG.rhel6.x86_64
    postgresql90-contrib-9.0.7-1PGDG.rhel6.x86_64
    postgresql90-devel-9.0.7-1PGDG.rhel6.x86_64
    $ ldd /usr/pgsql-9.0/bin/psql | grep libpq
        libpq.so.5 => /usr/pgsql-9.0/lib/libpq.so.5 (0x7f340f956000)

so... I install 9.1...

    # rpm -ivh http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-centos91-9.1-4.noarch.rpm
    Retrieving http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-centos91-9.1-4.noarch.rpm
    Preparing...                ### [100%]
       1:pgdg-centos91          ### [100%]
    # yum install postgresql91{-server,-contrib}
      Installing : postgresql91-libs-9.1.3-1PGDG.rhel6.x86_64      1/4
      Installing : postgresql91-9.1.3-1PGDG.rhel6.x86_64           2/4
      Installing : postgresql91-server-9.1.3-1PGDG.rhel6.x86_64    3/4
      Installing : postgresql91-contrib-9.1.3-1PGDG.rhel6.x86_64   4/4
    Installed:
      postgresql91-contrib.x86_64 0:9.1.3-1PGDG.rhel6
      postgresql91-server.x86_64 0:9.1.3-1PGDG.rhel6
    Dependency Installed:
      postgresql91.x86_64 0:9.1.3-1PGDG.rhel6
      postgresql91-libs.x86_64 0:9.1.3-1PGDG.rhel6
    Complete!

and...

    # /usr/pgsql-9.1/bin/psql
    psql: invalid connection option client_encoding
    # ldd /usr/pgsql-9.1/bin/psql | grep libpq
        libpq.so.5 => /usr/pgsql-9.0/lib/libpq.so.5 (0x7f989aa2)

oops! what's 9.0 doing here??

k, so that's the problem. so... I got evil. I mv'd /etc/ld.so.conf.d/postgres*.conf to backup.
    # ldd /usr/pgsql-9.1/bin/psql | grep libpq
        libpq.so.5 => not found

uh oh. that means they weren't linked with -R or -rpath /usr/pgsql-9.1/lib, which they should have been if --prefix=/usr/pgsql-9.1 was used... OH.

    CONFIGURE = '--disable-rpath' '--prefix=/usr/pgsql-9.1' ...

muh oh. why is that?!?

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_role vs. pg_shadow or pg_user
The 8.1 version of the documentation explicitly outlined the migration; the 9.1 version no longer covers the way things were before 8.1. In the meantime I also found http://www.postgresql.org/docs/9.0/interactive/role-membership.html which cleared things up exhaustively and by example.

Alex

On 14.03.2012 at 22:52, Tom Lane wrote:
> Alexander Reichstadt l...@mac.com writes:
>> in the documentation of 8.1 the concept of roles is outlined compared to
>> users and groups at http://www.postgresql.org/docs/8.1/static/user-manag.html.
>
> Um ... why are you reading 8.1 documentation while running 9.1? There are
> likely to be some obsolete things in there.
>
>> I also read today that pg_shadow is the real table containing the users as
>> opposed to pg_user which is only a view and one never displaying anything
>> but for the password. I don't have the link where that was,
>
> Wherever it was, it was even more obsolete than the 8.1 docs. pg_shadow has
> been a view (on pg_authid) for quite a while now. Try \d+ pg_shadow in psql.
>
> The reason this is such a mess is that we've changed the catalog
> representation several times, each time leaving behind a view that was meant
> to emulate the old catalog. For some time now, pg_authid has been the ground
> truth, but it stores entries for both login and non-login roles, which more
> or less correspond to what used to be users and groups. pg_roles is the only
> non-protected view that shows you all the entries.
>
> regards, tom lane
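Tom's description of the catalog layering can be verified directly; a minimal sketch (the queries are illustrative, using the catalog columns as they exist in recent releases):

```sql
-- pg_authid stores both login and non-login roles; pg_roles exposes them all.
-- Login roles correspond to the old "users", non-login roles to "groups".
SELECT rolname, rolcanlogin
FROM pg_roles
ORDER BY rolname;

-- pg_shadow and pg_user are now only views over pg_authid:
SELECT viewname, definition
FROM pg_views
WHERE viewname IN ('pg_shadow', 'pg_user');
```

Running `\d+ pg_shadow` in psql, as Tom suggests, shows the same view definition.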
Re: [GENERAL] copy in date string 00-00-00 00:00:00
On Wed, 2012-03-14 at 21:52 -0700, Mark Phillips wrote:
> I am not familiar with sed, except for some trivial bits I nicked off the
> web. Enough to know it works, and to be dangerous. Nonetheless, using sed
> may be the way to go as there are two tables that contain a bit over
> 3,000,000 rows each.

You should also consider using awk/gawk with the field separator (the FS variable) set to match the one in your input (','). The advantage in this case is that it can be made to work on specific fields in the CSV file and not look at the rest, something like:

    BEGIN { FS = OFS = "," }                # set field separator to comma
    $5 == "00-00-00 00:00:00" { $5 = "" }   # empty field 5 if it matches
    { print }                               # output all lines

Disclaimer: this is untested example code.

Martin
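Martin's awk sketch above can be exercised from the shell before pointing it at the real export; the sample row and field layout here are made up for illustration:

```shell
# Build a one-line sample CSV whose 5th field holds the zero date.
printf 'a,b,c,d,00-00-00 00:00:00,f\n' > /tmp/sample.csv

# Same logic as the awk program in the mail; OFS is set so the
# output stays comma-separated after field 5 is rewritten.
awk 'BEGIN { FS = OFS = "," }
     $5 == "00-00-00 00:00:00" { $5 = "" }
     { print }' /tmp/sample.csv
# -> a,b,c,d,,f
```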
[GENERAL] Backups
Hello all,

Very simple question - does pg_dump/dumpall hit the server in terms of database performance? We currently do nightly backups and I want to move to hourly backups, but not at the expense of hogging all the resources for 5 mins. pg_dumpall is currently producing a 1GB file - that's the combined size of around 30 databases - and it takes around 5 mins to run.

Thanks
Rich
Re: [GENERAL] Backups
Hi Richard,

There's no easy answer. If your server has plenty of free resources there won't be trouble, but I do have customers who cannot even imagine launching a dump during normal traffic hours. How loaded is your box currently?

Cheerio
Bèrto

On 15 March 2012 12:15, Richard Harley rich...@scholarpack.com wrote:
> Hello all
> Very simple question - does pg_dump/dumpall hit the server in terms of
> database performance? We currently do nightly backups and I want to move to
> hourly backups but not at the expense of hogging all the resources for 5
> mins. Pg_dumpall is currently producing a 1GB file - that's the combined
> size of around 30 databases and it takes around 5 mins to run.
> Thanks
> Rich

--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.
Re: [GENERAL] Backups
Thanks for the quick reply. The server has 6 cores, 6GB RAM, and top gets to a 2.3-2.5 load average when running the dumpall. So I assume we are nowhere near this causing performance issues for users?

Thanks
Rich

On 15/03/12 12:21, Bèrto ëd Sèra wrote:
> Hi Richard, it's no easy answer. If your server has plenty of free
> resources there won't be trouble, but I do have customers who cannot even
> imagine of launching a dump in normal traffic hours. How loaded is your
> box, currently?
> Cheerio
> Bèrto
>
> On 15 March 2012 12:15, Richard Harley rich...@scholarpack.com wrote:
>> Hello all
>> Very simple question - does pg_dump/dumpall hit the server in terms of
>> database performance? We currently do nightly backups and I want to move
>> to hourly backups but not at the expense of hogging all the resources for
>> 5 mins. Pg_dumpall is currently producing a 1GB file - that's the combined
>> size of around 30 databases and it takes around 5 mins to run.
>> Thanks
>> Rich
Re: [GENERAL] Backups
Hi,

Yes, if you get ~2.5 when running it in peak hour it should definitely be harmless.

Bèrto

On 15 March 2012 12:37, Richard Harley rich...@scholarpack.com wrote:
> Thanks for a quick reply. The server has 6 cores, 6GB ram and top gets to
> 2.3-2.5 load average when running the dumpall. So I assume we are nowhere
> near this causing performance issues for users?
> Thanks
> Rich

--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.
[GENERAL] WHERE IN (subselect) versus WHERE IN (1,2,3,)
My apologies - I'm sure this question has been asked before, but I couldn't find anything on the list that meant anything to me.

We have a table "contexts" with 1.6 million rows, and a table "articles" with 1.4 million rows, where an article is a particular kind of context. We want to select from a join on those two tables like this:

    SELECT COUNT(*)
    FROM contexts
    JOIN articles ON (articles.context_key = contexts.context_key)
    WHERE contexts.context_key IN (...);
    /* and some combination of columns from articles and contexts */

If IN (...) is a subquery, then this guy does a seq scan on the contexts table, even if the subquery is "select col_a from kgtest" where kgtest has one row. If however I read the ids beforehand and write them into the query, a la IN (111,222,333...), then everything is happy, up to at least 20,000 values written into the SQL, at which point smaller machines will take 2-5 minutes to parse the query.

I can certainly write the ids inline into the SQL, but when I do that I get the distinct impression that I'm Doing It Wrong. Is this expected behavior? It seems surprising to me.
To demonstrate:

    /* nothing up my sleeve */
    # select * from kgtest;
      cola
    ---------
     1652729
    (1 row)

    /* inline, good query plan */
    # explain (analyze, buffers) select count(*) from contexts JOIN articles ON (articles.context_key=contexts.context_key) where contexts.context_key in (1652729);
                                       QUERY PLAN
    ------------------------------------------------------------------------------
     Aggregate  (cost=3.82..3.83 rows=1 width=0) (actual time=0.188..0.189 rows=1 loops=1)
       Buffers: shared hit=7
       ->  Nested Loop  (cost=0.00..3.81 rows=1 width=0) (actual time=0.181..0.181 rows=0 loops=1)
             Buffers: shared hit=7
             ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.90 rows=1 width=4) (actual time=0.109..0.112 ro
                   Index Cond: (context_key = 1652729)
                   Buffers: shared hit=4
             ->  Index Scan using articles_pkey on articles  (cost=0.00..1.90 rows=1 width=4) (actual time=0.060..0.060 ro
                   Index Cond: (articles.context_key = 1652729)
                   Buffers: shared hit=3
     Total runtime: 0.324 ms
    (11 rows)

    /* subselect, query plan does seq scan on contexts */
    # explain (analyze, buffers) select count(*) from contexts JOIN articles ON (articles.context_key=contexts.context_key) where contexts.context_key in (select cola from kgtest);
                                       QUERY PLAN
    ------------------------------------------------------------------------------
     Aggregate  (cost=118505.72..118505.73 rows=1 width=0) (actual time=0.274..0.275 rows=1 loops=1)
       Buffers: shared hit=5
       ->  Hash Join  (cost=12512.61..116661.91 rows=737524 width=0) (actual time=0.269..0.269 rows=0 loops=1)
             Hash Cond: (contexts.context_key = articles.context_key)
             Buffers: shared hit=5
             ->  Seq Scan on contexts  (cost=0.00..64533.03 rows=1648203 width=4) (actual time=0.009..0.009 rows=1 loops=1
                   Buffers: shared hit=1
             ->  Hash  (cost=412.56..412.56 rows=737524 width=8) (actual time=0.110..0.110 rows=0 loops=1)
                   Buckets: 4096  Batches: 32  Memory Usage: 0kB
                   Buffers: shared hit=4
                   ->  Nested Loop  (cost=40.00..412.56 rows=737524 width=8) (actual time=0.107..0.107 rows=0 loops=1)
                         Buffers: shared hit=4
                         ->  HashAggregate  (cost=40.00..42.00 rows=200 width=4) (actual time=0.069..0.071 rows=1 loops=1)
                               Buffers: shared hit=1
                               ->  Seq Scan on kgtest  (cost=0.00..34.00 rows=2400 width=4) (actual time=0.048..0.050 rows
                                     Buffers: shared hit=1
                         ->  Index Scan using articles_pkey on articles  (cost=0.00..1.84 rows=1 width=4) (actual time=0.0
                               Index Cond: (articles.context_key = kgtest.cola)
                               Buffers: shared hit=3
     Total runtime: 0.442 ms

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
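One thing visible in the slow plan is that the planner guesses 200 distinct values (HashAggregate) and 2400 rows (Seq Scan) for kgtest, the defaults for a table with no statistics, even though the table holds a single row. A hedged sketch of what could be tried first (an educated guess, not a confirmed fix):

```sql
-- Refresh statistics on the one-row table so the IN (subselect)
-- is estimated realistically, then re-check the plan.
ANALYZE kgtest;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM contexts
JOIN articles ON articles.context_key = contexts.context_key
WHERE contexts.context_key IN (SELECT cola FROM kgtest);
```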
[GENERAL] Fetch from cursor with indexed sorting
Hello,

I have a CURSOR for the query SELECT a FROM table ORDER BY xyz, where the table is extremely large and xyz is a hard-to-evaluate but indexable expression (using GiST).

As far as I understand, for a SELECT a FROM table ORDER BY xyz LIMIT X query postgres does not fetch the entire index or table, but uses the "next" operation on the index, fetching tuples one by one until X tuples are fetched. This means that such a query's cost depends on the size of the output data, not on the size of the table and index. Correct me if this is not true.

In fact, I have a more complicated case. I do not know the value of X. I need to create a cursor, open it, fetch records one by one and aggregate. Then on some condition I stop fetching and return the aggregated value. In most cases my procedure scans only a small part of the table. But does postgres understand such usage of indexes and cursors? Will it really use the index and fetch only a small amount of data, or will it read the entire table into memory, creating an in-memory copy of the sorted data? There are two troubles with that - the table is large and the expression in ORDER BY is expensive to evaluate.

--
Regards,
Andrey
Re: [GENERAL] Backups
On Thu, Mar 15, 2012 at 6:15 AM, Richard Harley rich...@scholarpack.com wrote:
> Hello all
> Very simple question - does pg_dump/dumpall hit the server in terms of
> database performance? We currently do nightly backups and I want to move to
> hourly backups but not at the expense of hogging all the resources for 5
> mins. Pg_dumpall is currently producing a 1GB file - that's the combined
> size of around 30 databases and it takes around 5 mins to run.

I'd suggest you look at using PITR backups.
Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict
John R Pierce pie...@hogranch.com writes:
> CONFIGURE = '--disable-rpath' '--prefix=/usr/pgsql-9.1' ... muh oh. why is that?!?

It's distro policy on all Red Hat-based systems (and probably others as well) not to use rpath; you're supposed to use ldconfig instead. It's thus totally unsurprising that removing a package's /etc/ld.so.conf.d file breaks it.

However, it's not clear to me that this policy can support multiple packages expecting to install different libraries with the same soname and major version. I'm not enough of a packaging weenie to know if there's a standard workaround for that. Devrim might have to resort to something like shipping the latest (9.1.x) version of libpq.so in all of his packages.

regards, tom lane
Re: [GENERAL] Fetch from cursor with indexed sorting
Andrey Chursin an...@danasoft.ws writes:
> I need to create cursor, open it, fetch records one-by-one and aggregate.
> Then on some condition I stop fetching and return aggregated value. In fact
> in most cases my procedure scans a little part of table. But does postgres
> understand such usage of index and cursors?

Not unless you tell it. You may need to reduce the value of cursor_tuple_fraction to inform the planner that you're only expecting to fetch a small part of the cursor's theoretical output.

regards, tom lane
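The setting can be changed just for the transaction that uses the cursor; in this sketch the 0.01 value and the table/column names are illustrative stand-ins, not recommendations:

```sql
BEGIN;
-- Tell the planner we expect to fetch only ~1% of the cursor's output,
-- so it prefers a fast-start (index-ordered) plan.
SET LOCAL cursor_tuple_fraction = 0.01;

DECLARE c CURSOR FOR
    SELECT a FROM big_table ORDER BY xyz;   -- big_table/xyz stand in for the real names

FETCH 100 FROM c;   -- the fetch-and-aggregate loop would go here
CLOSE c;
COMMIT;
```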
Re: [GENERAL] Backups
Richard Harley wrote:
> Very simple question - does pg_dump/dumpall hit the server in terms of
> database performance? We currently do nightly backups and I want to move to
> hourly backups but not at the expense of hogging all the resources for 5
> mins. Pg_dumpall is currently producing a 1GB file - that's the combined
> size of around 30 databases and it takes around 5 mins to run.

pg_dump will be a performance hit, because it consumes disk I/O capacity. You could measure how the system is affected by your current backup.

On the other hand, instead of doing an hourly pg_dump, it might be preferable to do a filesystem backup and PITR. That way you have to do a backup only once a day (or week, depending on how much traffic you have and how fast you have to restore) and can still recover to an arbitrary point in time.

Yours,
Laurenz Albe
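For reference, a minimal WAL-archiving setup of the kind Laurenz describes might look like this on a 9.x server; the archive path is a placeholder, and archive_command in particular should be adapted and tested for your environment:

```
# postgresql.conf -- enable WAL archiving for PITR
wal_level = archive           # 9.0+; not needed on 8.4, which has no wal_level
archive_mode = on
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```

A base backup is then taken once a day or once a week (with pg_basebackup on 9.1+, or pg_start_backup()/filesystem copy on older releases), and the archived WAL allows recovery to any point after it.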
[GENERAL] Using copy with a file containing blank rows
Hi All,

I am trying to use COPY to import postgresql logs into a postgresql database for further review and sorting. The problem I'm encountering happens when COPY hits a blank row:

    development=# COPY log
    development-# FROM 'D:/PostgreSQL/9.1/pg_log/postgresql-2012-03-15_093730.log'
    development-# WITH DELIMITER '^' NULL AS '';
    ERROR:  missing data for column ip_address
    CONTEXT:  COPY log, line 2:

Line 2 is a blank line. Is there any way to use COPY with blank lines, or do the blank lines have to be removed before using COPY?

Thanks,
George
Re: [GENERAL] Did xpath_table get dropped.
On 03/14/2012 09:49 PM, Rob Sargent wrote:
> On 03/14/2012 08:57 PM, Tom Lane wrote:
>> Rob Sargent robjsarg...@gmail.com writes:
>>> I don't see anything to that effect in the release notes I've looked at.
>> That's cause it's still there. Did you remember to build/install
>> contrib/xml2? The xpath functions you're listing are all core, I think.
>> regards, tom lane
> Thanks. I blew it on the configure then. Arrrg. Now off to find the
> contribs. Cheers.

For closure, and for any poor slob in the same predicament: I carefully put my big-boy pants back on and

    cd postgres-9.1.2/contrib/xml2
    make install

then in psql:

    create extension xml2;

Joy
Re: [GENERAL] Backups
Perhaps a RAM disk could be considered in the equation.

On Thu, 2012-03-15 at 16:30 +0100, Albe Laurenz wrote:
> Richard Harley wrote:
>> Very simple question - does pg_dump/dumpall hit the server in terms of
>> database performance? We currently do nightly backups and I want to move
>> to hourly backups but not at the expense of hogging all the resources for
>> 5 mins. Pg_dumpall is currently producing a 1GB file - that's the combined
>> size of around 30 databases and it takes around 5 mins to run.
>
> pg_dump will be a performance hit, because it consumes disk I/O capacity.
> You could measure how the system is affected by your current backup.
>
> On the other hand, instead of doing an hourly pg_dump, it might be
> preferable to do a filesystem backup and PITR. That way you have to do a
> backup only once a day (or week, depends how much traffic you have and how
> fast you have to restore) and can still recover to an arbitrary point in
> time.
>
> Yours,
> Laurenz Albe
[GENERAL] how to measure wal_buffer usage
After reading this interesting article on shared_buffers and wal_buffers: http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html it got me wondering if my settings were ideal.

Is there some way to measure wal_buffers usage in real time, so that I could simply monitor it for some period of time, and then come up with a way of determining if the current setting is sufficient? I tried googling, but every reference that I've found simply defaults to the trial-and-error approach to performance tuning.
Re: [GENERAL] Using copy with a file containing blank rows
On 03/15/2012 09:17 AM, George Weaver wrote:
> Hi All,
> I am trying to use COPY to import postgresql logs into a postgresql
> database for further review and sorting.

Are you using the CSV format to log to the Postgres log?
http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG

> The problem I'm encountering happens when COPY hits a blank row:
>
>     development=# COPY log
>     development-# FROM 'D:/PostgreSQL/9.1/pg_log/postgresql-2012-03-15_093730.log'
>     development-# WITH DELIMITER '^' NULL AS '';
>     ERROR:  missing data for column ip_address
>     CONTEXT:  COPY log, line 2:
>
> Line 2 is a blank line. Is there any way to use COPY with blank lines or do
> the blank lines have to removed before using COPY?

AFAIK, you have to remove blank lines.

--
Adrian Klaver
adrian.kla...@gmail.com
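If the log can't be regenerated without blank lines, stripping them beforehand is a one-liner; the file names here are made up for illustration:

```shell
# Build a small sample log with a blank line in the middle.
printf 'line one\n\nline two\n' > /tmp/with_blanks.log

# Delete empty lines before handing the file to COPY
# (grep -v '^$' would do the same job).
sed '/^$/d' /tmp/with_blanks.log > /tmp/no_blanks.log

cat /tmp/no_blanks.log
```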
[GENERAL] How to perform full text search
User can enter any number of words as a search string. In the shopping cart the following query is used to find products, e.g. if "red cat" is entered:

    select * from products
    where productname ilike '%'||'red cat'||'%'
       or productdescription ilike '%'||'red cat'||'%'
    limit 100

This does not find products like "red or black cat". How can I change this query so that it returns the 100 best matches for the given search string? I read the documentation about full text search but haven't found a step-by-step solution for this.

Using PostgreSQL 8.2.7 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

Should I install some contribs to 8.2, or is it better to upgrade the server? Where can I find step-by-step instructions for making this work?

Andrus.
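For what it's worth, a sketch of how this looks with the built-in text search that arrived in core in 8.3 (on 8.2 the tsearch2 contrib module provides similar functionality under different names, so upgrading is likely the easier path); the column names are taken from the question, everything else is illustrative:

```sql
-- Match all words of the search string against name + description,
-- rank the matches, and return the best 100.
SELECT *,
       ts_rank(to_tsvector('english', coalesce(productname,'') || ' ' ||
                                      coalesce(productdescription,'')),
               plainto_tsquery('english', 'red cat')) AS rank
FROM products
WHERE to_tsvector('english', coalesce(productname,'') || ' ' ||
                             coalesce(productdescription,''))
      @@ plainto_tsquery('english', 'red cat')
ORDER BY rank DESC
LIMIT 100;

-- An expression index keeps the WHERE clause fast:
CREATE INDEX products_fts_idx ON products USING gin (
    to_tsvector('english', coalesce(productname,'') || ' ' ||
                           coalesce(productdescription,'')));
```

Unlike ILIKE '%...%', plainto_tsquery ANDs the individual words together, so "red cat" matches a row whose text contains both words anywhere, e.g. "red or black cat".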
Re: [GENERAL] psql latex and newlines
Excerpts from Wim Bertels's message of jue feb 23 12:46:29 -0300 2012:
> the problem arises when you have a field value that contains a newline
> character; when this field is not the first column, all the data after this
> newline ends up in the first column. You can try this out by writing a
> function or table and then adding 'enters' or newlines in the COMMENT on
> that function or table. The option I came across is the use of \pbox, so
> the suggestion would be to put every field/cell inside a pbox.

Did you come up with a patch for this?

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict
On 03/15/12 7:27 AM, Tom Lane wrote:
> It's distro policy on all Red Hat-based systems (and probably others as
> well) not to use rpath; you're supposed to use ldconfig instead. It's thus
> totally unsurprising that removing a package's /etc/ld.so.conf.d file
> breaks it.

ouch. I didn't realize that. -1 to RH for that policy. -rpath is *EXACTLY* the right solution for this sort of problem; I've always *HATED* stuff that relies on ld.so or equivalent, or even worse, LD_LIBRARY_PATH. sigh.

> However, it's not clear to me that this policy can support multiple
> packages expecting to install different libraries with the same soname and
> major version. I'm not enough of a packaging weenie to know if there's a
> standard workaround for that. Devrim might have to resort to something like
> shipping the latest (9.1.x) version of libpq.so in all of his packages.

maybe the postgresql91-libs should supersede the ld.so.conf.d file from previous versions so there would be only one such .conf file, pointing everyone at the latest installed version. otoh, if you removed 9.1, I don't know how that would be able to put a previous version back, ugh.

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
[GENERAL] undo update
Hi all.

Today an accident happened on one of my databases. I have a table named "payments" with about 5400 rows. I ran the query

    update payments set amount = 0; where id in (2354,2353,1232)

Please note the semicolon inside — I missed it =( Now all my data is lost. And after this happened I realized that the backup script is broken (my fault, I know).

I googled a little and found that there is a chance to restore my data using the files in the pg_xlog directory. But everybody talks about PITR, and I don't use it. There is also an xlogviewer project (from 2006) that I'm trying to install on my Gentoo right now.

I copied the whole PGDATA directory and made a dump of all databases. I also turned off my webserver. Postgres is still running.

Please give me a step-by-step guide: what should I do next? Is there any chance to restore my data? I use postgresql 8.4 with the default config (autovacuum is commented out).

--
__
Yours sincerely, Ivan Kuznetsov aka Kuzma
mailto: kuzma...@gmail.com
Re: [GENERAL] undo update
On Thu, Mar 15, 2012 at 8:22 AM, Ivan kuzma...@gmail.com wrote:
> Hi all. Today an accident happened on one of my databases. I have a table
> named payments with about 5400 rows. I have done a query "update payments
> set amount = 0; where id in (2354,2353,1232)". Please note the semicolon
> inside — I missed it =( Now all my data is lost. And after this happened I
> realized that backups script is broken (my fault, I know)
> I googled a little and have found that there is a chance to restore my
> data using files in pg_xlog directory. But everybody say about PITR and I
> don't use it. Also there is a xlogviewer project (from old 2006) that I'm
> trying to install on my Gentoo right now. I copied all PGDATA directory and
> made a dump of all databases. Also I turned off my webserver. Postgres is
> still running. Please give me some step-by step guide what should I do
> next? Is there any chance to restore my data? I use postgresql 8.4 with
> default config (autovacuum is commented)

PITR can't help you after the fact if you don't have a base backup and archives of the pg_xlog dir etc. You might be able to pg_resetxlog to make the old rows visible, but I'm no expert on doing that.
[GENERAL] Indexing MS/Open Office and PDF documents
Hi,

We are looking to use Postgres 9 for document storage and would like to take advantage of the full text search capabilities. We are having a hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching. Any advice would be appreciated.

Regards,
-Alex
Re: [GENERAL] undo update
I have installed xlogviewer and it gives me data like that: [cur:0/5770E87C, xid:355075, rmid:10(Heap), len:88/116, prev:0/5770E840] update: s/d/r:1663/90693/107093 block 1 off 36 to block 107 off 30 [cur:0/5770E8F0, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770E87C] insert_leaf: s/d/r:1663/90693/107099 tid 20/101 [cur:0/5770E930, xid:355075, rmid:11(Btree), len:38/66, prev:0/5770E8F0] insert_leaf: s/d/r:1663/90693/107100 tid 42/146 [cur:0/5770E974, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E930] insert_leaf: s/d/r:1663/90693/107101 tid 21/97 [cur:0/5770E9B0, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E974] insert_leaf: s/d/r:1663/90693/107102 tid 28/7 [cur:0/5770E9EC, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E9B0] insert_leaf: s/d/r:1663/90693/107103 tid 33/2 [cur:0/5770EA28, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E9EC] insert_leaf: s/d/r:1663/90693/107104 tid 18/232 [cur:0/5770EA64, xid:355075, rmid:11(Btree), len:54/82, prev:0/5770EA28] insert_leaf: s/d/r:1663/90693/107105 tid 46/109 [cur:0/5770EAB8, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EA64] insert_leaf: s/d/r:1663/90693/107106 tid 17/99 [cur:0/5770EAF4, xid:355075, rmid:10(Heap), len:84/112, prev:0/5770EAB8] update: s/d/r:1663/90693/107093 block 1 off 37 to block 107 off 31 [cur:0/5770EB64, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EAF4] insert_leaf: s/d/r:1663/90693/107099 tid 20/143 [cur:0/5770EBA4, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EB64] insert_leaf: s/d/r:1663/90693/107100 tid 30/80 [cur:0/5770EBE4, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EBA4] insert_leaf: s/d/r:1663/90693/107101 tid 21/132 [cur:0/5770EC20, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EBE4] insert_leaf: s/d/r:1663/90693/107102 tid 28/7 [cur:0/5770EC5C, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EC20] insert_leaf: s/d/r:1663/90693/107103 tid 33/2 [cur:0/5770EC98, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EC5C] insert_leaf: 
s/d/r:1663/90693/107104 tid 18/232 [cur:0/5770ECD4, xid:355075, rmid:11(Btree), len:50/78, prev:0/5770EC98] insert_leaf: s/d/r:1663/90693/107105 tid 40/100 [cur:0/5770ED24, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770ECD4] insert_leaf: s/d/r:1663/90693/107106 tid 30/137 [cur:0/5770ED60, xid:355075, rmid:10(Heap), len:84/112, prev:0/5770ED24] update: s/d/r:1663/90693/107093 block 1 off 38 to block 107 off 32 [cur:0/5770EDD0, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770ED60] insert_leaf: s/d/r:1663/90693/107099 tid 20/187 [cur:0/5770EE10, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EDD0] insert_leaf: s/d/r:1663/90693/107100 tid 31/43 [cur:0/5770EE50, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE10] insert_leaf: s/d/r:1663/90693/107101 tid 21/152 [cur:0/5770EE8C, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE50] insert_leaf: s/d/r:1663/90693/107102 tid 28/7 [cur:0/5770EEC8, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE8C] insert_leaf: s/d/r:1663/90693/107103 tid 33/2 [cur:0/5770EF04, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EEC8] insert_leaf: s/d/r:1663/90693/107104 tid 18/232 [cur:0/5770EF40, xid:355075, rmid:11(Btree), len:50/78, prev:0/5770EF04] insert_leaf: s/d/r:1663/90693/107105 tid 56/107 [cur:0/5770EF90, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EF40] insert_leaf: s/d/r:1663/90693/107106 tid 18/28 Is there any way to use this data for undo? On Thu, Mar 15, 2012 at 4:22 PM, Ivan kuzma...@gmail.com wrote: Hi all. Today an accident happened on one of my databases. I have a table named payments with about 5400 rows. I have done a query update payments set amount = 0; where id in (2354,2353,1232). Please note the semicolon inside — I missed it =( Now all my data is lost. And after this happened I realized that backups script is broken (my fault, I know) I googled a little and have found that there is a chance to restore my data using files in pg_xlog directory. But everybody say about PITR and I don't use it. 
Also there is the xlogviewer project (from 2006) that I'm trying to install on my Gentoo right now. I copied the whole PGDATA directory and made a dump of all databases. I also turned off my webserver. Postgres is still running. Please give me some step-by-step guide: what should I do next? Is there any chance to restore my data? I use PostgreSQL 8.4 with the default config (autovacuum is commented out). -- __ Yours sincerely, Ivan Kuznetsov aka Kuzma mailto: kuzma...@gmail.com
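For readers trying to make sense of such dumps: the records above are regular enough to parse mechanically, e.g. to tally which relfilenodes a given transaction touched before attempting any recovery. A minimal sketch (the record layout is assumed from the sample output above, not taken from the xlogviewer source):

```python
import re
from collections import Counter

# One xlogviewer record looks like:
# [cur:0/5770E87C, xid:355075, rmid:10(Heap), len:88/116, prev:0/5770E840] update: s/d/r:1663/90693/107093 ...
RECORD_RE = re.compile(
    r"\[cur:\S+, xid:(?P<xid>\d+), rmid:\d+\((?P<rm>\w+)\), len:\d+/\d+, prev:\S+\]"
    r" (?P<op>\w+): s/d/r:\d+/\d+/(?P<relfilenode>\d+)"
)

def summarize(log_text):
    """Tally operations per (resource manager, relfilenode), grouped by xid."""
    summary = {}
    for m in RECORD_RE.finditer(log_text):
        per_xid = summary.setdefault(int(m.group("xid")), Counter())
        per_xid[(m.group("rm"), int(m.group("relfilenode")))] += 1
    return summary
```

In the output above, the rmid 10 (Heap) records are the row updates themselves; the Btree records are only the index insertions that accompanied them.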
Re: [GENERAL] Indexing MS/Open Office and PDF documents
On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com wrote: Hi, We are looking to use Postgres 9 for document storage and would like to take advantage of the full text search capabilities. We are having a hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching. Any advice would be appreciated. The first step is to find a library that can parse such documents, or convert them to a format that can be parsed. After you do that, PostgreSQL allows you to load arbitrary code as functions (in various languages), so that will allow you to make use of the library. It's hard to give more specific advice until you've found the library you'd like to work with. Regards, Jeff Davis -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Indexing MS/Open Office and PDF documents
On 15/03/12 21:12, Jeff Davis wrote: On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com wrote: We are having a hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching. The first step is to find a library that can parse such documents, or convert them to a format that can be parsed. I've used docx2txt and pdf2txt and friends to produce text files that I then index during the import process. An external script runs the whole process. All I cared about was extracting raw text, though; this does nothing to identify headings etc. -- Richard Huxton Archonet Ltd
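For the raw-text-only approach Richard describes, modern .docx files don't even strictly need an external tool: a .docx is a ZIP archive whose body text lives in word/document.xml. A hedged, stdlib-only sketch (it ignores headers, footers, and embedded objects, and the tag-stripping is deliberately naive):

```python
import re
import zipfile

def docx_to_text(docx_path_or_file):
    """Pull raw text out of a .docx for full-text indexing."""
    with zipfile.ZipFile(docx_path_or_file) as z:
        xml = z.read("word/document.xml").decode("utf-8")
    xml = xml.replace("</w:p>", "\n")   # end of a paragraph -> newline
    return re.sub(r"<[^>]+>", "", xml)  # strip every remaining tag
```

The resulting string can then be fed to to_tsvector() during the import step, much like the output of docx2txt.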
Re: [GENERAL] Indexing MS/Open Office and PDF documents
On Thu, Mar 15, 2012 at 4:12 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com wrote: Hi, We are looking to use Postgres 9 for document storage and would like to take advantage of the full text search capabilities. We are having a hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching. Any advice would be appreciated. The first step is to find a library that can parse such documents, or convert them to a format that can be parsed. I don't know about MS-Office document parsing, but PoDoFo (a PDF parsing library) can extract text from PDFs. Every now and then someone posts to the PoDoFo mailing list with questions related to extracting text for the purpose of indexing it in an FTS-capable database. PoDoFo has excellent developer support. The maintainer is quick to accept patches, verify bugs, add features, etc. Disclaimer: I'm neither a PDF nor a PoDoFo expert. I can't help you accomplish what you want.
Re: [GENERAL] undo update
On 03/15/2012 07:22 AM, Ivan wrote: Hi all. Today an accident happened on one of my databases. I have a table named payments with about 5400 rows. I ran the query: update payments set amount = 0; where id in (2354,2353,1232). Please note the semicolon inside -- I missed it =( Now all my data is lost. And after this happened I realized that the backup script is broken (my fault, I know). I googled a little and found that there is a chance to restore my data using the files in the pg_xlog directory. But everybody talks about PITR, and I don't use it. Also there is the xlogviewer project (from 2006) that I'm trying to install on my Gentoo right now. I copied the whole PGDATA directory and made a dump of all databases. I also turned off my webserver. Postgres is still running. I would first stop PostgreSQL and *then* copy your PGDATA directory. Given how PostgreSQL handles updates in an MVCC-safe way, there is a reasonable possibility that the data is still contained somewhere in the file(s) associated with that table, as long as you don't cause it to be overwritten by a CLUSTER, VACUUM FULL, or VACUUM followed by more updates. However, I cannot speak to the steps or difficulty involved in recovering it. Cheers, Steve
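If you want to verify that the old row versions are in fact still on disk before attempting anything further, the pageinspect contrib module can show them (on 8.4 it is installed by running contrib/pageinspect.sql as superuser; do this against the copy, not production). A rough sketch -- the table name comes from the thread, and page number 0 is just an example:

```sql
-- Superseded (dead) row versions carry a non-zero t_xmax:
-- the xid of the UPDATE that replaced them.
SELECT lp, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('payments', 0))
WHERE t_xmax <> 0;
```

If rows with the offending xid in t_xmax are present, the old tuple data has not been vacuumed away yet; actually decoding the tuple payloads back into column values is a separate, much harder exercise.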
[GENERAL] Commit hits time-out before CommandTimeout
Hello, I'm not sure if it is an Npgsql or a PostgreSQL issue that we ran into at the office. I have also posted this on the Npgsql forum (see http://pgfoundry.org/forum/forum.php?thread_id=11381&forum_id=519). For a project we have a Windows service which synchronizes an online database from a supplier with a local database of a client. The queries against the local database are executed within a transaction block. At the end of the sync procedure we execute a commit on the transaction. This commit can take a few minutes to finish, so we set the CommandTimeout high enough, and this works. However, for one client we have set the CommandTimeout to 3600 seconds (1 hour), but the commit hits a timeout each time after 1410 seconds (23 minutes and 30 seconds). What can be causing this premature timeout? The version of PostgreSQL used is 8.2.9 and the version of Npgsql is 2.0.11. Thanks in advance for the help! Ben
Re: [GENERAL] Temporal foreign keys
On Fri, 2012-02-03 at 07:58 +0100, Matthias wrote: Hey, how can I implement temporal foreign keys with PostgreSQL? Is writing triggers the only way to enforce temporal referential integrity currently? Yes, currently that's the only way. Look at CREATE CONSTRAINT TRIGGER. Regards, Jeff Davis
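To make CREATE CONSTRAINT TRIGGER concrete, here is one half of such a check as a rough sketch against a hypothetical schema (rooms with a validity period, and reservations that must fall inside it); all names are invented, and a complete implementation would also need triggers on the referenced table to catch UPDATEs and DELETEs that orphan existing references:

```sql
CREATE FUNCTION check_reservation_room() RETURNS trigger AS $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM rooms r
                 WHERE r.room_id = NEW.room_id
                   AND r.valid_from <= NEW.res_from
                   AND r.valid_to   >= NEW.res_to) THEN
    RAISE EXCEPTION 'room % has no validity period covering the reservation', NEW.room_id;
  END IF;
  RETURN NULL;  -- return value of AFTER row triggers is ignored
END $$ LANGUAGE plpgsql;

-- Deferrable, so multi-statement transactions can be checked at commit.
CREATE CONSTRAINT TRIGGER reservations_room_fk
  AFTER INSERT OR UPDATE ON reservations
  DEFERRABLE INITIALLY DEFERRED
  FOR EACH ROW EXECUTE PROCEDURE check_reservation_room();
```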
Re: [GENERAL] A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?
On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin atsaloli.t...@gmail.com wrote: Is there any way to consolidate the pages on the slave without taking replication offline? maybe CLUSTER?

filip@dev=> create table foobar (id serial primary key, load text);
CREATE TABLE
filip@dev=> insert into foobar(load) select md5(random()::text) from generate_series(1,100000);
INSERT 0 100000
filip@dev=> delete from foobar where id % 4 <> 3; -- delete 75% of the table
DELETE 75000
filip@dev=> select pg_relation_size('foobar');
 pg_relation_size
------------------
          6832128
(1 row)
filip@dev=> CLUSTER foobar using foobar_pkey;
CLUSTER
filip@dev=> select pg_relation_size('foobar');
 pg_relation_size
------------------
          1712128
(1 row)

Of course, events destined for this table will be queued by Slony while the table is locked. Filip
Re: [GENERAL] Indexing MS/Open Office and PDF documents
Word documents can be processed by AbiWord, which converts MS Word documents into HTML, LaTeX, PostScript, or plain-text formats with very simple commands; I guess it also exposes some API which can be integrated into document parsers/indexers. Spreadsheets can be processed using the ExcelFormat library http://www.codeproject.com/Articles/42504/ExcelFormat-Library or the BasicExcel library http://www.codeproject.com/Articles/13852/BasicExcel-A-Class-to-Read-and-Write-to-Microsoft Or even the GNU Gnumeric project has some API to process spreadsheets, which can be used to extract text and index it. Code to extract text from PDF: http://www.codeproject.com/Articles/7056/Code-to-extract-plain-text-from-a-PDF-file Overall, I guess there are bits and pieces available over the internet, and some dedicated effort is needed to assemble those and develop them into a finished product, namely a document indexer. Wish you success! On Fri, Mar 16, 2012 at 2:51 AM, dennis jenkins dennis.jenkins...@gmail.com wrote: I don't know about MS-Office document parsing, but PoDoFo (a PDF parsing library) can extract text from PDFs.
Re: [GENERAL] Optimise PostgreSQL for fast testing
Hi all, Just a follow-up. I found the biggest bottleneck and now my specs run as fast as the SQLite ones. TL;DR - the issue was the database cleanup that did the truncation. Apparently SQLite is way too fast there. To fix it I open a transaction before each test and roll it back at the end. Some numbers for ~700 tests:
- Truncation: SQLite - 34s, PG - 76s.
- Transaction: SQLite - 17s, PG - 18s.
2x speed increase for SQLite. 4x speed increase for PG. Hope that'll help some of you. Cheers, Dmytrii http://ApproachE.com On 27/02/2012, at 10:57 AM, Dmytrii Nagirniak wrote: Hi Guys, Sorry for the late reply. Thanks to all of you for the help. Appreciate all your suggestions. So far (with my pretty limited knowledge of PG) I could speed it up a little bit (~20% or so comparing to the original installation) only by tweaking the settings. I think it is relatively good keeping in mind that not a single line of code has been changed. Just my quick summary. Not interested in query tuning for now, just the DB tweaking:
- The best perf optimisation was `fsync=off`.
- Parallelisation should be considered as the 2nd option after `fsync=off`.
- All further optimisations might not be worth the effort unless you know PG well.
- A RAM disk didn't improve perf much at all.
- As Craig Ringer replied to my question at SO, the book PostgreSQL 9.0 High Performance is worth reading.
- PG has awesome documentation, including perf-related: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
So far this is my approach: Since SQLite has basic FTS support (which I totally missed; thanks for pointing that out!) I can go a long way with it and probably won't need PG soon. But when I do:
- Run most of the specs against SQLite.
- Only run specs that rely on PG features against PG (which should be a minority).
- Run full acceptance tests (Cucumber) against a production DB (be it SQLite or PG).
- Will parallelise both unit and acceptance tests in the future.
Thanks a lot to all of you guys.
Your suggestions, criticism and discussion were really healthy, helpful and to the point. Cheers, Dmytrii http://www.ApproachE.com On 24/02/2012, at 9:25 PM, Simon Riggs wrote: On Fri, Feb 24, 2012 at 12:16 AM, Dmytrii Nagirniak dna...@gmail.com wrote: That's totally fine if PG can't beat SQLite on speed in **this particular case**. The point is that PG can beat SQLite in this test *easily* if you choose to use the main architectural difference as an advantage: running tests concurrently. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
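Dmytrii's transaction-per-test trick is framework-agnostic; here is a minimal sketch of the pattern using Python's stdlib sqlite3 (the same shape works against PostgreSQL through any driver -- table and function names are invented for illustration):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def rollback_test(conn):
    """Run a test inside a transaction that is always rolled back,
    so the fixture data stays pristine for the next test."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        conn.rollback()

# Demo: fixture data survives, per-test writes do not.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transaction control
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO payments VALUES (1, 100)")

with rollback_test(conn):
    conn.execute("UPDATE payments SET amount = 0")  # the test's write

assert conn.execute("SELECT amount FROM payments").fetchone() == (100,)
```

Rolling back is cheap because nothing has to be rewritten or truncated, which is exactly why it beat TRUNCATE-based cleanup in the numbers above.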