Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict

2012-03-15 Thread Devrim GÜNDÜZ

Hi John,

On Wed, 2012-03-14 at 21:54 -0700, John R Pierce wrote:
 
 So I have a CentOS 6.2 x86_64 system that was running postgres 9.0
 from  Devrim's yum repo... 

It is now a community repo ;)

 I installed 9.1 and was going to do a parallel  upgrade, but the entry
 for 9.0 in  /etc/ld.so.conf.d/postgresql-9.0-libs.conf  was
 causing  /usr/pgsql-9.1/bin/psql to load the wrong libpq, 
 /usr/pgsql-9.0/lib/libpq.so.5 which was triggering the error...
 
 psql: invalid connection option "client_encoding"

I have created a ticket for that, but could not close it yet. You can see
the progress here:

http://wiki.pgrpms.org/ticket/77

Regards,
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict

2012-03-15 Thread John R Pierce

On 03/14/12 11:40 PM, Devrim GÜNDÜZ wrote:

I have created a ticket for that, but could not close yet. You can see
the progress from here:

http://wiki.pgrpms.org/ticket/77


I don't see anything but the ticket itself?


anyways, I'm quite sure this is the problem...

# cat /etc/ld.so.conf.d/postgresql-9.0-libs.conf
/usr/pgsql-9.0/lib/

# cat /etc/ld.so.conf.d/postgresql-9.1-libs.conf
/usr/pgsql-9.1/lib/



I'm pretty sure that removing the ld.so.conf.d files will fix this.   
I'm setting up a test in a spare VM now.


k, stock centos 6.2 with 9.0.7...

$ rpm -qa |grep postgres
postgresql90-libs-9.0.7-1PGDG.rhel6.x86_64
postgresql90-server-9.0.7-1PGDG.rhel6.x86_64
postgresql90-9.0.7-1PGDG.rhel6.x86_64
postgresql90-contrib-9.0.7-1PGDG.rhel6.x86_64
postgresql90-devel-9.0.7-1PGDG.rhel6.x86_64

$ ldd /usr/pgsql-9.0/bin/psql | grep libpq
libpq.so.5 => /usr/pgsql-9.0/lib/libpq.so.5 (0x7f340f956000)

so... I install 9.1 ...

# rpm -ivh 
http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-centos91-9.1-4.noarch.rpm
Retrieving 
http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-centos91-9.1-4.noarch.rpm
Preparing...                ########################################### [100%]
   1:pgdg-centos91          ########################################### [100%]


# yum install postgresql91{-server,-contrib}

  Installing : postgresql91-libs-9.1.3-1PGDG.rhel6.x86_64 1/4
  Installing : postgresql91-9.1.3-1PGDG.rhel6.x86_64  2/4
  Installing : postgresql91-server-9.1.3-1PGDG.rhel6.x86_64   3/4
  Installing : postgresql91-contrib-9.1.3-1PGDG.rhel6.x86_64  4/4

Installed:
  postgresql91-contrib.x86_64 0:9.1.3-1PGDG.rhel6
  postgresql91-server.x86_64 0:9.1.3-1PGDG.rhel6

Dependency Installed:
  postgresql91.x86_64 0:9.1.3-1PGDG.rhel6
  postgresql91-libs.x86_64 0:9.1.3-1PGDG.rhel6

Complete!


and

# /usr/pgsql-9.1/bin/psql
psql: invalid connection option "client_encoding"

# ldd /usr/pgsql-9.1/bin/psql |grep libpq
libpq.so.5 => /usr/pgsql-9.0/lib/libpq.so.5 (0x7f989aa2)

oops!  what's 9.0 doing here??

k, so that's the problem.

so... I got evil.   I mv'd /etc/ld.so.conf.d/postgres*.conf to backup.

# ldd /usr/pgsql-9.1/bin/psql |grep libpq
libpq.so.5 => not found

uh oh.   that means they weren't linked with -R or -rpath 
/usr/pgsql-9.1/lib, which they should have been if --prefix 
/usr/pgsql-9.1 was used



OH.

CONFIGURE = '--disable-rpath' '--prefix=/usr/pgsql-9.1' .


uh oh.   why is that?!?




--
john r pierceN 37, W 122
santa cruz ca mid-left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_role vs. pg_shadow or pg_user

2012-03-15 Thread Alexander Reichstadt
The 8.1 version of the documentation explicitly outlined the migration; the 9.1 version 
no longer covers the way things were before 8.1. In the meantime I also found 
http://www.postgresql.org/docs/9.0/interactive/role-membership.html which 
cleared things up exhaustively and by example.
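
For anyone else digging through this, a rough illustration of how the
compatibility views relate (an untested sketch; it assumes a reasonably recent
release, and the output obviously depends on the roles defined in your cluster):

SELECT rolname, rolcanlogin FROM pg_roles;  -- every role, login and non-login
SELECT usename FROM pg_user;                -- login roles only (the old "users")
SELECT groname FROM pg_group;               -- non-login roles only (the old "groups")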

Alex

 

Am 14.03.2012 um 22:52 schrieb Tom Lane:

 Alexander Reichstadt l...@mac.com writes:
 in the documentation of 8.1 the concept of roles is outlined compared
 to users and groups at
 http://www.postgresql.org/docs/8.1/static/user-manag.html.
 
 Um ... why are you reading 8.1 documentation while running 9.1?  There
 are likely to be some obsolete things in there.
 
 I also read today that pg_shadow is the real table containing the
 users as opposed to pg_user, which is only a view and one that never
 displays anything but a masked value for the password. I don't have the link
 where that was,
 
 Wherever it was, it was even more obsolete than the 8.1 docs.
 pg_shadow has been a view (on pg_authid) for quite a while now.
 Try "\d+ pg_shadow" in psql.
 
 The reason this is such a mess is that we've changed the catalog
 representation several times, each time leaving behind a view that
 was meant to emulate the old catalog.  For some time now, pg_authid
 has been the ground truth, but it stores entries for both login and
 non-login roles, which more or less correspond to what used to be
 users and groups.  pg_roles is the only non-protected view that
 shows you all the entries.
 
   regards, tom lane
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] copy in date string 00-00-00 00:00:00

2012-03-15 Thread Martin Gregorie
On Wed, 2012-03-14 at 21:52 -0700, Mark Phillips wrote:
 I am not familiar with sed, except for some trivial bits I nicked off
 the web. Enough to know it works, and to be dangerous. Nonetheless,
 using SED may be the way to go as there are two tables that contain a
 bit over 3,000,000 rows each. 
 
You should also consider using awk/gawk with the field separator (the FS
variable) set to match the one in your input (','). The advantage in
this case is that it can be made to work on specific fields in the CSV
file without touching the rest, something like:

BEGIN { FS = ","; OFS = "," }  # set input and output field separators to comma
$5 == "0000-00-00 00:00:00" { $5 = "" }  # empty field 5 if it matches the zero date
{ print }  # output all lines

Disclaimer: this is untested example code


Martin






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Backups

2012-03-15 Thread Richard Harley

Hello all

Very simple question -  does pg_dump/dumpall hit the server in terms of 
database performance? We currently do nightly backups and I want to move 
to hourly backups but not at the expense of hogging all the resources 
for 5 mins.


Pg_dumpall is currently producing a 1GB file - that's the combined size 
of around 30 databases and it takes around 5 mins to run.


Thanks
Rich


Re: [GENERAL] Backups

2012-03-15 Thread Bèrto ëd Sèra
Hi Richard,

There's no easy answer. If your server has plenty of free resources there
won't be trouble, but I do have customers who cannot even imagine launching
a dump during normal traffic hours. How loaded is your box currently?

Cheerio
Bèrto

On 15 March 2012 12:15, Richard Harley rich...@scholarpack.com wrote:

 **
 Hello all

 Very simple question -  does pg_dump/dumpall hit the server in terms of
 database performance? We currently do nightly backups and I want to move to
 hourly backups but not at the expense of hogging all the resources for 5
 mins.

 Pg_dumpall is currently producing a 1GB file - that's the combined size of
 around 30 databases and it takes around 5 mins to run.

 Thanks
 Rich




-- 
==
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.


Re: [GENERAL] Backups

2012-03-15 Thread Richard Harley
Thanks for the quick reply. The server has 6 cores and 6GB of RAM, and the 
load average in top gets to 2.3-2.5 when running the dumpall. So I assume we are 
nowhere near the point where this causes performance issues for users?

Thanks
Rich




On 15/03/12 12:21, Bèrto ëd Sèra wrote:

Hi Richard,

it's no easy answer. If your server has plenty of free resources there 
won't be trouble, but I do have customers who cannot even imagine of 
launching a dump in normal traffic hours. How loaded is your box, 
currently?


Cheerio
Bèrto

On 15 March 2012 12:15, Richard Harley rich...@scholarpack.com 
mailto:rich...@scholarpack.com wrote:


Hello all

Very simple question -  does pg_dump/dumpall hit the server in
terms of database performance? We currently do nightly backups and
I want to move to hourly backups but not at the expense of hogging
all the resources for 5 mins.

Pg_dumpall is currently producing a 1GB file - that's the combined
size of around 30 databases and it takes around 5 mins to run.

Thanks
Rich




--
==
If Pac-Man had affected us as kids, we'd all be running around in a 
darkened room munching pills and listening to repetitive music.




Re: [GENERAL] Backups

2012-03-15 Thread Bèrto ëd Sèra
Hi

yes, if you get ~2.5 when running it during peak hours, it should definitely
be harmless.

Bèrto

On 15 March 2012 12:37, Richard Harley rich...@scholarpack.com wrote:

 **
 Thanks for a quick reply. The server has 6 cores, 6GB ram and top gets to
 2.3-2.5 load average when running the dumpall. So I assume we are nowhere
 near this causing performance issues for users?
 Thanks
 Rich





 On 15/03/12 12:21, Bèrto ëd Sèra wrote:

 Hi Richard,

  it's no easy answer. If your server has plenty of free resources there
 won't be trouble, but I do have customers who cannot even imagine of
 launching a dump in normal traffic hours. How loaded is your box, currently?

  Cheerio
 Bèrto

 On 15 March 2012 12:15, Richard Harley rich...@scholarpack.com wrote:

  Hello all

 Very simple question -  does pg_dump/dumpall hit the server in terms of
 database performance? We currently do nightly backups and I want to move to
 hourly backups but not at the expense of hogging all the resources for 5
 mins.

 Pg_dumpall is currently producing a 1GB file - that's the combined size of
 around 30 databases and it takes around 5 mins to run.

 Thanks
 Rich




  --
 ==
 If Pac-Man had affected us as kids, we'd all be running around in a
 darkened room munching pills and listening to repetitive music.





-- 
==
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.


[GENERAL] WHERE IN (subselect) versus WHERE IN (1,2,3,)

2012-03-15 Thread Kevin Goess
My apologies, I'm sure this question has been asked before but I couldn't
find anything on the list that meant anything to me.

We have a table contexts with 1.6 million rows, and a table articles
with 1.4 million rows, where an article is a particular kind of
context.  We want to select from a join on those two tables like this

SELECT COUNT(*)
FROM contexts
JOIN articles ON (articles.context_key=contexts.context_key)
WHERE contexts.context_key IN (...);
/* and some combination of columns from articles and contexts */

If the IN (...) is a subquery, then this guy does a seq scan on the contexts
table, even if the subquery is "select cola from kgtest" where kgtest has
one row.  If however I read the ids beforehand and write them into the
query, a la IN (111,222,333...), then everything is happy, up to at
least 20,000 values written into the SQL, at which point smaller machines
will take 2-5 minutes to parse the query.

I can certainly write the ids inline into the SQL, but when I do that I get
the distinct impression that I'm Doing It Wrong.  Is this expected
behavior?  It seems surprising to me.


To demonstrate:

/* nothing up my sleeve */
# select * from kgtest;
  cola
-
 1652729
(1 row)

/* inline, good query plan */
# explain (analyze, buffers) select count(*) from contexts JOIN articles ON
(articles.context_key=contexts.context_key) where contexts.context_key in
(1652729);
 QUERY
PLAN
--
 Aggregate  (cost=3.82..3.83 rows=1 width=0) (actual time=0.188..0.189
rows=1 loops=1)
   Buffers: shared hit=7
   -  Nested Loop  (cost=0.00..3.81 rows=1 width=0) (actual
time=0.181..0.181 rows=0 loops=1)
 Buffers: shared hit=7
 -  Index Scan using contexts_pkey on contexts  (cost=0.00..1.90
rows=1 width=4) (actual time=0.109..0.112 ro
   Index Cond: (context_key = 1652729)
   Buffers: shared hit=4
 -  Index Scan using articles_pkey on articles  (cost=0.00..1.90
rows=1 width=4) (actual time=0.060..0.060 ro
   Index Cond: (articles.context_key = 1652729)
   Buffers: shared hit=3
 Total runtime: 0.324 ms
(11 rows)

  /* subselect, query plan does seq scan on contexts */


# explain (analyze, buffers) select count(*)from contexts JOIN articles ON
(articles.context_key=contexts.context_key) where contexts.context_key in
(select cola from kgtest);
   QUERY
PLAN
--
 Aggregate  (cost=118505.72..118505.73 rows=1 width=0) (actual
time=0.274..0.275 rows=1 loops=1)
   Buffers: shared hit=5
   -  Hash Join  (cost=12512.61..116661.91 rows=737524 width=0) (actual
time=0.269..0.269 rows=0 loops=1)
 Hash Cond: (contexts.context_key = articles.context_key)
 Buffers: shared hit=5
 -  Seq Scan on contexts  (cost=0.00..64533.03 rows=1648203
width=4) (actual time=0.009..0.009 rows=1 loops=1
   Buffers: shared hit=1
 -  Hash  (cost=412.56..412.56 rows=737524 width=8) (actual
time=0.110..0.110 rows=0 loops=1)
   Buckets: 4096  Batches: 32  Memory Usage: 0kB
   Buffers: shared hit=4
   -  Nested Loop  (cost=40.00..412.56 rows=737524 width=8)
(actual time=0.107..0.107 rows=0 loops=1)
 Buffers: shared hit=4
 -  HashAggregate  (cost=40.00..42.00 rows=200
width=4) (actual time=0.069..0.071 rows=1 loops=1)
   Buffers: shared hit=1
   -  Seq Scan on kgtest  (cost=0.00..34.00
rows=2400 width=4) (actual time=0.048..0.050 rows
 Buffers: shared hit=1
 -  Index Scan using articles_pkey on articles
(cost=0.00..1.84 rows=1 width=4) (actual time=0.0
   Index Cond: (articles.context_key = kgtest.cola)
   Buffers: shared hit=3
 Total runtime: 0.442 ms

-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com

510-665-1200 x179
www.bepress.com

bepress: sustainable scholarly publishing


[GENERAL] Fetch from cursor with indexed sorting

2012-03-15 Thread Andrey Chursin
Hello

I have a CURSOR for a query like SELECT a FROM table ORDER BY xyz, where the
table is extremely large and xyz is a hard-to-evaluate but indexable
expression (using GiST).

As far as I understand, for a query like
SELECT a FROM table ORDER BY xyz LIMIT X
Postgres does not fetch the entire index or table, but uses the next
operation on the index, fetching tuples one by one until X tuples are
fetched. This means that such a query's cost depends on the size of the
output data, not on the size of the table and index. Correct me if
this is not true.

In fact, I have a more complicated case: I do not know the value of X.
I need to create a cursor, open it, fetch records one by one and
aggregate them. Then, on some condition, I stop fetching and return the
aggregated value. In most cases my procedure scans only a small
part of the table.

But does Postgres understand such usage of indexes and cursors? Will it
really use the index and fetch only a small amount of data, or will it read
the entire table into memory, creating an in-memory copy of the sorted data?
There are two problems with that: the table is large, and the expression in
ORDER BY is expensive to evaluate.

-- 
Regards,
Andrey

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backups

2012-03-15 Thread Scott Marlowe
On Thu, Mar 15, 2012 at 6:15 AM, Richard Harley rich...@scholarpack.com wrote:
 Hello all

 Very simple question -  does pg_dump/dumpall hit the server in terms of
 database performance? We currently do nightly backups and I want to move to
 hourly backups but not at the expense of hogging all the resources for 5
 mins.

 Pg_dumpall is currently producing a 1GB file - that's the combined size of
 around 30 databases and it takes around 5 mins to run.

I'd suggest you look at using PITR backups.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict

2012-03-15 Thread Tom Lane
John R Pierce pie...@hogranch.com writes:
 CONFIGURE = '--disable-rpath' '--prefix=/usr/pgsql-9.1' .
 muh oh.   why is that?!?

It's distro policy on all Red Hat-based systems (and probably others as
well) not to use rpath; you're supposed to use ldconfig instead.  It's
thus totally unsurprising that removing a package's /etc/ld.so.conf.d
file breaks it.

However, it's not clear to me that this policy can support multiple
packages expecting to install different libraries with the same soname
and major version.  I'm not enough of a packaging weenie to know if
there's a standard workaround for that.  Devrim might have to resort to
something like shipping the latest (9.1.x) version of libpq.so in all of
his packages.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fetch from cursor with indexed sorting

2012-03-15 Thread Tom Lane
Andrey Chursin an...@danasoft.ws writes:
 I need to create cursor, open it, fetch records one-by-one and
 aggregate. Then on some condition I stop fetching and return
 aggregated value. In fact in most cases my procedure scans a little
 part of table.

 But does postgres understand such usage of index and cursors?

Not unless you tell it.  You may need to reduce the value of
cursor_tuple_fraction to inform the planner that you're only expecting
to fetch a small part of the cursor's theoretical output.
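
A rough sketch of what that might look like (untested; the table and column
names are placeholders, not from the original question):

SET cursor_tuple_fraction = 0.01;  -- default is 0.1; lower = expect to fetch less
BEGIN;
DECLARE c CURSOR FOR SELECT a FROM big_table ORDER BY xyz_expression;
FETCH 100 FROM c;
-- ... keep fetching and aggregating until the stop condition is met ...
CLOSE c;
COMMIT;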

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backups

2012-03-15 Thread Albe Laurenz
Richard Harley wrote:
 Very simple question -  does pg_dump/dumpall hit the server in terms
of database performance? We
 currently do nightly backups and I want to move to hourly backups but
not at the expense of hogging
 all the resources for 5 mins.
 
 Pg_dumpall is currently producing a 1GB file - that's the combined
size of around 30 databases and it
 takes around 5 mins to run.

pg_dump will be a performance hit, because it consumes disk I/O
capacity.
You could measure how the system is affected by your current backup.

On the other hand, instead of doing an hourly pg_dump, it might be
preferable to do a filesystem backup and PITR. That way you have to
do a backup only once a day (or week, depending on how much traffic you have
and how fast you have to restore) and can still recover to an
arbitrary point in time.
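
As a rough sketch of the base-backup side of that (untested; the label and the
copy step are illustrative, and WAL archiving must already be configured in
postgresql.conf before this is of any use):

SELECT pg_start_backup('daily base backup');
-- ... copy the data directory at the filesystem level (tar, rsync, ...) ...
SELECT pg_stop_backup();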

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Using copy with a file containing blank rows

2012-03-15 Thread George Weaver

Hi All,

I am trying to use COPY to import postgresql logs into a postgresql database 
for further review and sorting.


The problem I'm encountering happens when COPY hits a blank row:

development=# COPY log
development-#  FROM 
'D:/PostgreSQL/9.1/pg_log/postgresql-2012-03-15_093730.log'

development-#  WITH DELIMITER '^' NULL AS '';

ERROR:  missing data for column "ip_address"
CONTEXT:  COPY log, line 2:

Line 2 is a blank line.

Is there any way to use COPY with blank lines, or do the blank lines have to 
be removed before using COPY?


Thanks,
George 



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Did xpath_table get dropped.

2012-03-15 Thread Rob Sargent

On 03/14/2012 09:49 PM, Rob Sargent wrote:

On 03/14/2012 08:57 PM, Tom Lane wrote:

Rob Sargent robjsarg...@gmail.com writes:

I don't see anything to that effect in the release notes I've looked at.


That's cause it's still there. Did you remember to build/install
contrib/xml2? The xpath functions you're listing are all core, I think.

regards, tom lane


Thanks. I blew it on the configure then. Arrrg. Now off to find the
contribs.

Cheers.


For closure, and for any poor slob in the same predicament:
I carefully put my big-boy pants back on and
cd postgres-9.1.2/contrib/xml2
make install

create extension xml2;
Joy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backups

2012-03-15 Thread Bret Stern
Perhaps a RAM disk could be considered in the equation.



On Thu, 2012-03-15 at 16:30 +0100, Albe Laurenz wrote:
 Richard Harley wrote:
  Very simple question -  does pg_dump/dumpall hit the server in terms
 of database performance? We
  currently do nightly backups and I want to move to hourly backups but
 not at the expense of hogging
  all the resources for 5 mins.
  
  Pg_dumpall is currently producing a 1GB file - that's the combined
 size of around 30 databases and it
  takes around 5 mins to run.
 
 pg_dump will be a performance hit, because it consumes disk I/O
 capacity.
 You could measure how the system is affected by your current backup.
 
 On the other hand, instead of doing an hourly pg_dump, it might be
 preferable to do a filesystem backup and PITR. That way you have to
 do a backup only once a day (or week, depends how much traffic you have
 and how fast you have to restore) and can still recover to an
 arbitrary point in time.
 
 Yours,
 Laurenz Albe
 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] how to measure wal_buffer usage

2012-03-15 Thread Lonni J Friedman
After reading this interesting article on shared_buffers and wal_buffers:
http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html

it got me wondering if my settings were ideal.  Is there some way to
measure wal_buffer usage in real time, so that I could simply monitor
it for some period of time, and then come up with a way of determining
if the current setting is sufficient?

I tried googling, but every reference that I've found simply defaults
to the trial & error approach to performance tuning.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Using copy with a file containing blank rows

2012-03-15 Thread Adrian Klaver
On 03/15/2012 09:17 AM, George Weaver wrote:
 Hi All,
 
 I am trying to use COPY to import postgresql logs into a postgresql 
 database for further review and sorting.

Are you using the CSV format to log to the Postgres log?:
http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
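
If you do switch to csvlog, the page above shows how to define a postgres_log
table matching the CSV columns, after which loading it is roughly (the path is
illustrative):

COPY postgres_log FROM 'D:/PostgreSQL/9.1/pg_log/postgresql-2012-03-15_093730.csv' WITH csv;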

 
 The problem I'm encountering happens when COPY hits a blank row:
 
 development=# COPY log
 development-# FROM 
 'D:/PostgreSQL/9.1/pg_log/postgresql-2012-03-15_093730.log'
 development-# WITH DELIMITER '^' NULL AS '';
 
 ERROR: missing data for column ip_address
 CONTEXT: COPY log, line 2:  
 
 Line 2 is a blank line.
 
 Is there any way to use COPY with blank lines or do the blank lines have 
 to removed before using COPY?

AFAIK, you have to remove blank lines.

 
 Thanks,
 George
 


-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to perform full text search

2012-03-15 Thread Andrus
A user can enter any number of words as a search string.
In the shopping cart the following query is used to find products,
e.g. if 'red cat' is entered:

select * from products
where  productname ilike '%'||'red cat'||'%' 
  or productdescription ilike '%'||'red cat'||'%' 
limit 100

This does not find products like 'red or black cat'.
How do I change this query so that it returns the 100 best matches for the given 
search string?
I read the documentation about full text search but haven't found a step by step 
solution for this.

Using PostgreSQL 8.2.7 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 
20061115 (prerelease) (Debian 4.1.1-21)

Should I install some contrib modules on 8.2 or is it better to upgrade the server?

Where can I find step-by-step instructions for making this work?
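
For what it's worth, a rough sketch of the full text search approach (untested;
built in from 8.3 onwards, while on 8.2 the tsearch2 contrib module provides a
similar but not identical interface; column names are taken from the query above):

CREATE INDEX products_fts_idx ON products USING gin (
  to_tsvector('english', coalesce(productname,'') || ' ' || coalesce(productdescription,'')));

SELECT *,
  ts_rank(to_tsvector('english', coalesce(productname,'') || ' ' || coalesce(productdescription,'')),
          plainto_tsquery('english', 'red cat')) AS rank
FROM products
WHERE to_tsvector('english', coalesce(productname,'') || ' ' || coalesce(productdescription,''))
      @@ plainto_tsquery('english', 'red cat')
ORDER BY rank DESC
LIMIT 100;

The index is only used when the query repeats the exact expression it was built on.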

Andrus.






Re: [GENERAL] psql latex and newlines

2012-03-15 Thread Alvaro Herrera

Excerpts from Wim Bertels's message of jue feb 23 12:46:29 -0300 2012:
 the problem arises when you have a field value that contains a newline
 character; when this field is not the first column, all the data
 after this newline ends up in the first column..
 
 you can try this out by writing a function or table
 and then adding newlines in the COMMENT on this function or
 table.

 the option I came across is the usage of \pbox,
 so the suggestion would be to put every field/cell inside a pbox.

Did you come up with a patch for this?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] yum repository packages 9.0 and 9.1 libpq conflict

2012-03-15 Thread John R Pierce

On 03/15/12 7:27 AM, Tom Lane wrote:

It's distro policy on all Red Hat-based systems (and probably others as
well) not to use rpath; you're supposed to use ldconfig instead.  It's
thus totally unsurprising that removing a package's /etc/ld.so.conf.d
file breaks it.


ouch.  I didn't realize that.  -1 to RH for that policy.   -rpath is 
*EXACTLY* the right solution for this sort of problem, I've always 
*HATED* stuff that relies on ld.so or equivalent, or even worse, 
LD_LIBRARY_PATH    sigh.



However, it's not clear to me that this policy can support multiple
packages expecting to install different libraries with the same soname
and major version.  I'm not enough of a packaging weenie to know if
there's a standard workaround for that.  Devrim might have to resort to
something like shipping the latest (9.1.x) version of libpq.so in all of
his packages.


maybe the postgresql91-libs package should supersede the ld.so.conf.d file from 
previous versions so there would be only one such .conf file, pointing 
everyone at the latest installed version.  OTOH, if you removed 9.1, I 
don't know how that would be able to put a previous version back, ugh.


--
john r pierceN 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] undo update

2012-03-15 Thread Ivan
Hi all.

Today an accident happened on one of my databases. I have a table named
payments with about 5400 rows. I ran the query update payments set
amount = 0; where id in (2354,2353,1232). Please note the semicolon inside
— I missed it =(

Now all my data is lost. And after this happened I realized that backups
script is broken (my fault, I know)

I googled a little and found that there is a chance to restore my data
using the files in the pg_xlog directory. But everybody talks about PITR and I
don't use it. There is also an xlogviewer project (from 2006) that I'm trying
to install on my Gentoo right now.

I copied all PGDATA directory and made a dump of all databases. Also I
turned off my webserver. Postgres is still running.

Please give me some step-by step guide what should I do next? Is there any
chance to restore my data?

I use postgresql 8.4 with the default config (autovacuum is commented out)

-- 
__
Yours sincerely, Ivan Kuznetsov aka Kuzma
mailto: kuzma...@gmail.com


Re: [GENERAL] undo update

2012-03-15 Thread Scott Marlowe
On Thu, Mar 15, 2012 at 8:22 AM, Ivan kuzma...@gmail.com wrote:
 Hi all.

 Today an accident happened on one of my databases. I have a table named
 payments with about 5400 rows. I have done a query update payments set
 amount = 0; where id in (2354,2353,1232). Please note the semicolon inside
 — I missed it =(

 Now all my data is lost. And after this happened I realized that backups
 script is broken (my fault, I know)

 I googled a little and have found that there is a chance to restore my data
 using files in pg_xlog directory. But everybody say about PITR and I don't
 use it. Also there is a xlogviewer project (from old 2006) that I'm trying
 to install on my Gentoo right now.

 I copied all PGDATA directory and made a dump of all databases. Also I
 turned off my webserver. Postgres is still running.

 Please give me some step-by step guide what should I do next? Is there any
 chance to restore my data?

 I use postgresql 8.4 with default config (autovacuum is commented)

PITR can't help you after the fact if you don't have a base backup and
archives of the pg_xlog dir etc.

You might be able to pg_resetxlog to make the old rows visible, but
I'm no expert on doing that.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Alexander.Bagerman
Hi,

We are looking to use Postgres 9 for document storage and would like
to take advantage of the full text search capabilities. We are having a hard
time identifying MS/Open Office and PDF parsers to index stored
documents and make them available for text searching. Any advice would
be appreciated.

Regards,

-Alex


This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful.


Re: [GENERAL] undo update

2012-03-15 Thread Ivan
I have installed xlogviewer and it gives me data like that:

[cur:0/5770E87C, xid:355075, rmid:10(Heap), len:88/116, prev:0/5770E840]
 update: s/d/r:1663/90693/107093 block 1 off 36 to block 107 off 30
 [cur:0/5770E8F0, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770E87C]
 insert_leaf: s/d/r:1663/90693/107099 tid 20/101
 [cur:0/5770E930, xid:355075, rmid:11(Btree), len:38/66, prev:0/5770E8F0]
 insert_leaf: s/d/r:1663/90693/107100 tid 42/146
 [cur:0/5770E974, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E930]
 insert_leaf: s/d/r:1663/90693/107101 tid 21/97
 [cur:0/5770E9B0, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E974]
 insert_leaf: s/d/r:1663/90693/107102 tid 28/7
 [cur:0/5770E9EC, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E9B0]
 insert_leaf: s/d/r:1663/90693/107103 tid 33/2
 [cur:0/5770EA28, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770E9EC]
 insert_leaf: s/d/r:1663/90693/107104 tid 18/232
 [cur:0/5770EA64, xid:355075, rmid:11(Btree), len:54/82, prev:0/5770EA28]
 insert_leaf: s/d/r:1663/90693/107105 tid 46/109
 [cur:0/5770EAB8, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EA64]
 insert_leaf: s/d/r:1663/90693/107106 tid 17/99
 [cur:0/5770EAF4, xid:355075, rmid:10(Heap), len:84/112, prev:0/5770EAB8]
 update: s/d/r:1663/90693/107093 block 1 off 37 to block 107 off 31
 [cur:0/5770EB64, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EAF4]
 insert_leaf: s/d/r:1663/90693/107099 tid 20/143
 [cur:0/5770EBA4, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EB64]
 insert_leaf: s/d/r:1663/90693/107100 tid 30/80
 [cur:0/5770EBE4, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EBA4]
 insert_leaf: s/d/r:1663/90693/107101 tid 21/132
 [cur:0/5770EC20, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EBE4]
 insert_leaf: s/d/r:1663/90693/107102 tid 28/7
 [cur:0/5770EC5C, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EC20]
 insert_leaf: s/d/r:1663/90693/107103 tid 33/2
 [cur:0/5770EC98, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EC5C]
 insert_leaf: s/d/r:1663/90693/107104 tid 18/232
 [cur:0/5770ECD4, xid:355075, rmid:11(Btree), len:50/78, prev:0/5770EC98]
 insert_leaf: s/d/r:1663/90693/107105 tid 40/100
 [cur:0/5770ED24, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770ECD4]
 insert_leaf: s/d/r:1663/90693/107106 tid 30/137
 [cur:0/5770ED60, xid:355075, rmid:10(Heap), len:84/112, prev:0/5770ED24]
 update: s/d/r:1663/90693/107093 block 1 off 38 to block 107 off 32
 [cur:0/5770EDD0, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770ED60]
 insert_leaf: s/d/r:1663/90693/107099 tid 20/187
 [cur:0/5770EE10, xid:355075, rmid:11(Btree), len:34/62, prev:0/5770EDD0]
 insert_leaf: s/d/r:1663/90693/107100 tid 31/43
 [cur:0/5770EE50, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE10]
 insert_leaf: s/d/r:1663/90693/107101 tid 21/152
 [cur:0/5770EE8C, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE50]
 insert_leaf: s/d/r:1663/90693/107102 tid 28/7
 [cur:0/5770EEC8, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EE8C]
 insert_leaf: s/d/r:1663/90693/107103 tid 33/2
 [cur:0/5770EF04, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EEC8]
 insert_leaf: s/d/r:1663/90693/107104 tid 18/232
 [cur:0/5770EF40, xid:355075, rmid:11(Btree), len:50/78, prev:0/5770EF04]
 insert_leaf: s/d/r:1663/90693/107105 tid 56/107
 [cur:0/5770EF90, xid:355075, rmid:11(Btree), len:30/58, prev:0/5770EF40]
 insert_leaf: s/d/r:1663/90693/107106 tid 18/28


Is there any way to use this data for undo?


On Thu, Mar 15, 2012 at 4:22 PM, Ivan kuzma...@gmail.com wrote:

 Hi all.

 Today an accident happened on one of my databases. I have a table named
 payments with about 5400 rows. I have done a query update payments set
 amount = 0; where id in (2354,2353,1232). Please note the semicolon inside
 — I missed it =(

 Now all my data is lost. And after this happened I realized that backups
 script is broken (my fault, I know)

 I googled a little and have found that there is a chance to restore my
 data using files in pg_xlog directory. But everybody say about PITR and I
 don't use it. Also there is a xlogviewer project (from old 2006) that I'm
 trying to install on my Gentoo right now.

 I copied all PGDATA directory and made a dump of all databases. Also I
 turned off my webserver. Postgres is still running.

 Please give me some step-by step guide what should I do next? Is there any
 chance to restore my data?

 I use postgresql 8.4 with default config (autovacuum is commented)

 --
 __
 Yours sincerely, Ivan Kuznetsov aka Kuzma
 mailto: kuzma...@gmail.com




-- 
__
Yours sincerely, Ivan Kuznetsov aka Kuzma
mailto: kuzma...@gmail.com


Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Jeff Davis
On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com
wrote:
 Hi,
 
 We are looking to use Postgres 9 for the document storing and would
 like to take advantage of the full text search capabilities. We have
 hard time identifying MS/Open Office and PDF parsers to index stored
 documents and make them available for text searching. Any advice would
 be appreciated.

The first step is to find a library that can parse such documents, or
convert them to a format that can be parsed.

After you do that, PostgreSQL allows you to load arbitrary code as
functions (in various languages), so that will allow you to make use of
the library. It's hard to give more specific advice until you've found
the library you'd like to work with.

Regards,
Jeff Davis



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Richard Huxton

On 15/03/12 21:12, Jeff Davis wrote:

On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com



We have
hard time identifying MS/Open Office and PDF parsers to index stored
documents and make them available for text searching.



The first step is to find a library that can parse such documents, or
convert them to a format that can be parsed.


I've used docx2txt and pdf2txt and friends to produce text files that I 
then index during the import process. An external script runs the whole 
process. All I cared about was extracting raw text though, this does 
nothing to identify headings etc.


--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread dennis jenkins
On Thu, Mar 15, 2012 at 4:12 PM, Jeff Davis pg...@j-davis.com wrote:
 On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com
 wrote:
 Hi,

 We are looking to use Postgres 9 for the document storing and would
 like to take advantage of the full text search capabilities. We have
 hard time identifying MS/Open Office and PDF parsers to index stored
 documents and make them available for text searching. Any advice would
 be appreciated.

 The first step is to find a library that can parse such documents, or
 convert them to a format that can be parsed.

I don't know about MS-Office document parsing, but the PoDoFo (pdf
parsing library) can strip text from PDFs.  Every now and then someone
posts to the podofo mailing list with questions related to extracting
text for the purposes of indexing it in FTS capable database.  Podofo
has excellent developer support.  The maintainer is quick to accept
patches, verify bugs, add features, etc...   Disclaimer: I'm not a pdf
nor podofo expert.  I can't help you accomplish what you want.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] undo update

2012-03-15 Thread Steve Crawford

On 03/15/2012 07:22 AM, Ivan wrote:

Hi all.

Today an accident happened on one of my databases. I have a table 
named payments with about 5400 rows. I have done a query update 
payments set amount = 0; where id in (2354,2353,1232). Please note 
the semicolon inside — I missed it =(


Now all my data is lost. And after this happened I realized that 
backups script is broken (my fault, I know)


I googled a little and have found that there is a chance to restore my 
data using files in pg_xlog directory. But everybody say about PITR 
and I don't use it. Also there is a xlogviewer project (from old 2006) 
that I'm trying to install on my Gentoo right now.


I copied all PGDATA directory and made a dump of all databases. Also I 
turned off my webserver. Postgres is still running.


I would first stop PostgreSQL and *then* copy your PGDATA directory. 
Given how PostgreSQL handles updates in a MVCC-safe way, there is a 
reasonable possibility that the data is still contained somewhere in the 
file(s) associated with that table as long as you don't cause it to be 
overwritten by a CLUSTER, VACUUM FULL or VACUUM followed by more 
updates. However I cannot speak to the steps or difficulty involved in 
recovering it.


Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Commit hits time-out before CommandTimeout

2012-03-15 Thread Ben Schalley
Hello,

I'm not sure if it is an Npgsql or a PostgreSQL issue that we have run into at
the office. I have also posted this on the Npgsql forum (see
http://pgfoundry.org/forum/forum.php?thread_id=11381&forum_id=519).

For a project we have a windows service which synchronizes an online
database from a supplier with a local database of a client. The queries
against the local database are executed within a transaction block. At the
end of the sync procedure we execute a commit on the transaction. This
commit can take a few minutes to finish so we set the CommandTimeout high
enough, and this works.

However, for one client we have set the CommandTimeout to 3600 seconds (1
hour) but the commit hits a timeout each time after 1410 seconds (23
minutes and 30 seconds).

What can be causing this premature timeout?

The version of PostgreSQL used is 8.2.9 and the version of Npgsql is 2.0.11

Thanx in advance for the help!

Ben


Re: [GENERAL] Temporal foreign keys

2012-03-15 Thread Jeff Davis
On Fri, 2012-02-03 at 07:58 +0100, Matthias wrote:
 Hey,
 
 how can I implement temporal foreign keys with postgresql? Is writing
 triggers the only way to enforce temporal referential integrity
 currently?

Yes, currently that's the only way. Look at CREATE CONSTRAINT TRIGGER.
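
A minimal sketch of the child-side check (untested; the table and column names
are made up, and a complete solution also needs triggers on the parent side to
guard against UPDATE/DELETE breaking existing references):

CREATE FUNCTION check_temporal_fk() RETURNS trigger AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM parent p
        WHERE p.id = NEW.parent_id
          AND p.valid_from <= NEW.valid_from
          AND p.valid_to   >= NEW.valid_to
    ) THEN
        RAISE EXCEPTION 'temporal FK violation for parent_id=%', NEW.parent_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER child_temporal_fk
    AFTER INSERT OR UPDATE ON child
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE PROCEDURE check_temporal_fk();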

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

2012-03-15 Thread Filip Rembiałkowski
On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin 
atsaloli.t...@gmail.com wrote:


 Is there any way to consolidate the pages on the slave without taking
 replication offline?


maybe CLUSTER?

filip@dev= create table foobar (id serial primary key, load text);
CREATE TABLE
filip@dev= insert into foobar(load) select md5(random()::text) from
generate_series(1,100000);
INSERT 0 100000
filip@dev= delete from foobar where id%4<3; -- delete 75% of the table
DELETE 75000
filip@dev= select pg_relation_size('foobar');
 pg_relation_size
--
  6832128
(1 row)

filip@dev= CLUSTER foobar using foobar_pkey;
CLUSTER
filip@dev= select pg_relation_size('foobar');
 pg_relation_size
--
  1712128
(1 row)

Of course events destined to this table will be queued by Slony while the
table is locked.


Filip


Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Samba
Word documents can be processed with AbiWord, which can convert any MS Word
document into HTML, LaTeX, PostScript or plain text with very simple commands;
I guess it also exposes some API which can be integrated into document
parsers/indexers.

Spreadsheets can be processed using the ExcelFormat library
http://www.codeproject.com/Articles/42504/ExcelFormat-Library

or the BasicExcel library
http://www.codeproject.com/Articles/13852/BasicExcel-A-Class-to-Read-and-Write-to-Microsoft

Or even the GNU Gnumeric project has some API to process spreadsheets which
can be used to extract text for indexing.

Code to extract text from PDF
http://www.codeproject.com/Articles/7056/Code-to-extract-plain-text-from-a-PDF-file


Overall, I guess there are bits and pieces available over the internet, and
some dedicated effort is needed to assemble them and develop a
finished product, namely a document indexer.

Wish you success!


On Fri, Mar 16, 2012 at 2:51 AM, dennis jenkins dennis.jenkins...@gmail.com
 wrote:

 On Thu, Mar 15, 2012 at 4:12 PM, Jeff Davis pg...@j-davis.com wrote:
  On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com
  wrote:
  Hi,
 
  We are looking to use Postgres 9 for the document storing and would
  like to take advantage of the full text search capabilities. We have
  hard time identifying MS/Open Office and PDF parsers to index stored
  documents and make them available for text searching. Any advice would
  be appreciated.
 
  The first step is to find a library that can parse such documents, or
  convert them to a format that can be parsed.

 I don't know about MS-Office document parsing, but the PoDoFo (pdf
 parsing library) can strip text from PDFs.  Every now and then someone
 posts to the podofo mailing list with questions related to extracting
 text for the purposes of indexing it in FTS capable database.  Podofo
 has excellent developer support.  The maintainer is quick to accept
 patches, verify bugs, add features, etc...   Disclaimer: I'm not a pdf
 nor podofo expert.  I can't help you accomplish what you want.

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Optimise PostgreSQL for fast testing

2012-03-15 Thread Dmytrii Nagirniak
Hi all,

Just a follow-up.

I found the biggest bottleneck and now my specs run as fast as the SQLite ones.

TL;DR - the issue was the database cleanup that did the truncation. Apparently 
SQLite is much faster at that.

To fix it I open a transaction before each test and roll it back at the end.
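
Conceptually each test now runs like this (just a sketch of the idea, not the
actual test harness code):

BEGIN;
-- ... the test creates whatever rows it needs and makes its assertions ...
ROLLBACK;  -- throws everything away; much cheaper than TRUNCATE on every table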

Some numbers for ~700 tests.

- Truncation: SQLite - 34s, PG - 76s.
- Transaction: SQLite - 17s, PG - 18s.

2x speed increase for SQLite.
4x speed increase for PG.

Hope that'll help some of you.

Cheers,
Dmytrii
http://ApproachE.com






On 27/02/2012, at 10:57 AM, Dmytrii Nagirniak wrote:

 Hi Guys,
 
 Sorry for the late reply.
 
 Thanks to all of you for the help. Appreciate all your suggestions.
 
 So far (with my pretty limited knowledge of PG) I could speed it up a little 
 bit (~20% or so compared to the original installation) only by tweaking 
 the settings.
 
 I think that is relatively good, keeping in mind that not a single line of code has 
 been changed.
 
 Just my quick summary. Not interested in query tuning for now, just the DB 
 tweaking:
 - Best perf optimisation: `fsync=off`.
 - Parallelisation should be considered as the 2nd option after `fsync=off`.
 - All further optimisations might not be worth the effort unless you know PG 
   well.
 - A RAM disk didn't improve perf much at all.
 - As Craig Ringer replied to my question on SO, "PostgreSQL 9.0 High 
   Performance" is worth the read.
 - PG has awesome documentation, including perf-related: 
   http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
 
 So far this is my approach:
 - Since SQLite has basic FTS support (which I totally missed; thanks for 
   pointing that out!) I can go a long way with it and probably won't need PG 
   soon. But when I do:
   - Run most of the specs against SQLite. Only run specs that rely on PG features 
     against PG (which should be a minority).
   - Run full acceptance tests (Cucumber) against the production DB (be it SQLite or 
     PG).
 - Will parallelise both unit and acceptance tests in the future.
 
 
 Thanks a lot to all of you guys.
 Your suggestions, criticism and discussion was really healthy, helpful and to 
 the point.
 
 
 Cheers,
 Dmytrii
 http://www.ApproachE.com
 
 
 
 On 24/02/2012, at 9:25 PM, Simon Riggs wrote:
 
 On Fri, Feb 24, 2012 at 12:16 AM, Dmytrii Nagirniak dna...@gmail.com wrote:
 
 That's totally fine if PG can't beat SQLite on speed in **this particular
 case**.
 
 The point is that PG can beat SQLite in this test *easily* if you
 choose to use the main architectural difference as an advantage:
 running tests concurrently.
 
 -- 
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services