Re: [GENERAL] How to convert a binary field to an integer field?

2010-08-25 Thread wei725
You are right. I typed the message in a hurry before I left home this morning.
It is the boolean type. Thanks for your suggestion. The NULL value may not work
for JDBC. At the application level, a fixed set of constants is used to
represent the three statuses, which are converted into an integer.


- Original Message -
From: "Adrian Klaver" 
To: wei...@lycos.com
Cc: pgsql-general@postgresql.org
Sent: Wednesday, August 25, 2010 12:45:21 PM GMT -08:00 US/Canada Pacific
Subject: Re: [GENERAL] How to convert a binary field to an integer field?

On 08/25/2010 12:29 PM, wei...@lycos.com wrote:
> I have an application in production. Now, one status field needs to have
> three statuses instead of two. How do I make such a change in PostgreSQL?
>

I am going to assume that by binary you mean boolean. If so, it already
has three states: NULL, True, False.

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread Craig Ringer

On 26/08/2010 1:06 AM, Steve Clark wrote:

On 08/25/2010 12:30 PM, Joshua D. Drake wrote:

On Wed, 2010-08-25 at 12:20 -0400, Eric Comeau wrote:



Without even changing any line of data or SQL code!

Incredible, isn't it?



Curious- what postgresql.conf settings did you change to improve it?


The most obvious would be to turn fsync off, synchronous_commit off,
increase work_mem, increase checkpoint_timeout, increase checkpoint_segments.

JD

Can these be changed on the fly via SET commands, or does the config file
have to be changed and Postgres stopped and restarted?


First: Many options can be changed by editing the config file and then
telling the postmaster to reload its configuration, rather than
restarting the postmaster. See pg_ctl.


As for the specific options:

Checkpoint and WAL tuning is necessary and important in any real
PostgreSQL instance under load, and it's quite safe to adjust the
checkpoint timeouts and WAL segment counts to suit your needs. You'll
need a restart to change the number of WAL segments; I'm not so sure
about the checkpoint timeout.


You can't change fsync without a config file edit and a restart. You
should **NEVER** be using fsync=off with data you cannot afford to lose,
so in a way it's a good thing that a restart is required. You might use
it to help initially load a database with bulk data, but fsync should be
turned back on and the database restarted before you start actually
using it. fsync=off is **NOT SAFE**.


synchronous_commit also affects data safety. It permits the loss of
transactions committed within the commit delay interval if the server
crashes. If you turn it off, you need to decide how much recent work you
can afford to lose if the database crashes. I'm not sure if it can be
applied with a reload or whether it requires a full server restart.


So: if you don't know exactly what you're doing, leave fsync alone.
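To illustrate the reload path (paths here are examples only):

# edit postgresql.conf, e.g. checkpoint_timeout = 15min, then:
pg_ctl reload -D /path/to/data
# or, from psql: SELECT pg_reload_conf();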

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread John R Pierce

 On 08/25/10 11:47 AM, Wojciech Strzałka wrote:

  The data set is 9 million rows - about 250 columns


Having 250 columns in a single table sets off the 'normalization' alarm 
in my head.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to setup PostgreSQL to work with libpam-pgsql/libnss-pgsql2?

2010-08-25 Thread PMC OS
Good evening,

> - Original Message -
> Von: Craig Ringer 
> > 
> > Have now installed slapd on my OMAP L138 but now it has crashed the
> > kernel and I can no longer boot the server because it wants to init
> > slapd and crashes.
> That's ... surprising.
> Kernel panic? Or is it just that slapd is crashing?

First PostgreSQL is started and then it tries to start slapd and the whole
system panics. I have this problem with several programs that won't run on the
ARMEL architecture even though they are compiled for it.

I use the standard Debian distributions Lenny and Squeeze, but I am going to
recompile the whole system for EmDebian if my Shiva-Plug

> > apt-cache show ldap-auth-client
> > W: Unable to locate package ldap-auth-client
> > E: No packages found
> My bad. Looks like it's an Ubuntu extension, just a metapackage that
> pulls in libnss-ldap and libpam-ldap and provides a bit of config
> support for them.

:-/

Greetings





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem with dumps

2010-08-25 Thread Bill Christensen



On 09/08/2010 20:04, Bill Christensen wrote:

 Hi folks,

 I'm building a new server with postgres/phppgadmin, and having trouble
 getting the dumps to work properly. This is my first time installing
 postgres, so I very well may have missed something.


Just to let y'all know, I got a response today from the phpPgAdmin help
forum which solved the problem.

The exports in the current non-beta release of phpPgAdmin don't work
properly with Postgres 8.4. See: 




--
Bill Christensen


Green Building Professionals Directory: 
Sustainable Building Calendar: 
Green Real Estate: 
Straw Bale Registry: 
Books/videos/software: 

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Tom Lane
Denis Papathanasiou  writes:
> alerts=> CREATE INDEX node_val_tsv_idx ON node USING 
> gin(to_tsvector('english', val));
> CREATE INDEX
> alerts=> explain analyze select item_pk from node where 
> tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited 
> Partnership');
> [ doesn't use the index ]

That index doesn't match the query, so it can't be used.  Try

select item_pk from node where 
tag='primaryIssuer.entityType' and
to_tsvector('english', val) @@ plainto_tsquery('Limited Partnership');

Note that, seeing that the scan on the other index is pretty cheap,
it's not obvious that indexing the @@ clause is better anyway.
But the planner *can't* use this index for this query as written.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Oleg Bartunov

Try this select

alerts=> explain analyze select item_pk from node where
 tag='primaryIssuer.entityType' and val @@ plainto_tsquery('english','Limited
 Partnership');

Read 12.2.2. Creating Indexes at 
http://www.postgresql.org/docs/8.4/static/textsearch-tables.html

CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body));

Notice that the 2-argument version of to_tsvector is used. Only text search 
functions that specify a configuration name can be used in expression indexes 
(Section 11.7). This is because the index contents must be unaffected by 
default_text_search_config. If they were affected, the index contents might be 
inconsistent because different entries could contain tsvectors that were 
created with different text search configurations, and there would be no way to 
guess which was which. It would be impossible to dump and restore such an index 
correctly.
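For reference, a query can only use that index if it repeats the same
two-argument expression, e.g.:

SELECT title FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');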



Oleg

On Wed, 25 Aug 2010, Denis Papathanasiou wrote:




we need examples of your explain analyze. I don't want to waste my time
reading theoretical reasoning :)


Here's an actual 'explain analyze' example:

alerts=> CREATE INDEX node_val_tsv_idx ON node USING 
gin(to_tsvector('english', val));

CREATE INDEX
alerts=> explain analyze select item_pk from node where 
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited 
Partnership');
                                  QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on node  (cost=204.26..5792.92 rows=4 width=16) (actual 
time=2.952..131.868 rows=953 loops=1)

  Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
  Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
  ->  Bitmap Index Scan on node_tag_idx  (cost=0.00..204.26 rows=3712 
width=0) (actual time=1.628..1.628 rows=3631 loops=1)

Index Cond: (tag = 'primaryIssuer.entityType'::text)
Total runtime: 133.345 ms
(6 rows)

alerts=> DROP INDEX node_val_tsv_idx;
DROP INDEX
alerts=> explain analyze select item_pk from node where 
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited 
Partnership');
                                  QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on node  (cost=204.26..5792.92 rows=4 width=16) (actual 
time=2.938..93.239 rows=953 loops=1)

  Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
  Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
  ->  Bitmap Index Scan on node_tag_idx  (cost=0.00..204.26 rows=3712 
width=0) (actual time=1.614..1.614 rows=3631 loops=1)

Index Cond: (tag = 'primaryIssuer.entityType'::text)
Total runtime: 94.696 ms
(6 rows)

The table this is run against is defined like this:

CREATE TABLE node (
   pk uuid primary key,
   item_pk uuid not null references item (pk),
   tag text not null,
   val text
);

In addition to the gin/to_tsvector() index on node.val shown above, there are two 
other explicit indices on this table:


CREATE INDEX node_tag_idx ON node (tag);
CREATE INDEX node_val_idx ON node (val);

The reason for the node_val_idx index is that there will be cases where the 
query phrase is known exactly, so the where clause in the select statement 
will be just "val = 'Limited Partnership'".



BTW, be sure you use the same search configuration as in CREATE INDEX or
the index will not be used at all.


Is this indeed the problem here?

The explain output references "val @@ plainto_tsquery()" but as a filter, 
whereas the tag portion of the statement mentions node_tag_idx as the index 
it used.




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Denis Papathanasiou



we need examples of your explain analyze. I don't want to waste my time
reading theoretical reasoning :)


Here's an actual 'explain analyze' example:

alerts=> CREATE INDEX node_val_tsv_idx ON node USING 
gin(to_tsvector('english', val));

CREATE INDEX
alerts=> explain analyze select item_pk from node where 
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited 
Partnership');
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Bitmap Heap Scan on node  (cost=204.26..5792.92 rows=4 width=16) 
(actual time=2.952..131.868 rows=953 loops=1)

   Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
   Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
   ->  Bitmap Index Scan on node_tag_idx  (cost=0.00..204.26 rows=3712 
width=0) (actual time=1.628..1.628 rows=3631 loops=1)

 Index Cond: (tag = 'primaryIssuer.entityType'::text)
 Total runtime: 133.345 ms
(6 rows)

alerts=> DROP INDEX node_val_tsv_idx;
DROP INDEX
alerts=> explain analyze select item_pk from node where 
tag='primaryIssuer.entityType' and val @@ plainto_tsquery('Limited 
Partnership');
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Bitmap Heap Scan on node  (cost=204.26..5792.92 rows=4 width=16) 
(actual time=2.938..93.239 rows=953 loops=1)

   Recheck Cond: (tag = 'primaryIssuer.entityType'::text)
   Filter: (val @@ plainto_tsquery('Limited Partnership'::text))
   ->  Bitmap Index Scan on node_tag_idx  (cost=0.00..204.26 rows=3712 
width=0) (actual time=1.614..1.614 rows=3631 loops=1)

 Index Cond: (tag = 'primaryIssuer.entityType'::text)
 Total runtime: 94.696 ms
(6 rows)

The table this is run against is defined like this:

CREATE TABLE node (
pk uuid primary key,
item_pk uuid not null references item (pk),
tag text not null,
val text
);

In addition to the gin/to_tsvector() index on node.val shown above, there 
are two other explicit indices on this table:


CREATE INDEX node_tag_idx ON node (tag);
CREATE INDEX node_val_idx ON node (val);

The reason for the node_val_idx index is that there will be cases where 
the query phrase is known exactly, so the where clause in the select 
statement will be just "val = 'Limited Partnership'".



BTW, be sure you use the same search configuration as in CREATE INDEX or
the index will not be used at all.


Is this indeed the problem here?

The explain output references "val @@ plainto_tsquery()" but as a 
filter, whereas the tag portion of the statement mentions node_tag_idx 
as the index it used.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Oleg Bartunov

Denis,

we need examples of your explain analyze. I don't want to waste my time
reading theoretical reasoning :)

BTW, be sure you use the same search configuration as in CREATE INDEX or
the index will not be used at all.

Oleg

On Wed, 25 Aug 2010, Denis Papathanasiou wrote:


As a follow-up to my question from this past Saturday
(http://archives.postgresql.org/pgsql-general/2010-08/msg00624.php), I
experimented with adding two types of indices to the same text column:

CREATE INDEX item_eng_searchable_text_idx ON item USING
gin(to_tsvector('english', searchable_text));

and

CREATE INDEX item_searchable_text_idx ON item (searchable_text);

Running my queries under explain, I noticed that queries of the form:

select pk from item where searchable_text @@ plainto_tsquery('search
phrase');

actually run *slower* with the item_eng_searchable_text_idx index applied.

But when I switch the query to use to_tsquery() instead, e.g. something
like this:

select pk from item where searchable_text @@ to_tsquery('phrase');

The performance is better.

Is this because the gin/to_tsvector() index works differently for
to_tsquery() compared to plainto_tsquery()?

If so, how can I create an effective index for queries that will use
plainto_tsquery() ?

Note that I need the plainto_tsquery() form b/c my search phrases will
correspond to exact blocks of text, and therefore they will contain
spaces and punctuation, etc.





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to convert a binary field to an integer field?

2010-08-25 Thread Peter Geoghegan
> I am going to assume that by binary you mean boolean. If so it already has
> three states NULL,True,False.

Uh, not really, no. NULL indicates the absence or indeterminacy of a
state, and has a number of properties that make it unsuitable for
representing anything other than that.

Use an enum.
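A minimal sketch of the conversion (table, column and value names are
only illustrative):

CREATE TYPE status_t AS ENUM ('new', 'active', 'closed');
ALTER TABLE my_table ADD COLUMN status status_t;
-- map the old two-state flag onto the new type:
UPDATE my_table
   SET status = CASE WHEN old_flag THEN 'active'::status_t
                     ELSE 'closed'::status_t END;
ALTER TABLE my_table DROP COLUMN old_flag;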


-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to convert a binary field to an integer field?

2010-08-25 Thread Adrian Klaver

On 08/25/2010 12:29 PM, wei...@lycos.com wrote:

I have an application in production. Now, one status field needs to have three
statuses instead of two. How do I make such a change in PostgreSQL?



I am going to assume that by binary you mean boolean. If so, it already
has three states: NULL, True, False.


--
Adrian Klaver
adrian.kla...@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to convert a binary field to an integer field?

2010-08-25 Thread Raymond O'Donnell
On 25/08/2010 20:29, wei...@lycos.com wrote:
> I have an application in production. Now, one status field needs to
> have three statuses instead of two. How do I make such a change in
> PostgreSQL?

Perhaps an enumerated type?

http://www.postgresql.org/docs/8.4/static/datatype-enum.html

Ray.


-- 
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to convert a binary field to an integer field?

2010-08-25 Thread wei725
I have an application in production. Now, one status field needs to have three
statuses instead of two. How do I make such a change in PostgreSQL?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unexpected message type 0x58 during COPY from stdin

2010-08-25 Thread bricklen
On Wed, Aug 25, 2010 at 12:01 PM, Tom Lane  wrote:
>
> bricklen  writes:
> > I'm getting a strange error during reload of one of our databases.
>
> This appears to indicate that the server's first idea that there was
> trouble came when the client side disconnected partway through a COPY:
>
> > 2010-08-25 04:16:46 PDT [2401]: [1-1] (user=postgres) ERROR:  unexpected
> > message type 0x58 during COPY from stdin
>
> (0x58 = 'X' = Terminate message, implying client called PQfinish)
>
> So it would be useful to look at what pg_restore thinks happened.
> I rather suspect your cron setup sends pg_restore's output to
> /dev/null ... or at least you failed to show it.
>
>                        regards, tom lane


cron:

MAILTO=""

30 2 * * * /var/lib/pgsql/sync-db.sh 2>&1

I'll modify that to append to a log file and see what else turns up.
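Something like (log path illustrative):

30 2 * * * /var/lib/pgsql/sync-db.sh >> /var/lib/pgsql/sync-db.log 2>&1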

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Denis Papathanasiou



Not per se, but maybe looking at the actual outputs of the two function
calls would be enlightening.  I suspect you'll find that the search
conditions you are getting are not equivalent.


Strictly speaking they're not, because plainto_tsquery() chains
several tokens together.


However, at the heart of the question is this: if I define the index on 
that column like this:


CREATE INDEX item_eng_searchable_text_idx ON item USING 
gin(to_tsvector('english', searchable_text));


since, unlike the example in the
http://www.postgresql.org/docs/8.4/interactive/textsearch-tables.html
doc, I cannot define the searchable text column in the table as being
of type tsvector, because the text is not exclusively in English;

then, will the index be effective?

Based on some empirical experiments, it seems not.

If that's indeed that case, then how should I create the index?


In that case you haven't understood how text search works at all.
It simply doesn't do that.  You could possibly follow up a text search
for component words with a LIKE or similar comparison to verify that
the matches actually contain the desired string.


I cannot use LIKE on each token of the phrase because the text in the 
column is unordered, and I would have to do an ILIKE '%'+token+'%' on 
each to be logically correct.


IIRC, wildcards of the form token+'%' can use an index efficiently, but 
wildcards on both ends such as '%'+token+'%' do not.


I did think about splitting the phrase tokens and doing a tsquery() join 
on each token, but it seemed that's why plainto_tsquery() was created, 
unless I misunderstood the docs on that specific point.
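For concreteness, Tom's recheck approach would look something like this
against my schema:

SELECT pk FROM item
WHERE to_tsvector('english', searchable_text)
      @@ plainto_tsquery('english', 'Limited Partnership')
  AND searchable_text ILIKE '%Limited Partnership%';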


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unexpected message type 0x58 during COPY from stdin

2010-08-25 Thread Tom Lane
bricklen  writes:
> I'm getting a strange error during reload of one of our databases.

This appears to indicate that the server's first idea that there was
trouble came when the client side disconnected partway through a COPY:

> 2010-08-25 04:16:46 PDT [2401]: [1-1] (user=postgres) ERROR:  unexpected
> message type 0x58 during COPY from stdin

(0x58 = 'X' = Terminate message, implying client called PQfinish)

So it would be useful to look at what pg_restore thinks happened.
I rather suspect your cron setup sends pg_restore's output to
/dev/null ... or at least you failed to show it.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] recovery fail

2010-08-25 Thread Selena Deckelmann
Hi!

2010/8/25 Łukasz Bieniek :
> I have made a backup of my DB with pg_dump ('pg_dump -U username dbname
> >backupfilename.sql.gz'). When I want to restore it (I drop the old database,
> create a new one and add the language to it, and I type the command 'cat
> backupfilename.sql.gz|gunzip|/usr/local/pgsql/bin/psql -U username dbname'),
> the backup restores all my tables but it loses relations to some tables, and
> while it is restoring the system sends me messages about missing relations. I
> think it is about the order of the recovered data in the backup. How do I
> resolve this problem?
>

Try running psql with '-v "ON_ERROR_STOP=true"' so that you get the
first error. You may be having a problem with a dependency early in
the restore process.
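Applied to the restore command from your original message, that would be
something like:

zcat backupfilename.sql.gz | psql -v ON_ERROR_STOP=true -U username dbname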

-selena


-- 
http://chesnok.com/daily - me

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] recovery fail

2010-08-25 Thread Scott Marlowe
2010/8/25 Łukasz Bieniek :
> I have made a backup of my DB with pg_dump ('pg_dump -U username dbname
> >backupfilename.sql.gz'). When I want to restore it (I drop the old database,
> create a new one and add the language to it, and I type the command 'cat
> backupfilename.sql.gz|gunzip|/usr/local/pgsql/bin/psql -U username dbname'),
> the backup restores all my tables but it loses relations to some tables, and
> while it is restoring the system sends me messages about missing relations. I
> think it is about the order of the recovered data in the backup. How do I
> resolve this problem?

Are you piping the pg_dump through gzip?  It doesn't look like it.  If
you are, replace your restore command with:

zcat backupfile.sql.gz | psql ...
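The dump side would then need the gzip pipe as well, e.g.:

pg_dump -U username dbname | gzip > backupfile.sql.gz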

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Tom Lane
Denis Papathanasiou  writes:
> Is this because the gin/to_tsvector() index works differently for
> to_tsquery() compared to plainto_tsquery()?

Not per se, but maybe looking at the actual outputs of the two function
calls would be enlightening.  I suspect you'll find that the search
conditions you are getting are not equivalent.
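For illustration (results abridged; 'english' configuration assumed):

select plainto_tsquery('english', 'Limited Partnership');
-- yields: 'limit' & 'partnership'  (terms ANDed, punctuation dropped)
select to_tsquery('english', 'partnership');
-- yields: 'partnership'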

> Note that I need the plainto_tsquery() form b/c my search phrases will
> correspond to exact blocks of text, and therefore they will contain
> spaces and punctuation, etc.

In that case you haven't understood how text search works at all.
It simply doesn't do that.  You could possibly follow up a text search
for component words with a LIKE or similar comparison to verify that
the matches actually contain the desired string.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread Wojciech Strzałka

 Yeah - I'll try to optimize, as I had a plan to write to
 pgsql-performance for rescue anyway.

 I don't know the exact hardware specification yet - known facts at the
 moment are:
 Sun Turgo?? (SPARC) with 32 cores
 17GB RAM (1GB for shared buffers)
 hdd - ?
 OS - Solaris 10 - the system is running in a zone (Solaris
 virtualization) - however during the test nothing else is utilizing the
 machine.
 PostgreSQL 8.4.4 64bit
 
 The data set is 9 million rows - about 250 columns
 The resulting database size is ~9GB
 Load time ~2h 20min
 CPU utilization - 1.2% (half of one core)
 iostat shows writes ~6MB/s, 20% busy
 when I run 2 loads in parallel the CPU is split to 2 x 0.6%, hdd writes
 ~7MB (almost the same)

 postgresql.conf changes:
 checkpoint_segments - 128
 checkpoint_timeout - 30min
 shared_buffers - 1GB
 maintenance_work_mem - 128MB


 does it look like my HDD is the problem? Or maybe the Solaris
 virtualization?

 What's also interesting - the table is empty when I start (by truncate),
 but while the COPY is working I see it grow (by \d+ or
 pg_total_relation_size) by about 1MB per second.
 I'd expect it to grow at checkpoints only, not all the
 time - am I wrong?

 
 

> On Wed, 2010-08-25 at 12:20 -0400, Eric Comeau wrote:
>>  
>> > Without even changing any line of data or SQL code!
>> > 
>> > Incredible, isn't it?
>> > 
>> 
>> Curious- what postgresql.conf settings did you change to improve it?

> The most obvious would be to turn fsync off, synchronous_commit off,
> increase work_mem, increase checkpoint_timeout, increase checkpoint_segments.

> JD

>> 
>> 
>> 




-- 
Pozdrowienia,
 Wojciech Strzałka


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread Steve Clark

On 08/25/2010 12:30 PM, Joshua D. Drake wrote:

On Wed, 2010-08-25 at 12:20 -0400, Eric Comeau wrote:



Without even changing any line of data or SQL code!

Incredible, isn't it?



Curious- what postgresql.conf settings did you change to improve it?


The most obvious would be to turn fsync off, synchronous_commit off,
increase work_mem, increase checkpoint_timeout, increase checkpoint_segments.

JD

Can these be changed on the fly via SET commands, or does the config file
have to be changed and Postgres stopped and restarted?

postgres 8.3.7 on freebsd.

--
Stephen Clark
NetWolves
Sr. Software Engineer III
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.cl...@netwolves.com
www.netwolves.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Optimizing queries that use multiple tables and many order by columns

2010-08-25 Thread Joshua Berry
On Wed, Aug 25, 2010 at 10:40 AM, Wappler, Robert wrote:

> On 2010-08-25, Joshua Berry wrote:
>
> > Here's my latest culprit:
> >
> > select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
> > JOB.CompanyCode, Anl.SampleName
> > from analysis anl join job on anl.job = job.job
> > order by job.companycode, anl.job, anl.lab
> > limit 10;
> >
>
> Could you try to remove the limit clause? I have seen several times
> that it may slow down a query. Although I haven't tested it, an
> OFFSET 0 clause can improve the situation, IIRC.
>

The actual query uses a cursor, which seems to run the full query; only
after the entire set is ready can rows be fetched. So, I'm not using a
limit in the actual queries, but something like this:


declare "SQL_CUR0453D910" cursor with hold for
select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
JOB.CompanyCode, Anl.SampleName
from analysis anl join job on anl.job = job.job
order by job.companycode, anl.job, anl.lab;
fetch 10 in "SQL_CUR0453D910";
close "SQL_CUR0453D910";

> From an algebraic point of view, I cannot see obvious inefficiencies.
> Others, who know the internals of pg better, might see more.
>


> Those are estimated query plans; what does EXPLAIN ANALYZE say? The query
> plans above do not execute the query; instead they just make a rough
> guess about the costs. Reality might be different. Also, you may want to
> run VACUUM ANALYZE before.
>
>
The database is vacuum analyze'd and the stats target is the default of 100.
I'm also using PG 8.4.4 running on CentOS 5.5 x86_64.

--Here's what explain analyze says for the query
explain analyze
declare "SQL_CUR0453D910" cursor with hold for
select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
JOB.CompanyCode, Anl.SampleName
from analysis anl join job on anl.job = job.job
order by job.companycode, anl.job, anl.lab;

Sort  (cost=38047.92..38495.65 rows=179095 width=32) (actual
time=1890.796..2271.248 rows=178979 loops=1)
  Sort Key: job.companycode, anl.job, anl.lab
  Sort Method:  external merge  Disk: 8416kB
  ->  Hash Join  (cost=451.20..18134.05 rows=179095 width=32) (actual
time=8.239..260.848 rows=178979 loops=1)
Hash Cond: (anl.job = job.job)
->  Seq Scan on analysis anl  (cost=0.00..14100.95 rows=179095
width=23) (actual time=0.026..91.602 rows=178979 loops=1)
->  Hash  (cost=287.20..287.20 rows=13120 width=17) (actual
time=8.197..8.197 rows=13120 loops=1)
  ->  Seq Scan on job  (cost=0.00..287.20 rows=13120 width=17)
(actual time=0.007..4.166 rows=13120 loops=1)
Total runtime: 2286.224 ms



> Maybe the planner decides on a merge join if there are sorted indexes
> for anl.job and job.job. But the speed-up may vary depending on the
> data.
>

It seems to be reading the entire dataset, then sorting, right? There's not
much more that could be done to improve such queries, aside from increasing
memory and IO bandwidth.
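(One thing worth trying, since that plan reports "Sort Method:  external
merge  Disk: 8416kB": a session-level work_mem increase might keep the
sort in memory. The value below is only illustrative.)

SET work_mem = '32MB';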

But now that I've said that, there's the following query that deals with
exactly the same set of data, but the ordering involves only one of the two
joined tables.

explain analyze
declare "SQL_CUR0453D910" cursor with hold for
select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
JOB.CompanyCode, Anl.SampleName
from analysis anl join job on anl.job = job.job
order by job.companycode --, anl.job, anl.lab; --Only order by indexed
columns from job.

Nested Loop  (cost=0.00..65305.66 rows=179095 width=32) (actual
time=0.084..288.976 rows=178979 loops=1)
  ->  Index Scan using job_companycode on job  (cost=0.00..972.67 rows=13120
width=17) (actual time=0.045..7.328 rows=13120 loops=1)
  ->  Index Scan using analysis_job_lab on analysis anl  (cost=0.00..4.63
rows=22 width=23) (actual time=0.006..0.015 rows=14 loops=13120)
Index Cond: (anl.job = job.job)
Total runtime: 303.230 ms

If I order by columns from the other table, analysis only, I get the follow
query and results:
explain analyze
declare "SQL_CUR0453D910" cursor with hold for
select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
JOB.CompanyCode, Anl.SampleName
from analysis anl join job on anl.job = job.job
order by --job.companycode,
anl.job, anl.lab; --Only order by indexed columns from analysis.

Merge Join  (cost=0.56..44872.45 rows=179095 width=32) (actual
time=0.078..368.620 rows=178979 loops=1)
  Merge Cond: (anl.job = job.job)
  ->  Index Scan using analysis_job_lab on analysis anl
(cost=0.00..35245.47 rows=179095 width=23) (actual time=0.035..128.460
rows=178979 loops=1)
  ->  Index Scan using job_job_pk on job  (cost=0.00..508.53 rows=13120
width=17) (actual time=0.039..53.733 rows=179005 loops=1)
Total runtime: 388.884 ms


Notice that in these cases the query completes in <400 ms and the other
query that involves ordering on columns from both of the joined tables
completes in >2300ms.

In the application here, these queries are used by a client application to
fill a window's listbox that can be scrolled up or down. If the user changes
direction of the scroll, it initiates a new cursor

Re: [GENERAL] Feature proposal

2010-08-25 Thread Joshua D. Drake
On Wed, 2010-08-25 at 12:20 -0400, Eric Comeau wrote:
>  
> > Without even changing any line of data or SQL code!
> > 
> > Incredible, isn't it?
> > 
> 
> Curious- what postgresql.conf settings did you change to improve it?

The most obvious would be to turn fsync off, synchronous_commit off,
increase work_mem, increase checkpoint_timeout, increase checkpoint_segments.

JD

> 
> 
> 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] initdb fails to allocate shared memory

2010-08-25 Thread Tom Lane
I wrote:
> Could we see the output of
>   sysctl -a | grep sysv

"ipcs -a" might be informative, too.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] initdb fails to allocate shared memory

2010-08-25 Thread Tom Lane
"A.M."  writes:
> Then it seems that the error reporting could be improved to not mention 
> "shared_buffers" and "max_connections" neither of which I can touch during 
> initdb.

The error has to be phrased to cover the case where you hit it after
initdb.  It would be quite unhelpful to *not* mention those settings.

However, it's odd that you got this variant of the HINT and not the one
that suggests increasing SHMMAX.  Looking at the code, that means that
shmget returned ENOMEM, not EINVAL, which surprises me.  What did you
have your kernel settings at?  Could we see the output of
sysctl -a | grep sysv

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread Eric Comeau
On Wed, 2010-08-25 at 17:06 +0200, Denis BUCHER wrote:
> On 25.08.2010 09:15, wstrzalka wrote:
> > I'm currently playing with very large data import using COPY from
> > file.
> >
> > As this can be an extremely long operation (hours in my case), a nice
> > feature would be some option to show operation progress - how many
> > rows were already imported.
> >
> > Or maybe there is some way to do it? As long as Postgres has no read-
> > uncommitted, I think I can estimate it only by the destination table size??
> 
> By the way, did you try to optimize your PostgreSQL server?
> 
> In my case I was able to reduce a big data update from:
> 1 hour 15 minutes
> to:
> 5 minutes
> 
> Without even changing any line of data or SQL code!
> 
> Incredible, isn't it?
> 

Curious- what postgresql.conf settings did you change to improve it?



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unexpected message type 0x58 during COPY from stdin

2010-08-25 Thread bricklen
On Wed, Aug 25, 2010 at 8:35 AM, Thom Brown  wrote:

>
>
> Just seeking the obvious, but is the file you've shown being output
> with pg_dump the same one you're using on pg_restore?  The paths are
> different, although if that's on 2 different machines, that would make
> perfect sense.
>
> --
> Thom Brown
> Registered Linux user: #516935
>

Yes, two different servers, same file. SCP is used to transfer that file
between the servers.


[GENERAL] Differences between to_tsquery() and plainto_tsquery() when it comes to applying an index?

2010-08-25 Thread Denis Papathanasiou

As a follow-up to my question from this past Saturday
(http://archives.postgresql.org/pgsql-general/2010-08/msg00624.php), I
experimented with adding two types of indices to the same text column:

CREATE INDEX item_eng_searchable_text_idx ON item USING
gin(to_tsvector('english', searchable_text));

and

CREATE INDEX item_searchable_text_idx ON item (searchable_text);

Running my queries under explain, I noticed that queries of the form:

select pk from item where searchable_text @@ plainto_tsquery('search
phrase');

actually run *slower* with the item_eng_searchable_text_idx index applied.

But when I switch the query to use to_tsquery() instead, e.g. something
like this:

select pk from item where searchable_text @@ to_tsquery('phrase');

The performance is better.

Is this because the gin/to_tsvector() index works differently for
to_tsquery() compared to plainto_tsquery()?

If so, how can I create an effective index for queries that will use
plainto_tsquery() ?

Note that I need the plainto_tsquery() form b/c my search phrases will
correspond to exact blocks of text, and therefore they will contain
spaces and punctuation, etc.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] initdb fails to allocate shared memory

2010-08-25 Thread A.M.

On Aug 25, 2010, at 11:31 AM, Joshua D. Drake wrote:

> On Wed, 2010-08-25 at 11:15 -0400, A.M. wrote:
>> I am using pgsql9.0b4 (but pgsql8.4 exhibits the same behavior) on MacOS 
>> 10.6.4 and initdb fails:
>> initdb: removing data directory "/Volumes/Data/pgsql90b"
>> 
>> I would like to create the database space and then reduce the shared memory 
>> requirements in postgresql.conf, but this situation seems to create a 
>> chicken-and-egg problem. How can I reduce shared_buffers or max_connections 
>> prior to running initdb?
> 
> If you don't have enough shared memory to initdb, you don't have enough
> to run postgresql. You need to increase your shared memory for MacOS
> per:
> 
> http://www.postgresql.org/docs/8.4/static/kernel-resources.html
> 
> And then initdb.

Then it seems that the error reporting could be improved to not mention
"shared_buffers" and "max_connections", neither of which I can touch during
initdb.

"creating template1 database in /Volumes/Data/pgsql90b/base/1 ... FATAL:  could 
not create shared memory segment: Cannot allocate memory
DETAIL:  Failed system call was shmget(key=1, size=1703936, 03600).
HINT:  This error usually means that PostgreSQL's request for a shared memory 
segment exceeded available memory or swap space. To reduce the request size 
(currently 1703936 bytes), reduce PostgreSQL's shared_buffers parameter 
(currently 50) and/or its max_connections parameter (currently 14).
The PostgreSQL documentation contains more information about shared 
memory configuration."

Cheers,
M
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Optimizing queries that use multiple tables and many order by columns

2010-08-25 Thread Wappler, Robert
On 2010-08-25, Joshua Berry wrote:
 
> Hi Group,
> 
> I've never really learned how to optimize queries that join
> several tables and have order by clauses that specify columns
> from each table. Is there documentation that could help me
> optimize and have the proper indexes in place? I've read
> through the PG Docs Chapter 11 on Indexes yet still lack the
> needed understanding.
> 
> Here's my latest culprit:
> 
> select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
> JOB.CompanyCode, Anl.SampleName
> from analysis anl join job on anl.job = job.job
> order by job.companycode, anl.job, anl.lab
> limit 10;
> 

Could you try to remove the limit clause? I have seen several times
that it may slow down a query. Although I haven't tested it, an
OFFSET 0 clause can improve the situation, IIRC.

From an algebraic point of view, I cannot see obvious inefficiencies.
Others, who know the internals of pg better, might see more.

> Here's the query plan using PG 8.4.4:
> Limit  (cost=21990.24..21990.27 rows=10 width=32)
>   ->  Sort  (cost=21990.24..22437.69 rows=178979 width=32)
>         Sort Key: job.companycode, anl.lab
>         ->  Hash Join  (cost=451.20..18122.57 rows=178979 width=32)
>               Hash Cond: (anl.job = job.job)
>               ->  Seq Scan on analysis anl  (cost=0.00..14091.79 rows=178979 width=23)
>               ->  Hash  (cost=287.20..287.20 rows=13120 width=17)
>                     ->  Seq Scan on job  (cost=0.00..287.20 rows=13120 width=17)
> 
> 
> If I change the above query to only order by one of the
> tables, I get better results:
> select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn,
> JOB.CompanyCode, Anl.SampleName
> from analysis anl join job on anl.job = job.job
> order by job.companycode --, anl.job, anl.lab
> limit 10;
> Limit  (cost=0.00..3.65 rows=10 width=32)
>   ->  Nested Loop  (cost=0.00..65269.13 rows=178979 width=32)
>         ->  Index Scan using job_companycode on job  (cost=0.00..972.67 rows=13120 width=17)
>         ->  Index Scan using analysis_job_lab on analysis anl  (cost=0.00..4.63 rows=22 width=23)
>               Index Cond: (anl.job = job.job)
> 

Those are estimated query plans; what does EXPLAIN ANALYZE say? The query
plans above do not execute the query; instead they just make a rough
guess about the costs. Reality might be different. Also, you may want to
run VACUUM ANALYZE before.

> Any idea on how I can improve this? In the past I would tend
> to create a cached copy of the query as a table that would be
> utilized, but I suspect that there is a better way to go
> about this. I'm using a system (Clarion) which heavily uses
> cursors via the ODBC driver (I use the psqlODBC latest
> version) to get a handful of records at a time, so no actual
> LIMITs would be used in the production queries; I've added
> the LIMITs here to try to simulate the performance
> differences that I find when browsing the data while ordering
> by the above columns.
> 
> 
> Here are the relevant tables and indexes:
> 
> 
> CREATE TABLE job
> (
>   job bigint NOT NULL, -- Job #
>   companycode character(4), -- Company Code
>   recdbycode character(3), -- Initials of who checked in sample(s)
>   datein date, -- Date sample was received
>   project character varying, -- Project or Site name
>   remarks text, -- Remarks
>   --[CONSTRAINTs etc]
> )
> 
> CREATE INDEX job_companycode
>   ON job
>   USING btree
>   (companycode);
> CREATE INDEX job_companycode_job
>   ON samples.job
>   USING btree
>   (companycode, job);
> 

The index job_companycode is not used in the plans. Moreover, it is
redundant given the second index, as companycode is that index's leading
column.
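I.e., assuming the plans stay as shown, the single-column index could
simply be dropped (verify against your workload first):

DROP INDEX job_companycode;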

> CREATE TABLE analysis
> (
>   lab bigint NOT NULL, -- Lab number
>   job bigint, -- Job number
>   sampletype character varying(5), -- General class of sample
>   priority character(1), -- Priority level
>   samplename character varying, -- Sample or Well name
>   CONSTRAINT rel_joblabk_to_jobk FOREIGN KEY (job)
>   REFERENCES job (job) MATCH SIMPLE
>   ON UPDATE CASCADE ON DELETE RESTRICT,
>   --[CONSTRAINTs etc]
> )
> 
> CREATE INDEX analysis_companycode_job_lab
>   ON analysis
>   USING btree
>   (companycode, job, lab);
> 
> CREATE INDEX analysis_job_lab
>   ON analysis
>   USING btree
>   (job, lab);
> 

Maybe the planner decides on a merge join if there are sorted indexes
for anl.job and job.job. But the speed-up may vary depending on the
data.

> Thanks for any insights and tips you can provide!
> 
> Kind Regards,
> -Joshua Berry
> 
> 
>
 
HTH

-- 
Robert...
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unexpected message type 0x58 during COPY from stdin

2010-08-25 Thread Thom Brown
On 25 August 2010 16:25, bricklen  wrote:
> Hello,
>
> I'm getting a strange error during reload of one of our databases. For the
> past several weeks, we have been seeing the following error regularly, but
> not necessarily with the same table. The pg_restore aborts after this error,
> resulting in an incomplete reload. At first I thought it might be related to
> our upgrade to 8.4.4 several weeks ago so we downgraded to 8.4.2, but the
> error still occurs. Interestingly, if I run the restore command manually
> (from the command line), it works fine -- only in cron does it fail.
>
> uname -a
> Linux ike.nbinternal.com 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST
> 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> version()
> PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
> 20071124 (Red Hat 4.1.2-42), 64-bit
>
> -- dump file is created with:
> pg_dump --format=custom $DATABASE $EXCLUDE_TABLES >
> ~/backups/daily-backup.dump
>
> -- restore command:
>
> pg_restore -d nbaffnet -F c -j 4 -v -C --no-data-for-failed-tables
> /var/lib/pgsql/daily-backup.dump
>
>
> -- error:
> 2010-08-25 04:16:46 PDT [2401]: [1-1] (user=postgres) ERROR:  unexpected
> message type 0x58 during COPY from stdin
> 2010-08-25 04:16:46 PDT [2401]: [2-1] (user=postgres) CONTEXT:  COPY
> afffraud, line 1
> 2010-08-25 04:16:46 PDT [2401]: [3-1] (user=postgres) STATEMENT:  COPY
> afffraud (id, idaff, maxkey, maxvalue) FROM stdin;
> 2010-08-25 04:16:46 PDT [2401]: [4-1] (user=postgres) LOG:  could not send
> data to client: Broken pipe
> 2010-08-25 04:16:46 PDT [2401]: [5-1] (user=postgres) STATEMENT:  COPY
> afffraud (id, idaff, maxkey, maxvalue) FROM stdin;
> 2010-08-25 04:16:46 PDT [2401]: [6-1] (user=postgres) LOG:  unexpected EOF
> on client connection
>
> Note: we have been doing these backups and restores for at least a year and
> a half without any problems, and we can't think of anything that might have
> changed.
>

Just seeking the obvious, but is the file you've shown being output
with pg_dump the same one you're using on pg_restore?  The paths are
different, although if that's on 2 different machines, that would make
perfect sense.

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] initdb fails to allocate shared memory

2010-08-25 Thread Joshua D. Drake
On Wed, 2010-08-25 at 11:15 -0400, A.M. wrote:
> I am using pgsql9.0b4 (but pgsql8.4 exhibits the same behavior) on MacOS 
> 10.6.4 and initdb fails:
> initdb: removing data directory "/Volumes/Data/pgsql90b"
> 
> I would like to create the database space and then reduce the shared memory 
> requirements in postgresql.conf, but this situation seems to create a 
> chicken-and-egg problem. How can I reduce shared_buffers or max_connections 
> prior to running initdb?

If you don't have enough shared memory to initdb, you don't have enough
to run postgresql. You need to increase your shared memory for MacOS
per:

http://www.postgresql.org/docs/8.4/static/kernel-resources.html

And then initdb.
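For example, the page above suggests /etc/sysctl.conf entries along these
lines for MacOS (values illustrative; they take effect on reboot):

kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024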

Sincerely,

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] unexpected message type 0x58 during COPY from stdin

2010-08-25 Thread bricklen
Hello,

I'm getting a strange error during reload of one of our databases. For the
past several weeks, we have been seeing the following error regularly, but
not necessarily with the same table. The pg_restore aborts after this error,
resulting in an incomplete reload. At first I thought it might be related to
our upgrade to 8.4.4 several weeks ago so we downgraded to 8.4.2, but the
error still occurs. Interestingly, if I run the restore command manually
(from the command line), it works fine -- only in cron does it fail.

uname -a
Linux ike.nbinternal.com 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST
2009 x86_64 x86_64 x86_64 GNU/Linux

version()
PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20071124 (Red Hat 4.1.2-42), 64-bit

-- dump file is created with:
pg_dump --format=custom $DATABASE $EXCLUDE_TABLES >
~/backups/daily-backup.dump

-- restore command:

pg_restore -d nbaffnet -F c -j 4 -v -C --no-data-for-failed-tables
/var/lib/pgsql/daily-backup.dump


-- error:
2010-08-25 04:16:46 PDT [2401]: [1-1] (user=postgres) ERROR:  unexpected
message type 0x58 during COPY from stdin
2010-08-25 04:16:46 PDT [2401]: [2-1] (user=postgres) CONTEXT:  COPY
afffraud, line 1
2010-08-25 04:16:46 PDT [2401]: [3-1] (user=postgres) STATEMENT:  COPY
afffraud (id, idaff, maxkey, maxvalue) FROM stdin;
2010-08-25 04:16:46 PDT [2401]: [4-1] (user=postgres) LOG:  could not send
data to client: Broken pipe
2010-08-25 04:16:46 PDT [2401]: [5-1] (user=postgres) STATEMENT:  COPY
afffraud (id, idaff, maxkey, maxvalue) FROM stdin;
2010-08-25 04:16:46 PDT [2401]: [6-1] (user=postgres) LOG:  unexpected EOF
on client connection

Note: we have been doing these backups and restores for at least a year and
a half without any problems, and we can't think of anything that might have
changed.


Cheers,

Bricklen


[GENERAL] initdb fails to allocate shared memory

2010-08-25 Thread A.M.
I am using pgsql9.0b4 (but pgsql8.4 exhibits the same behavior) on MacOS 10.6.4 
and initdb fails:

/usr/local/pgsql90beta/bin/initdb -D /Volumes/Data/pgsql90b/ -E UTF8
The files belonging to this database system will be owned by user "agentm".
This user must also own the server process.

The database cluster will be initialized with locale en_US.UTF-8.
The default text search configuration will be set to "english".

creating directory /Volumes/Data/pgsql90b ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 400kB
creating configuration files ... ok
creating template1 database in /Volumes/Data/pgsql90b/base/1 ... FATAL:  could 
not create shared memory segment: Cannot allocate memory
DETAIL:  Failed system call was shmget(key=1, size=1703936, 03600).
HINT:  This error usually means that PostgreSQL's request for a shared memory 
segment exceeded available memory or swap space. To reduce the request size 
(currently 1703936 bytes), reduce PostgreSQL's shared_buffers parameter 
(currently 50) and/or its max_connections parameter (currently 14).
The PostgreSQL documentation contains more information about shared 
memory configuration.
child process exited with exit code 1
initdb: removing data directory "/Volumes/Data/pgsql90b"

I would like to create the database space and then reduce the shared memory 
requirements in postgresql.conf, but this situation seems to create a 
chicken-and-egg problem. How can I reduce shared_buffers or max_connections 
prior to running initdb?

Cheers,
M


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature proposal

2010-08-25 Thread Denis BUCHER

On 25.08.2010 09:15, wstrzalka wrote:

I'm currently playing with very large data imports using COPY from
file.

As this can be an extremely long operation (hours in my case), a nice
feature would be some option to show operation progress - how many
rows were already imported.

Or maybe there is some way to do it? As long as Postgres has no read-
uncommitted, I think I can estimate it only by the destination table size??


By the way, did you try to optimize your PostgreSQL server?

In my case I was able to reduce a big data update from:
1 hour 15 minutes
to:
5 minutes

Without even changing any line of data or SQL code!

Incredible, isn't it?

Denis

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] recovery fail

2010-08-25 Thread Denis BUCHER

On 25.08.2010 15:15, Łukasz Bieniek wrote:

I have made a backup of my DB with pg_dump ('pg_dump -U username
dbname >backupfilename.sql.gz'). When I want to restore it (I drop the old
database, create a new one and add the language to it, and I type the command
'cat backupfilename.sql.gz|gunzip|/usr/local/pgsql/bin/psql -U username dbname'),
the backup restores all my tables but it loses relations to some tables, and
while it is restoring the system sends me messages about missing relations. I
think it is about the order of the recovered data in the backup. How do I
resolve this problem?


Same version of PostgreSQL (backup and restore)?
If not, some small adaptations are always necessary...

What do you get as errors or warnings ?

Denis

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] recovery fail

2010-08-25 Thread Łukasz Bieniek
I have made a backup of my DB with pg_dump ('pg_dump -U username dbname
>backupfilename.sql.gz'). When I want to restore it (I drop the old database,
create a new one and add the language to it, and I type the command 'cat
backupfilename.sql.gz|gunzip|/usr/local/pgsql/bin/psql -U username dbname'),
this backup restores all my tables but it loses relations to some tables, and
while it is restoring the system sends me messages about missing relations. I
think it is about the order of the recovered data in the backup. How do I
resolve this problem?




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Optimizing queries that use multiple tables and many order by columns

2010-08-25 Thread Joshua Berry
Hi Group,

I've never really learned how to optimize queries that join several tables
and have order by clauses that specify columns from each table. Is there
documentation that could help me optimize and have the proper indexes in
place? I've read through the PG Docs Chapter 11 on Indexes yet still lack
the needed understanding.

Here's my latest culprit:

select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn, JOB.CompanyCode,
Anl.SampleName
from analysis anl join job on anl.job = job.job
order by job.companycode, anl.job, anl.lab
limit 10;

Here's the query plan using PG 8.4.4:
Limit  (cost=21990.24..21990.27 rows=10 width=32)
  ->  Sort  (cost=21990.24..22437.69 rows=178979 width=32)
Sort Key: job.companycode, anl.lab
->  Hash Join  (cost=451.20..18122.57 rows=178979 width=32)
  Hash Cond: (anl.job = job.job)
  ->  Seq Scan on analysis anl  (cost=0.00..14091.79 rows=178979
width=23)
  ->  Hash  (cost=287.20..287.20 rows=13120 width=17)
->  Seq Scan on job  (cost=0.00..287.20 rows=13120
width=17)


If I change the above query to only order by one of the tables, I get better
results:
select Anl.Priority, Anl.Lab, Anl.Job, JOB.DateIn, JOB.CompanyCode,
Anl.SampleName
from analysis anl join job on anl.job = job.job
order by job.companycode --, anl.job, anl.lab
limit 10;
Limit  (cost=0.00..3.65 rows=10 width=32)
  ->  Nested Loop  (cost=0.00..65269.13 rows=178979 width=32)
->  Index Scan using job_companycode on job  (cost=0.00..972.67
rows=13120 width=17)
->  Index Scan using analysis_job_lab on analysis anl
(cost=0.00..4.63 rows=22 width=23)
  Index Cond: (anl.job = job.job)


Any idea on how I can improve this? In the past I would tend to create a
cached copy of the query as a table that would be utilized, but I suspect
that there is a better way to go about this. I'm using a system (Clarion)
which heavily uses cursors via the ODBC driver (I use the psqlODBC latest
version) to get a handful of records at a time, so no actual LIMITs would be
used in the production queries; I've added the LIMITs here to try to
simulate the performance differences that I find when browsing the data
while ordering by the above columns.


Here are the relevant tables and indexes:


CREATE TABLE job
(
  job bigint NOT NULL, -- Job #
  companycode character(4), -- Company Code
  recdbycode character(3), -- Initials of who checked in sample(s)
  datein date, -- Date sample was received
  project character varying, -- Project or Site name
  remarks text, -- Remarks
  --[CONSTRAINTs etc]
)

CREATE INDEX job_companycode
  ON job
  USING btree
  (companycode);

CREATE INDEX job_companycode_job
  ON job
  USING btree
  (companycode, job);


CREATE TABLE analysis
(
  lab bigint NOT NULL, -- Lab number
  job bigint, -- Job number
  sampletype character varying(5), -- General class of sample
  priority character(1), -- Priority level
  samplename character varying, -- Sample or Well name
  CONSTRAINT rel_joblabk_to_jobk FOREIGN KEY (job)
  REFERENCES job (job) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE RESTRICT,
  --[CONSTRAINTs etc]
)

CREATE INDEX analysis_companycode_job_lab
  ON analysis
  USING btree
  (companycode, job, lab);


CREATE INDEX analysis_job_lab
  ON analysis
  USING btree
  (job, lab);


Thanks for any insights and tips you can provide!

Kind Regards,
-Joshua Berry


Re: [GENERAL] Feature proposal

2010-08-25 Thread Wappler, Robert
On 2010-08-25, wstrzalka wrote:
 
> I'm currently playing with a very large data import using COPY from a
> file.
> 
> As this can be an extremely long operation (hours in my case), a nice
> feature would be an option to show the operation's progress - how many
> rows have already been imported.
> 
> Or maybe there is some way to do it already? As long as Postgres has no
> read-uncommitted, I think I can only estimate it from the destination
> table's size?

Hi,
 
You can use tools like pv for a progress bar and pipe the output into
psql.
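
For example, a minimal sketch (the file, database, and table names are
placeholders):

pv /path/to/data.csv | psql -d mydb -c "COPY mytable FROM STDIN WITH CSV"

pv shows throughput and a progress bar based on how much of the file has
been read so far.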

HTH
-- 
Robert...
 




Re: [GENERAL] How to setup PostgreSQL to work with libpam-pgsql/libnss-pgsql2?

2010-08-25 Thread Craig Ringer
On 25/08/10 14:18, PMC OS wrote:
> Good morning,
> 
>> -Original Message-
>> Von: Craig Ringer 
>> Honestly, in most cases you'll be much better off managing
>> authentication with LDAP. It's a better design for the nature of
>> authentication and user data management, where it has to handle lots
>> of
>> small read queries and only very rare writes. It also has better
>> replication.
> 
> We are only 20 people in total and do not need to handle several hundred
> or a thousand requests in a short time.
> 
> Also, since we do much more with the database, we need it anyway, and LDAP
> would get its data from PostgreSQL... I do not like to maintain two systems
> at once that can do the same job.

Well, fair enough then. Personally with that many people I'd certainly
want to use LDAP (for lower response latencies if nothing else), but
each to their own.

You'll probably want to use nscd on the client machine(s) to take some
of the load off Pg.

> I have now installed slapd on my OMAP L138, but it has crashed the kernel,
> and I can no longer boot the server because it tries to init slapd and crashes.

That's ... surprising.

Kernel panic? Or is it just that slapd is crashing?

> How does this manage the user accounts and their homes?
> It does not seem to create $HOME and copy the files from /etc/skel, which I
> have already prepared...

Most likely the same way you'll be doing it with pam auth against
postgresql: pam_mkhomedir. It has a decent man page.
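
For example, a typical session line, assuming a Debian-style
/etc/pam.d/common-session (the options shown are illustrative):

session required pam_mkhomedir.so skel=/etc/skel umask=0022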

> I have not found this package 
> 
> apt-cache show ldap-auth-client
> W: Unable to locate package ldap-auth-client
> E: No packages found

My bad. Looks like it's an Ubuntu extension, just a metapackage that
pulls in libnss-ldap and libpam-ldap and provides a bit of config
support for them.

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/



Re: [GENERAL] Win32 Backend Crash - pre-existing shared memory block is still in use

2010-08-25 Thread Magnus Hagander
On Wed, Aug 25, 2010 at 07:34, Tom Lane  wrote:
> I wrote:
>> Cutting his value for shared_buffers (currently about 800MB) might be
>> wise too.  I'm not sure what the effectively available address space
>> for a win32 process is, but if there's any inefficiency in the way
>> the address space is laid out, those numbers could be enough to be
>> trouble.
>
> Actually, a bit of googling turns up this:
>
> http://msdn.microsoft.com/en-us/library/aa366778(VS.85).aspx
>
> which says that the available userspace address range for a win32
> process is only *two* gig (although you can get to three using tricks
> that I doubt are in his PG build).  Take 800M+500M off the top, and it's

Correct, we don't set ourselves as large address aware.

Hmm. I wonder if we even do that with the 64-bit build. I'm pretty
sure I tried with shared_buffers > 4GB, but now that I see that page,
I think I need to re-verify that :-)
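
For reference, a conservative postgresql.conf sketch for a 32-bit Windows
build (the values are illustrative assumptions, not recommendations from
this thread):

shared_buffers = 256MB        # stay well under the 2GB per-process address space
work_mem = 4MB                # allocated per sort/hash operation, per backend
maintenance_work_mem = 64MB   # used by VACUUM, CREATE INDEX, etc.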


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[GENERAL] Feature proposal

2010-08-25 Thread wstrzalka
I'm currently playing with a very large data import using COPY from a
file.

As this can be an extremely long operation (hours in my case), a nice
feature would be an option to show the operation's progress - how many
rows have already been imported.

Or maybe there is some way to do it already? As long as Postgres has no
read-uncommitted, I think I can only estimate it from the destination
table's size?
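
One rough way to watch it from another session, assuming the table started
empty ('mytable' is a placeholder for the destination table):

SELECT pg_size_pretty(pg_relation_size('mytable'));

This reports the heap size rather than a row count, but repeated calls give
a usable progress indication for a bulk COPY.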



Re: [GENERAL] IBATIS support for postgres cursors

2010-08-25 Thread Guy Rouillier

On 8/23/2010 12:11 PM, atul.g...@globaldatapoint.com wrote:

Hi,

Can cursors be returned to iBATIS? If so, how? This is what I have written:

[the iBATIS mapping XML was stripped from the archived message]
This does not work. Please help.


There is a dedicated mailing list for MyBatis (renamed from iBATIS) over 
at http://www.mybatis.org.  This question would probably be better 
addressed there.  I use both PG and MyBatis so I saw your question.  REF 
is not an acceptable MyBatis jdbcType.  I'm trying with type OTHER as 
documented here:


http://jdbc.postgresql.org/documentation/83/callproc.html#callproc-resultset-setof

But I'm running into an issue which I'll pursue on the MyBatis mailing 
list.  Note that with the current refcursor implementation, there is no 
advantage over returning SETOF from your function.  Both materialize the 
entire result set before returning to the caller.  So, if you can't get 
refcursor to work, I'd suggest switching to SETOF.
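
A minimal SETOF sketch, with placeholder names (here 'items' stands in for
whatever table or row type you return):

CREATE OR REPLACE FUNCTION get_items()
RETURNS SETOF items AS $$
    SELECT * FROM items;
$$ LANGUAGE sql;

-- callers can then treat the function like a table:
SELECT * FROM get_items();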


I'll get back to you when I find out why OTHER is not working.  Or else 
you can join the MyBatis mailing list.


--
Guy Rouillier
