Re: [GENERAL] timestamp with time zone tutorial
Hey Tom,

I was trying to use 'US/Pacific-New' as my long, unabbreviated timezone and it wasn't working. I thought Postgres wasn't accepting the unabbreviated, geopolitical, daylight-savings-time time zones. It turns out the server I was on (not my own box) didn't have that file in the '/usr/share/zoneinfo/US' directory. My bad.

I just have to read more on how to get a timestamp out relative to a different time zone than the one it went in with. I'll find it.

Dennis Gearon

--- On Sun, 7/19/09, Tom Lane t...@sss.pgh.pa.us wrote:

From: Tom Lane t...@sss.pgh.pa.us
Subject: Re: [GENERAL] timestamp with time zone tutorial
To: Dennis Gearon gear...@sbcglobal.net
Cc: pgsql-general@postgresql.org
Date: Sunday, July 19, 2009, 8:00 PM

Dennis Gearon gear...@sbcglobal.net writes:
> What I want is to be able to insert into my project's database times given
> by anybody anywhere on the planet (the SUBMITTER), adding the appropriate
> timezone in the INSERT statement so that it is stored 'GMT/UTC' neutral.
> I believe this is the way Postgres does it, storing times in GMT.

So just do it. What do you need to_timestamp for? The basic timestamptz input function is far more likely to get it right for input that's not in a precise predetermined format.
regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] First query very slow. Solutions: memory, or settings, or SQL?
On Monday 20 July 2009 06:45:40 Phoenix Kiula wrote:

> explain analyze select * from sites where user_id = 'phoenix' order by id desc limit 10;
>
>                                       QUERY PLAN
> ------------------------------------------------------------------------------------
>  Limit  (cost=344.85..344.87 rows=10 width=262) (actual time=5879.069..5879.167 rows=10 loops=1)
>    ->  Sort  (cost=344.85..345.66 rows=323 width=262) (actual time=5879.060..5879.093 rows=10 loops=1)
>          Sort Key: id
>          ->  Index Scan using new_idx_sites_userid on sites  (cost=0.00..331.39 rows=323 width=262) (actual time=44.408..5867.557 rows=2178 loops=1)
>                Index Cond: ((user_id)::text = 'phoenix'::text)
>  Total runtime: 5879.414 ms
> (6 rows)

The row estimate for the index scan is off (323 estimated vs. 2178 actual). Try reanalyzing, or increase the statistics target.
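A hedged sketch of what "reanalyzing, or increase the statistics target" could look like in practice; the value 1000 is purely illustrative (the default target was 10 up through 8.3, 100 in 8.4):

```sql
-- Recompute planner statistics for the table:
ANALYZE sites;

-- If the row estimate is still far off, collect a larger sample for the
-- skewed column, then re-analyze:
ALTER TABLE sites ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE sites;

-- Re-check whether the estimated row count now matches the actual one:
EXPLAIN ANALYZE
SELECT * FROM sites WHERE user_id = 'phoenix' ORDER BY id DESC LIMIT 10;
```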
Re: [GENERAL] timestamp with time zone tutorial
> I just have to read more on how to get it out relative to a different
> time zone than it went in. I'll find it.

Sounds like a job for SELECT ... AT TIME ZONE ...;

Karsten
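A minimal sketch of Karsten's suggestion (the table and column names here are made up for illustration):

```sql
-- timestamptz values are stored normalized to UTC; AT TIME ZONE renders
-- one in whatever zone the reader wants, independent of the input zone.
SELECT created_at AT TIME ZONE 'US/Pacific'    AS pacific_time,
       created_at AT TIME ZONE 'Europe/Berlin' AS berlin_time
FROM events;
```

Note that applying AT TIME ZONE to a timestamptz yields a plain timestamp, i.e. the wall-clock reading in the requested zone.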
Re: [GENERAL] Best practices for moving UTF8 databases
Phoenix Kiula wrote:
> Really, PG absolutely needs a way to upgrade the database without so
> much data related downtime and all these silly woes. Several competing
> database systems are a cinch to upgrade.

I'd call it data corruption, not a silly woe. I know that Oracle, for example, would not make that much fuss about your data: it would be imported without even a warning, and depending on your encoding settings the bad bytes would either be imported as-is or tacitly changed to inverted (or normal) question marks.

It's basically a design choice that PostgreSQL made: we think that an error is preferable to clandestinely modifying the user's data or accepting input that cannot possibly make any sense when it is retrieved at a future time.

> Anyway this is the annoying error I see as always:
> ERROR: invalid byte sequence for encoding UTF8: 0x80
> I think my old DB is all utf8. If there are a few characters that are
> not, how can I work with this? I've done everything I can to take care
> of the encoding and such. This code was used to initdb:
> initdb --locale=en_US.UTF-8 --encoding=UTF8
> Locale environment variables are all en_US.UTF-8 too.

0x80 makes me think of the following: the data originate from a Windows system, where 0x80 is a Euro sign. Somehow these were imported into PostgreSQL without the appropriate translation into UTF-8 (how, I do not know).

I wonder: why do you spend so much time complaining instead of simply locating the buggy data and fixing them? This does not incur any downtime (you can fix the data in the old database before migrating), and it will definitely enhance the fun your users have with your database (if they actually see Euros where they should be).

Yours,
Laurenz Albe
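One possible way to act on Laurenz's "locate the buggy data" advice before migrating, assuming the old database uses the permissive SQL_ASCII encoding and the bad bytes are Windows-1252 Euro signs (table and column names are hypothetical):

```sql
-- Find rows whose text contains the raw 0x80 byte.  convert_to() yields
-- the raw bytes of the string; decode('80', 'hex') is the single byte 0x80.
SELECT ctid, title
FROM articles
WHERE position(decode('80', 'hex') IN convert_to(title, 'SQL_ASCII')) > 0;
```

Rows found this way can then be corrected by hand (or with replace()) in the old database, after which the dump should load cleanly into the UTF8 cluster.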
Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug
Craig Ringer wrote:
>> The drawback is that some of the side effects of the INSERT occur
>> before the constraint check fails, so it seems to me that I still
>> need to perform the select.
>
> If you really can't afford the INSERT side effects and can't redesign
> your code to be tolerant of them, you can always lock the table before
> an INSERT.

I wonder what the side effects of an INSERT could be when the error it causes is caught in a plpgsql exception block.

What happens behind the scenes when an exception is caught is that the transaction is rolled back to a savepoint that was set at the beginning of the block. So all changes to the database that were caused as side effects of the INSERT, for example by triggers, will be rolled back as well.

The only side effects that would remain could be caused by things that go outside the limits of the database, e.g. modifying files on the database server or performing trigger-based replication to another database. Everything that is not under MVCC control, for short. Is that the problem here?

On another line: the original question asked was how can I tell an error that is caused by incomplete isolation from another error?

If you have a code segment like

    SELECT count(id) INTO i2 FROM a WHERE id = i;
    IF i2 = 0 THEN
        INSERT INTO a (id) VALUES (i);
    END IF;

then you can be certain that any unique_violation thrown here must be a serialization problem (if the only unique constraint is on id).

Yours,
Laurenz Albe
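Laurenz's test-then-insert fragment, embedded in a complete PL/pgSQL block that also traps the unique_violation raised when a concurrent transaction wins the race — a sketch only; the table `a` and its single unique constraint on `id` come from his example:

```sql
CREATE OR REPLACE FUNCTION insert_if_absent(i integer) RETURNS void AS
$$
DECLARE
    i2 integer;
BEGIN
    SELECT count(id) INTO i2 FROM a WHERE id = i;
    IF i2 = 0 THEN
        INSERT INTO a (id) VALUES (i);
    END IF;
EXCEPTION
    WHEN unique_violation THEN
        -- With a single unique constraint on id, reaching this handler
        -- means a concurrent transaction inserted the same id between our
        -- SELECT and INSERT: a serialization conflict, not a data error.
        -- Callers may simply retry.
        NULL;
END;
$$ LANGUAGE plpgsql;
```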
Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug
* Craig Ringer:

> The test program, attached, demonstrates what I should've known in the
> first place. In SERIALIZABLE isolation, the above is *guaranteed* to
> fail every time there's a conflict, because concurrent transactions
> cannot see changes committed by the others. So is a SELECT test then
> separate INSERT, by the way.

Yes, I forgot to mention that you can't use SERIALIZABLE if you use this approach.

> Given that, it seems to me you'll have to rely on Pg's internal
> lower-level synchronization around unique indexes. Try the insert and
> see if it fails, then ROLLBACK TO SAVEPOINT (or use a PL/PgSQL
> exception block). As you noted, this does mean that certain
> side-effects may occur, including:
>
> - advancement of sequences due to nextval(...) calls
>
> - triggers that've done work that can't be rolled back, eg dblink
>   calls, external file writes, inter-process communication etc

It's also the cost of producing the input data for the INSERT.

> (You might want to use the two-argument form of the advisory locking
> calls if your IDs are INTEGER size not INT8, and use the table oid for
> the first argument.)

Locking on a hash value could also be an option (it's how concurrent hash tables are sometimes implemented).

> Also: Is this really a phantom read? Your issue is not that you read a
> record that then vanishes or no longer matches your filter criteria;
> rather, it's that a record is created that matches your criteria after
> you tested for it.

It's the INSERT which performs the phantom read.

> And is SQL's definition of serializability really different from the
> textbook one? Certainly that wouldn't be possible if the concurrent
> transactions were actually executed serially, but does the standard
> actually require that this be the case? If it does, then compliant
> implementations would have to do predicate locking. Ouch. Does anybody
> do that?

You don't need predicate locking here. You just have to lock on the gap in the index you touched. I think some implementations do this (InnoDB calls it next-key locking).

--
Florian Weimer <fwei...@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
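Craig's parenthetical about the two-argument advisory-lock form could look like the sketch below; the retry and transaction plumbing are omitted, and the id value is arbitrary:

```sql
-- Serialize inserts for one integer id without locking the whole table.
-- The two-argument form takes two int4 keys; using the table's OID as
-- the first key keeps ids of different tables from colliding.
SELECT pg_advisory_lock('a'::regclass::oid::int4, 42);

INSERT INTO a (id) VALUES (42);

SELECT pg_advisory_unlock('a'::regclass::oid::int4, 42);
```

Advisory locks are held until explicitly released (or until the session ends), so a caller must be careful to unlock on every code path.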
Re: [GENERAL] [EDIT] Timestamp indicies not being used!
On 19 Jul 2009, at 12:39, Pedro Doria Meunier wrote:

> Hash: SHA1
>
> Hi All,
> I *really* need some help with this one...
> I have a table with ~18M rows with a 'timestamp with time zone' column.
> It's indexed thus:
>
> CREATE INDEX my_table_timestamp_idx
>   ON my_table
>   USING btree (zulu_timestamp);
>
> Whenever I issue a command like:
>
> SELECT speed, digital_input_1, digital_input_2, digital_input_3,
>   digital_input_4, priority FROM my_table
>   WHERE id='123456789012345' AND zulu_timestamp < '2009-07-10 15:24:45+01'
>   ORDER BY zulu_timestamp DESC LIMIT 1
>
> it takes FOREVER in instances where there's only 1 row or 0 rows in the
> table [EDIT]for a date equal to the one being compared[/EDIT]
>
> The column id is also indexed. This is the query plan:
>
> Limit  (cost=0.00..83.88 rows=1 width=20) (actual time=810784.212..810784.212 rows=0 loops=1)
>   ->  Index Scan Backward using my_table_timestamp_idx on my_table  (cost=0.00..3706639.95 rows=44192 width=20) (actual time=810784.210..810784.210 rows=0 loops=1)
>         Index Cond: (zulu_timestamp < '2009-07-10 15:24:45+01'::timestamp with time zone)
>         Filter: (id = '123456789012345'::bpchar)
> Total runtime: 810808.298 ms

From this plan it appears the planner statistics aren't up to date, or the statistics size on the timestamp column is too small, as the expected number of rows (44192) doesn't match the actual number (0) at all. Some experimenting with ANALYSE and column statistics should tell whether this is indeed the problem. That said, statistics are most useful for common cases; they're usually not very accurate for exceptions, so playing around with those may not give the desired results.

What happens in the above query plan is that the planner scans a large part of the rows referred to by the timestamp index (namely all those before the specified timestamp) to find any rows matching the id. There are a few things you could do about that:

1) You could specify a lower boundary for the timestamps. The way you're going about it, the longer your application runs the more rows will match your zulu_timestamp < '2009-07-10 15:24:45+01' expression. It seems likely that you know that the timestamp is at least in 2009-10, for example, reducing the matching rows by a lot once your application has been running for several months.

2) You could define a multi-column index instead of two separate indexes. Which column should be first depends on which column you query on most frequently, but I expect it to be (id, zulu_timestamp). With such an index the matching rows are known to be in the index, and thus looking them up should be a lot faster.

Finally, sending every message as urgent is not going to help you. It's like this: http://en.wikipedia.org/wiki/The_boy_who_cried_wolf

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
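Alban's two suggestions, sketched against the table from the original post (the '2009-07-01' lower bound is an illustrative cutoff, not a recommendation):

```sql
-- (2) One multi-column index serves this query directly: the scan can
--     jump straight to the rows for this id in timestamp order.
CREATE INDEX my_table_id_timestamp_idx
    ON my_table (id, zulu_timestamp);

-- (1) A lower bound on the timestamp keeps the scanned range small even
--     as the table keeps growing:
SELECT speed, digital_input_1, digital_input_2, digital_input_3,
       digital_input_4, priority
FROM my_table
WHERE id = '123456789012345'
  AND zulu_timestamp <  '2009-07-10 15:24:45+01'
  AND zulu_timestamp >= '2009-07-01 00:00:00+01'
ORDER BY zulu_timestamp DESC
LIMIT 1;
```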
Re: [GENERAL] [EDIT] Timestamp indicies not being used!
Thank you Alban for your considerations.

The issue has already been marked as [SOLVED] following Sam's suggestion.

And FYI the issue *was* urgent and the wolf *was* biting my leg! :]

BR,
Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam

Alban Hertroys wrote:
> [full quote of Alban's earlier message snipped]
Re: [GENERAL] Best practices for moving UTF8 databases
On Mon, Jul 20, 2009 at 10:32:15AM +0800, Phoenix Kiula wrote:
> Thanks Martin. I tried searching through the archives and could only
> come up with something like this:
> http://docs.moodle.org/en/UTF-8_PostgreSQL
> But this only has the usual iconv stuff suggested. Could you pls
> suggest some specific scripts or URLs? Or let me know the keywords I
> should look for?

I was thinking of this one:

http://www.mail-archive.com/pgsql-hack...@postgresql.org/msg68541.html

You can use it to find the bogus strings in your existing database and fix them. It's not really fast, but maybe it'll help.

Have a nice day,
--
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/

Please line up in a tree and maintain the heap invariant while boarding.
Thank you for flying nlogn airlines.
Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug
* Albe Laurenz:

> The original question asked was: how can I tell an error that is
> caused by incomplete isolation from another error? If you have a code
> segment like
>
>     SELECT count(id) INTO i2 FROM a WHERE id = i;
>     IF i2 = 0 THEN
>         INSERT INTO a (id) VALUES (i);
>     END IF;
>
> then you can be certain that any unique_violation thrown here must be
> a serialization problem (if the only unique constraint is on id).

I want to put this into a library, so I'd like something foolproof. Right now, user code sets a flag which basically says that the wrapper should retry the transaction a few times if a unique_violation is detected, but I'd like to get rid of that because it's one thing less the programmer needs to worry about.

--
Florian Weimer <fwei...@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99
Re: [GENERAL] Full text search in PostgreSQL 8.4
Hello,

I recently upgraded to version 8.4 and now full text search with the russian configuration is not working:

template1=# create database test encoding='win1251';
test=# create table test (test varchar(255));
test=# insert into test values ('тест');
test=# select * from test where to_tsvector('russian', test) @@ to_tsquery('russian', 'тест');

In version 8.3 I get a result:

 test
------
 тест
(1 запись)

In 8.4 I get this notice and 0 rows, whatever the table values are:

NOTICE:  text-search query contains only stop words or doesn't contain lexemes, ignored
 test
------
(0 rows)

Why might it not be working in 8.4?

Thanks

Hi Konstantin,

I ran your tests with 8.3 and 8.4 and get the expected result:

postgres=# \c test
psql (8.4.0)
You are now connected to database "test".
test=# create table test (test varchar(255));
CREATE TABLE
test=# insert into test values ('тест');
INSERT 0 1
test=# select * from test where to_tsvector('russian', test) @@ to_tsquery('russian', 'тест');
 test
------
 тест
(1 row)

I have a clean installation - meaning the dictionaries are not edited. After inserting тест into the russian.stop file, I can reproduce your case:

test=# select * from test where to_tsvector('russian', test) @@ to_tsquery('russian', 'тест');
NOTICE:  text-search query contains only stop words or doesn't contain lexemes, ignored
 test
------
(0 rows)

If you had given the column test the data type tsvector, probably no value would have been inserted. Just try it with another column and see if you can insert тест into that column. This is just an idea ...

Cheers
Andy
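A quick way to check Andy's stop-word theory on your own installation is to inspect how the text-search machinery classifies the word (the NOTICE in the original output is the giveaway):

```sql
-- If 'тест' is listed in russian.stop, to_tsvector drops it and the
-- result is an empty tsvector:
SELECT to_tsvector('russian', 'тест');

-- ts_debug shows which dictionary consumed the token and what lexemes
-- (if any) it produced:
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('russian', 'тест');
```

An empty `lexemes` array for the token confirms it is being treated as a stop word by the configured dictionary.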
[GENERAL] Re: PostgreSQL Databse Migration to the Latest Version and Help for Database Replication.
On Jul 20, 9:29 am, dmag...@gmail.com (Chris) wrote:
> I have tried RubyRep and Bucardo, but none of these have a stable rpm
> release for the CentOS version of Linux.

Just to mention this: rubyrep comes with a bundled package.

http://www.rubyrep.org/installation.html

Just unzip and it will work*. No need to hunt down libraries or anything like that.

*: rubyrep requires JDK >= 1.5. But that's it.

Best Regards,
Arndt
Re: [GENERAL] Understanding sequential versus index scans.
Yes, I had done UNION. UNION ALL achieves the expected plan and speed! Thank you!

BTW, this is interesting, because there are only about 5 or 6 rows max returned from both queries - but I guess the planner expects more, and hence changes the plan to remove duplicates.

On Sun, Jul 19, 2009 at 9:05 PM, Scott Marlowe scott.marl...@gmail.com wrote:
> On Sun, Jul 19, 2009 at 6:10 PM, Robert James srobertja...@gmail.com wrote:
>> UNION was better, but still 5 times as slow as either query done
>> individually. set enable_seqscan=off didn't help at all - it was
>> totally ignored. Is there anything else I can do?
>
> Did you try union, or union all?
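The difference, sketched: UNION implies a duplicate-elimination step (a sort or hash) over the combined result, while UNION ALL does not — which is why UNION ALL matched the speed of the individual queries here. The table and column names below are placeholders:

```sql
-- UNION adds a Unique/Sort (or HashAggregate) node over both inputs:
SELECT id FROM t WHERE a = 1
UNION
SELECT id FROM t WHERE b = 2;

-- UNION ALL simply concatenates the two result sets, duplicates retained:
SELECT id FROM t WHERE a = 1
UNION ALL
SELECT id FROM t WHERE b = 2;
```

Running EXPLAIN on each form shows the extra node in the UNION plan.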
[GENERAL] Documentation Improvement suggestions
Two small suggestions that might make it easier for newcomers to take advantage of the wonderful database:

1. Googling Postgres docs produces links for all different versions. This is because incoming links are to different versions. Besides being confusing, it pushes the pages lower in Google and makes it harder to find them. Could the website offer a link to the 'current' version, whichever it is? Eg instead of just:

http://www.postgresql.org/docs/8.1/static/creating-cluster.html

have:

http://www.postgresql.org/docs/current/static/creating-cluster.html

which would keep all incoming links pointed to the current page.

2. The 'SQL' in 'PostgreSQL' is hard to say and type. Everyone drops it (even this list!). Why not change the official name? Again, it would make googling and naming things easier.
Re: [GENERAL] timestamp with time zone tutorial
On Sunday 19 July 2009 10:59:24 pm Dennis Gearon wrote:
> Hey Tom, I was trying to use 'US/Pacific-New' as my long, unabbreviated
> timezone and it wasn't working. [...] I just have to read more on how to
> get it out relative to a different time zone than it went in. I'll find
> it. Dennis Gearon

For me it is easier to think of the time zones as a formatting option rather than a data storage attribute. The simple explanation is that it always goes in as UTC. The more complicated explanation follows.

The tz data types are stored as UTC. The time zone information is used on input to make the correct offset from the specified tz to UTC. On output the procedure is reversed. Since the data is stored as UTC, you get to choose whatever time zone you want it displayed in by using the appropriate setting or function.

The issue is whether it is important to know the tz that was used for the input offset. That would require a separate field. You can search the archives for the many discussions that have occurred on this point.

--
Adrian Klaver
akla...@comcast.net
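Adrian's "formatting option" view can be demonstrated in a few lines of psql; this is a sketch, and the literal values are arbitrary:

```sql
CREATE TEMP TABLE ts_demo (t timestamptz);

-- The input carries an explicit offset; storage is normalized to UTC.
INSERT INTO ts_demo VALUES ('2009-07-19 20:00:00-07');

-- The same stored instant, rendered under two different display zones:
SET timezone = 'UTC';
SELECT t FROM ts_demo;          -- 2009-07-20 03:00:00+00

SET timezone = 'US/Pacific';
SELECT t FROM ts_demo;          -- 2009-07-19 20:00:00-07
```

Nothing about the stored value changes between the two SELECTs; only the session's display zone does.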
[GENERAL] Understanding INNER JOIN versus IN subquery
I have two queries which should be equivalent. The planner plans them differently, although they both take about the same time. Can someone explain why?

select word from dict where word in
  (select substr('moon', 0, generate_series(3, length('moon'))));

select * from dict inner join
  (select substr('moon', 0, generate_series(3, length('moon')))) as m
  on dict.word = m.substr;

Is one preferred?
Re: [GENERAL] Documentation Improvement suggestions
On Mon, 20 Jul 2009 09:34:00 -0400 Robert James srobertja...@gmail.com wrote:

> Could the website offer a link to the 'current' version, whichever it
> is? Eg instead of just:
> http://www.postgresql.org/docs/8.1/static/creating-cluster.html
> have:
> http://www.postgresql.org/docs/current/static/creating-cluster.html
> which would keep all incoming links pointed to the current page.

Did you actually try the link you proposed? There is a page behind it... and it actually links to the 8.4 docs.

People still need the old docs, so you can't just make docs for previous versions unavailable... and once they are available, they will be indexed. Unfortunately, "current" doesn't mean "most used".

Actually, what I'd enjoy is a link to the interactive version, and maybe aggregating all the comments of previous versions into the new version (indicating which versions they come from).

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] Understanding sequential versus index scans.
On Mon, Jul 20, 2009 at 2:22 PM, Robert James <srobertja...@gmail.com> wrote:
> BTW, this is interesting, because there are only about 5 or 6 rows max
> returned from both queries - but I guess the planner expects more and
> hence changes the plan to remove duplicates.

If you sent the plans for the various attempts we might be able to explain what's going on.

--
greg
http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] Understanding INNER JOIN versus IN subquery
On Mon, Jul 20, 2009 at 2:37 PM, Robert James <srobertja...@gmail.com> wrote:
> I have two queries which should be equivalent. The Planner plans them
> differently, although they are both about the same time. Can someone
> explain why?

Uhm, please post the two plans and the server version. I know you've posted them before, but that was in another thread that I don't have handy.

The short answer is that the server doesn't know that the subquery is definitely going to produce distinct results, so it doesn't know the inner join won't produce duplicates. So it can't tell that these queries might be equivalent.

--
greg
http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] Understanding INNER JOIN versus IN subquery
On Mon, Jul 20, 2009 at 2:37 PM, Robert James <srobertja...@gmail.com> wrote:
> I have two queries which should be equivalent. The Planner plans them
> differently, although they are both about the same time. Can someone
> explain why?
>
> select word from dict where word in
>   (select substr('moon', 0, generate_series(3, length('moon'))));
>
> select * from dict inner join
>   (select substr('moon', 0, generate_series(3, length('moon')))) as m
>   on dict.word = m.substr;
>
> Is one preferred?

It is hard to say from your example. But in my general experience, I have noticed a similar thing many times. Needless to say, I personally prefer to choose JOINs instead of IN(), because those tend to be faster on PostgreSQL.
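One way to see Greg's point from earlier in the thread in action: if the subquery is forced to be provably duplicate-free, the join form becomes set-equivalent to the IN form. A sketch using the queries from this thread:

```sql
-- IN is a semi-join: each dict row appears at most once even if the
-- subquery yields duplicate prefixes.
SELECT word FROM dict
WHERE word IN (SELECT substr('moon', 0, generate_series(3, length('moon'))));

-- A plain join could multiply rows on duplicate prefixes; adding DISTINCT
-- removes that possibility.
SELECT dict.*
FROM dict
INNER JOIN (SELECT DISTINCT substr('moon', 0,
                                   generate_series(3, length('moon')))) AS m(substr)
        ON dict.word = m.substr;
```

Whether the planner then produces identical plans is version-dependent; comparing EXPLAIN output for both forms is the way to check.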
Re: [GENERAL] Documentation Improvement suggestions
Robert James escribió:
> Two small suggestions that might make it easier for newcomers to take
> advantage of the wonderful database:
>
> 1. Googling Postgres docs produces links for all different versions.
> [...] Could the website offer a link to the 'current' version,
> whichever it is? [...] which would keep all incoming links pointed to
> the current page.

Maybe we should offer a robots.txt file that told crawlers to only index the current version of the docs, not the version-specific ones.

> 2. The 'SQL' in 'PostgreSQL' is hard to say and type. Everyone drops
> it (even this list!). Why not change the official name? Again, it
> would make googling and naming things easier.

This is a taboo topic which has created the largest holy wars I've seen in this project. Please don't raise it. If you're interested, search the archives.

--
Alvaro Herrera                  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
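For what it's worth, Alvaro's robots.txt idea might look something like this — a sketch only, assuming versioned docs live under /docs/<version>/ (the real site's URL layout would need checking):

```
User-agent: *
Disallow: /docs/7.4/
Disallow: /docs/8.0/
Disallow: /docs/8.1/
Disallow: /docs/8.2/
Disallow: /docs/8.3/
# Leave /docs/current/ (and the newest release's pages) crawlable.
```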
Re: [GENERAL] Documentation Improvement suggestions
On Mon, Jul 20, 2009 at 5:14 PM, Alvaro Herrera <alvhe...@commandprompt.com> wrote:
> Maybe we should offer a robots.txt file that told crawlers to only
> index the current version of the docs, not the version-specific ones.

I don't see this as a good idea; if you want to google only a specific version, you can use the ''site'' expression to filter your results:

site:http://www.postgresql.org/docs/8.4/ foo
Re: [GENERAL] Documentation Improvement suggestions
On Mon, Jul 20, 2009 at 11:14:27AM -0400, Alvaro Herrera wrote:
> Robert James escribió:
>> Could the website offer a link to the 'current' version, whichever it
>> is? Eg instead of just:
>> http://www.postgresql.org/docs/8.1/static/creating-cluster.html
>> have:
>> http://www.postgresql.org/docs/current/static/creating-cluster.html
>> which would keep all incoming links pointed to the current page.
>
> Maybe we should offer a robots.txt file that told crawlers to only
> index the current version of the docs, not the version-specific ones.

I know it's not easy, but a nice option to me would be if the 8.1 docs page linked to the equivalent page in the other versions. That would avoid the need to manually edit the URL after a google search.

Oh, and +10 for the "Up" link at the top of the page also.

Have a nice day,
--
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/

Please line up in a tree and maintain the heap invariant while boarding.
Thank you for flying nlogn airlines.
Re: [GENERAL] Checkpoint Tuning Question
On Mon, Jul 13, 2009 at 3:53 PM, Dan Armbrust daniel.armbrust.l...@gmail.com wrote: So this thought leads to a couple of other things Dan could test. First, see if turning off full_page_writes makes the hiccup go away. If so, we know the problem is in this area (though still not exactly which reason); if not we need another idea. That's not a good permanent fix though, since it reduces crash safety. The other knobs to experiment with are synchronous_commit and wal_sync_method. If the stalls are due to commits waiting for additional xlog to get written, then async commit should stop them. I'm not sure if changing wal_sync_method can help, but it'd be worth experimenting with. regards, tom lane All of my testing to date has been done with synchronous_commit=off. I just tried setting full_page_writes=off - and like magic, the entire hiccup went away. Thanks, Dan I haven't done any other testing for this issue since discovering that setting full_page_writes=off makes my performance hiccup go away. I'm about to move on to some different tests and reset this setup that I was using to test out this issue - is there anything else that anyone would like to have gathered before I axe this system? I can redo it again later too, it will just take a fair bit longer for me to reconfigure the test environment. Thanks, Dan -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
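For reference, the knobs discussed in this thread are ordinary postgresql.conf settings. A sketch of the diagnostic configuration Dan describes (not anything posted verbatim in the thread); note that turning off full_page_writes trades away crash safety, so per Tom this is for experimentation only, not a recommended permanent fix:

```
# postgresql.conf -- diagnostic settings from this thread
synchronous_commit = off      # Dan's tests already ran with this
full_page_writes = off        # made the hiccup go away; reduces crash safety
#wal_sync_method = fdatasync  # another knob Tom suggested experimenting with
```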
[GENERAL] xml to string, ascii x utf8 conversion
Hello, I'm having a problem with conversion while using the function query_to_xml. query_to_xml('select Nazev as TITLE, Datum as DATE, Autor_Akce as meta rel=''action_author'', gave me

<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<TITLE>Test</TITLE>
<DATE>2009-07-20</DATE>
<meta_x0020_rel_x003D__x0027_action_author_x0027_>test</meta_x0020_rel_x003D__x0027_action_author_x0027_>

How can I transcode this well-generated XML (ASCII) to a string (UTF8) without the escape characters? I found nothing on Google... Regards, Radek Novotny -- www.publicstream.cz - we have built a product for live broadcasts (online streaming) and a video archive in FLASH format. Bc. Radek Novotný, managing director, Mediawork group s.r.o. tel.: +420 724 020 361 email: radek.novo...@mediawork.cz http://www.mediawork.cz
Re: [GENERAL] suggestion: log_statement = sample
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 In my experience, I've found that enabling full logging for a short time (perhaps a few hours) gathers enough data to run through tools like pgFouine and find problem areas. It is not possible for us. Logging millions of statements take too much time. Too much time to do what? Where is the bottleneck? But you dont run the real use cases with automated tests. There so many factors involved in real time: caching, concurrency, data, peaktime, deadlocks, doubleclicks, robots etc. that you just can't reproduce it on a development system without lots of effort. Agreed. Additionally, we make sure our production systems have enough hardware behind them that we can add additional tasks without it affecting production use. that's nice, but not everybody can afford it. Of course i would love to log every statement. But do you really log every statement in production? I guess not. We have clients that do, and we always recommend it for all of our clients. It can get very expensive (hint: ship them off with syslog over an internal network to a dedicated box), but being able to see exactly what happened in your database when something goes wrong after the fact is invaluable. Plus we can then run tools like pgsi (http://bucardo.org/pgsi/) to get nice statistics and find weak spots in the application code. Again: for my use case it makes sense to have a log_sample feature. Perhaps, but I don't think you've quite overcome the 'log everything' counter-argument. I don't think log_sample is necessarily a bad idea, but I doubt it is going to happen soon, and certainly not in time to put on your production system in the next year or so. Perhaps you can attack this from another angle, such as using some sort of filter or script to grab every X lines from the logs and discard the rest while in full logging mode. That would have the advantage of being fully backwards-compatible and usable today. 
- -- Greg Sabino Mullane g...@turnstep.com End Point Corporation PGP Key: 0x14964AC8 200907201256 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
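Greg's closing suggestion (grab every X lines from the logs while in full logging mode and discard the rest) can be sketched as a one-line post-processing filter. The sampling interval and the file names below are placeholders, not anything from the thread:

```shell
# Keep every 100th line of a full-statement log, discard the rest.
# N=100 and the file names are hypothetical placeholders.
awk 'NR % 100 == 1' postgresql-full.log > postgresql-sampled.log
```

This keeps full logging's backwards compatibility: the server still logs everything, and the sampling happens offline, so nothing needs to change in PostgreSQL itself.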
Re: [GENERAL] suggestion: log_statement = sample
In response to Greg Sabino Mullane g...@turnstep.com: -BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 In my experience, I've found that enabling full logging for a short time (perhaps a few hours) gathers enough data to run through tools like pgFouine and find problem areas. It is not possible for us. Logging millions of statements take too much time. This is a ridiculous statement. In actual practice, full query logging is 1/50 the amount of disk I/O as the actual database activity. If your systems are so stressed that they can't handle another 2% increase, then you've got bigger problems lurking. Have you benchmarked the load it creates under your workload? But you dont run the real use cases with automated tests. There so many factors involved in real time: caching, concurrency, data, peaktime, deadlocks, doubleclicks, robots etc. that you just can't reproduce it on a development system without lots of effort. Logging a sample query every arbitrary number of queries isn't a real use case either, thus your counter argument makes no sense. I suggested the test system because it's a good compromise. Additionally, we make sure our production systems have enough hardware behind them that we can add additional tasks without it affecting production use. that's nice, but not everybody can afford it. Of course i would love to log every statement. But do you really log every statement in production? I guess not. Try reading my responses instead of guessing. We run full query logging on occasion to get a sample of real usage on our production systems. As a result, we know that our servers can handle the load. We're working on the infrastructure to manage the amount of data so we can do it all the time (we don't currently have enough disk space). Overall, it seems like you've decided that you want this feature and nothing else will do. If that's the case, then just go ahead and write it. 
-- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] suggestion: log_statement = sample
On Mon, 2009-07-20 at 13:24 -0400, Bill Moran wrote: In response to Greg Sabino Mullane g...@turnstep.com: -BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 In my experience, I've found that enabling full logging for a short time (perhaps a few hours) gathers enough data to run through tools like pgFouine and find problem areas. It is not possible for us. Logging millions of statements take too much time. This is a ridiculous statement. No it isn't. In actual practice, full query logging is 1/50 the amount of disk I/O as the actual database activity. If your systems are so stressed that they can't handle another 2% increase, then you've got bigger problems lurking. It depends on the system. I have seen even big systems take a huge hit by full logging due to transactional velocity. Joshua D. Drake -- PostgreSQL - XMPP: jdr...@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] suggestion: log_statement = sample
On Jul 20, 2009, at 10:24 AM, Bill Moran wrote: In response to Greg Sabino Mullane g...@turnstep.com: -BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 In my experience, I've found that enabling full logging for a short time (perhaps a few hours) gathers enough data to run through tools like pgFouine and find problem areas. It is not possible for us. Logging millions of statements take too much time. This is a ridiculous statement. No, it isn't. In actual practice, full query logging is 1/50 the amount of disk I/O as the actual database activity. If your systems are so stressed that they can't handle another 2% increase, then you've got bigger problems lurking. That's really not true. Ever, probably, but certainly in my situation. A lot of my inserts are large text fields (10k - 1M). Many of those are never read again. If I want to log all statements or slow statements then a lot of what's logged is those large inserts. There's currently no way to log the statement without logging all the data inserted, so the log traffic produced by each insert is huge. Logging via syslog on Solaris I've had reports of that slowing the machine down to unusability. (I'm fairly sure I know why, and I suspect you can guess too, but this is on customer machines). The hit due to logging can be huge, even on fairly overpowered systems. There are a lot of limitations in the current logging system when it comes to capturing data for performance analysis, and that's one of them. There's certainly significant room for improvement in the logging system - some of that can be added externally, but some of it really needs to be done within the core code. Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] suggestion: log_statement = sample
In response to Joshua D. Drake j...@commandprompt.com: On Mon, 2009-07-20 at 13:24 -0400, Bill Moran wrote: In response to Greg Sabino Mullane g...@turnstep.com: -BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 In my experience, I've found that enabling full logging for a short time (perhaps a few hours) gathers enough data to run through tools like pgFouine and find problem areas. It is not possible for us. Logging millions of statements take too much time. This is a ridiculous statement. No it isn't. In actual practice, full query logging is 1/50 the amount of disk I/O as the actual database activity. If your systems are so stressed that they can't handle another 2% increase, then you've got bigger problems lurking. It depends on the system. I have seen even big systems take a huge hit by full logging due to transactional velocity. Perhaps I'm just in a foul mood today, but I feel like people are picking my emails apart to make me the bad guy. Note that you trimmed out a key part of what I said here: Have you benchmarked the load it creates under your workload? Perhaps I should have explicitly said, There are likely some cases where this statement isn't true, so you should benchmark your specific load case, but you don't mention that you've done that. Or, perhaps I should just abstain from posting to mailing lists when I'm in a foul mood ... -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] suggestion: log_statement = sample
On Mon, 2009-07-20 at 13:46 -0400, Bill Moran wrote: In response to Joshua D. Drake j...@commandprompt.com: It depends on the system. I have seen even big systems take a huge hit by full logging due to transactional velocity. Perhaps I'm just in a foul mood today, but I feel like people are picking Possibly. :) my emails apart to make me the bad guy. Note that you trimmed out a key part of what I said here: Right. I didn't have issue with the rest of your email, just with the idea that what the person said was ridiculous. Have you benchmarked the load it creates under your workload? Which is certainly a valid question. Perhaps I should have explicitly said, There are likely some cases where this statement isn't true, so you should benchmark your specific load case, but you don't mention that you've done that. Or, perhaps I should just abstain from posting to mailing lists when I'm in a foul mood ... It has certainly gotten me in trouble more than once. There is a reason they say, Don't go JD. Joshua D. Drake -- PostgreSQL - XMPP: jdr...@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Log timings on Windows 64
Hey all, I'm looking at Windows, and a couple of quick Googles didn't give me anything. If I set my log_min_duration_statement below 16 ms, I get one of 2 values for my pgbench runs (pretty quick statements): 0, or 16 ms (sometimes 15.999). If I benchmark some other way (via my app) things look the same. If I set my log_line_prefix to %m vs. %t, then the time increments in 16 ms chunks. I'm guessing this is an OS timer-granularity limitation on Windows. Does anybody else have experience here (or know of a previous bug)? Thanks --Scott
Re: [GENERAL] best practice transitioning from one datatype to another
I'm stumped -- at least for an easy way to do this. When referencing the uniqueidentifier data type in PostgreSQL 8.1 (now uuid, because of an UPDATE to the pg_ tables) in function definitions in schemas not in the search path, one must reference the data type as public.uniqueidentifier (ahem, public.uuid). This was done because the query planner couldn't with 100% certainty determine type equality (i.e. for foreign key constraints, comparisons in WHERE clauses...), so it decided to use sequential scans where index scans would have been most appropriate. When I reload to PostgreSQL 8.4, it doesn't understand public.uuid. So, the only way I can think of to get those functions back into the database is to use a restore list, and keep them from being inserted in the first place. Then, one-at-a-time, re-create them manually. There must be a better way, though! I'm sure this is only the second of several more hurdles to overcome before I'm finished with the transition. Your wisdom will be appreciated! CG From: Tom Lane t...@sss.pgh.pa.us To: CG cgg...@yahoo.com Cc: pgsql-general@postgresql.org Sent: Wednesday, July 15, 2009 7:05:47 PM Subject: Re: [GENERAL] best practice transitioning from one datatype to another CG cgg...@yahoo.com writes: While transitioning from 8.1 to 8.4, I need to transition to the internal UUID type in place of the contrib/uniqueidentifier module. I've built the database around uniqueidentifier, so nearly every table has one column of that data type. It's going to be tedious to ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE uuid; ...repeat 600 times... I'll also have to drop and reload the views and the rules on tables. It'll be tedious even if the tables have no data in them. Can anyone recommend a better/faster way to make the transition? Couldn't you rename the type to uuid in the 8.1 database before you dump? regards, tom lane
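Tom's rename-before-dump suggestion would look roughly like the sketch below. This is an assumption, not tested against 8.1: it presumes ALTER TYPE ... RENAME TO is accepted on the 8.1 side, and "olddb" is a placeholder database name:

```shell
# Hypothetical sketch: rename the contrib type in the 8.1 database first,
# so the dump references "uuid" and resolves to 8.4's built-in type on restore.
psql olddb -c "ALTER TYPE public.uniqueidentifier RENAME TO uuid;"
pg_dump olddb > olddb.sql   # dump now mentions uuid instead of uniqueidentifier
```

The remaining wrinkle the poster describes (functions that say public.uuid explicitly) would still need the schema qualification removed, since 8.4's uuid lives in pg_catalog, not public.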
[GENERAL] Server Backup: pg_dump vs pg_dumpall
Hey, I'm writing a backup script. Right now, I only have one database on my postgresql server. I'm deciding if I should use pg_dump or pg_dumpall when backing up the server. As far as I can tell, pg_dumpall cannot compress the dumps automatically and it only dumps data in the standard SQL text file format. This means that I would not be able to use pg_restore to selectively restore the database, correct? What is *NOT* included in a pg_dump that IS in pg_dumpall (Other than all the databases)? Things like user-defined functions and datatypes? Roles? Views? I was leaning towards pg_dumpall, but then I realized that it only dumps in the standard SQL text file format, and it cannot be compressed automatically. Thanks for any advice.
Help needed for reading postgres log : RE: [GENERAL] Concurrency issue under very heavy loads
I would like some help in reading the postgres logs. Here is a snippet of the log. Auto commit seems to be set to false. But still the logs shows CommitTransactionCommand in debug mode. The same order number is given for multiple clients. Please see CommitTransactionCommand below for both select ...for update and update... SQLs and let me know if I am reading correctly that auto commit is actually effective. Thanks Raji - 2009-07-17 14:10:59 4736 970134 DEBUG: parse unnamed: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: StartTransactionCommand 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: parse tree: 2009-07-17 14:10:59 4736 970134 DETAIL: {QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt :resultRelation 0 :into :intoOptions :intoOnCommit 0 :intoTableSpaceName :hasAggs false :hasSubLinks false :rtable ( {RTE :alias :eref {ALIAS :aliasname tableentityid :colnames (entitytype nextentityid) } :rtekind 0 :relid 16420 :inh true :inFromCl true :requiredPerms 6 :checkAsUser 0 } ) :jointree {FROMEXPR :fromlist ( {RANGETBLREF :rtindex 1 } ) :quals {OPEXPR :opno 98 :opfuncid 0 :opresulttype 16 :opretset false :args ( {RELABELTYPE :arg {VAR :varno 1 :varattno 1 :vartype 1043 :vartypmod 68 :varlevelsup 0 :varnoold 1 :varoattno 1 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } {RELABELTYPE :arg {PARAM :paramkind 0 :paramid 1 :paramtype 1043 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } ) } } :targetList ( {TARGETENTRY :expr {VAR :varno 1 :varattno 2 :vartype 1043 :vartypmod 132 :varlevelsup 0 :varnoold 1 :varoattno 2 } :resno 1 :resname nextentityid :ressortgroupref 0 :resorigtbl 16420 :resorigcol 2 :resjunk false } ) :returningList :groupClause :havingQual 
:distinctClause :sortClause :limitOffset :limitCount :rowMarks ( {ROWMARKCLAUSE :rti 1 :forUpdate true :noWait false } ) :setOperations :resultRelations :returningLists } 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: rewritten parse tree: 2009-07-17 14:10:59 4736 970134 DETAIL: ( {QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt :resultRelation 0 :into :intoOptions :intoOnCommit 0 :intoTableSpaceName :hasAggs false :hasSubLinks false :rtable ( {RTE :alias :eref {ALIAS :aliasname tableentityid :colnames (entitytype nextentityid) } :rtekind 0 :relid 16420 :inh true :inFromCl true :requiredPerms 6 :checkAsUser 0 } ) :jointree {FROMEXPR :fromlist ( {RANGETBLREF :rtindex 1 } ) :quals {OPEXPR :opno 98 :opfuncid 0 :opresulttype 16 :opretset false :args ( {RELABELTYPE :arg {VAR :varno 1 :varattno 1 :vartype 1043 :vartypmod 68 :varlevelsup 0 :varnoold 1 :varoattno 1 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } {RELABELTYPE :arg {PARAM :paramkind 0 :paramid 1 :paramtype 1043 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } ) } } :targetList ( {TARGETENTRY
Re: [GENERAL] Server Backup: pg_dump vs pg_dumpall
Hi -- I'm writing a backup script. Right now, I only have one database on my postgresql server. I'm deciding if I should use pg_dump or pg_dumpall when backing up the server. As far as I can tell, pg_dumpall cannot compress the dumps automatically and it only dumps data in the standard SQL text file format. This means that I would not be able to use pg_restore to selectively restore the database, correct? What is *NOT* included in a pg_dump that IS in pg_dumpall (Other than all the databases)? Things like user-defined functions and datatypes? Roles? Views? The pg_dumpall gets users and groups that are not dumped in the single database versions; I used to use pg_dump on each of several databases on a given server and then pg_dumpall with suitable parameters to get only those bits and nothing else. User defined stuff, views etc. are all in a given database so they will be in the pg_dump. HTH, Greg Williamson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] killing processes
What's the generally accepted method for killing processes that went 'all wacky' in postgres? I think I've seen in this group that kill -INT would be the way to go. I'm playing around with different options for a median function. This one got out of hand and was taking too long, so I wanted to kill it: test=# select array_median(array(select t1 from test2 order by 1)); ^CCancel request sent It just sits there; it's been trying to die for half an hour. At the OS level it's taking up 100% of the CPU. I tried kill -INT pid but that didn't help. It's not updating anything, and I'm the only one in the database. Fortunately it's not production, so I don't really care. But if it was production, what would be the method to kill it? (I know about kill -9; I'm assuming that == bad.) If this were production, I'd need to end the process, force a rollback (if necessary) and get my CPU back, so just waiting for it to die really isn't an option... (PostgreSQL 8.3.5, linux/SLES11) Thanks Dave -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Server Backup: pg_dump vs pg_dumpall
On Mon, Jul 20, 2009 at 4:23 PM, APseudoUtopia apseudouto...@gmail.com wrote: Hey, I'm writing a backup script. Right now, I only have one database on my postgresql server. I'm deciding if I should use pg_dump or pg_dumpall when backing up the server. As far as I can tell, pg_dumpall cannot compress the dumps automatically and it only dumps data in the standard SQL text file format. This means that I would not be able to use pg_restore to selectively restore the database, correct? What is *NOT* included in a pg_dump that IS in pg_dumpall (Other than all the databases)? Things like user-defined functions and datatypes? Roles? Views? I was leaning towards pg_dumpall, but then I realized that it only dumps in the standard SQL text file format, and it cannot be compressed automatically. I use pg_dump on databases and pg_dumpall --globals-only to get the user accounts and such. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Concurrency issue under very heavy loads
Here is a snippet of the postgres log. Auto commit seems to be set to false. But still the logs shows CommitTransactionCommand in debug mode. The same order number is given for multiple clients. Please see CommitTransactionCommand below for both select ...for update and update... SQLs and let me know if I am reading correctly that auto commit is actually effective. Thanks Raji - 2009-07-17 14:10:59 4736 970134 DEBUG: parse unnamed: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: StartTransactionCommand 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: parse tree: 2009-07-17 14:10:59 4736 970134 DETAIL: {QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt :resultRelation 0 :into :intoOptions :intoOnCommit 0 :intoTableSpaceName :hasAggs false :hasSubLinks false :rtable ( {RTE :alias :eref {ALIAS :aliasname tableentityid :colnames (entitytype nextentityid) } :rtekind 0 :relid 16420 :inh true :inFromCl true :requiredPerms 6 :checkAsUser 0 } ) :jointree {FROMEXPR :fromlist ( {RANGETBLREF :rtindex 1 } ) :quals {OPEXPR :opno 98 :opfuncid 0 :opresulttype 16 :opretset false :args ( {RELABELTYPE :arg {VAR :varno 1 :varattno 1 :vartype 1043 :vartypmod 68 :varlevelsup 0 :varnoold 1 :varoattno 1 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } {RELABELTYPE :arg {PARAM :paramkind 0 :paramid 1 :paramtype 1043 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } ) } } :targetList ( {TARGETENTRY :expr {VAR :varno 1 :varattno 2 :vartype 1043 :vartypmod 132 :varlevelsup 0 :varnoold 1 :varoattno 2 } :resno 1 :resname nextentityid :ressortgroupref 0 :resorigtbl 16420 :resorigcol 2 :resjunk false } ) :returningList :groupClause :havingQual :distinctClause :sortClause :limitOffset :limitCount 
:rowMarks ( {ROWMARKCLAUSE :rti 1 :forUpdate true :noWait false } ) :setOperations :resultRelations :returningLists } 2009-07-17 14:10:59 4736 970134 STATEMENT: SELECT nextEntityID FROM tableEntityID WHERE entityType = $1 FOR UPDATE 2009-07-17 14:10:59 4736 970134 DEBUG: rewritten parse tree: 2009-07-17 14:10:59 4736 970134 DETAIL: ( {QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt :resultRelation 0 :into :intoOptions :intoOnCommit 0 :intoTableSpaceName :hasAggs false :hasSubLinks false :rtable ( {RTE :alias :eref {ALIAS :aliasname tableentityid :colnames (entitytype nextentityid) } :rtekind 0 :relid 16420 :inh true :inFromCl true :requiredPerms 6 :checkAsUser 0 } ) :jointree {FROMEXPR :fromlist ( {RANGETBLREF :rtindex 1 } ) :quals {OPEXPR :opno 98 :opfuncid 0 :opresulttype 16 :opretset false :args ( {RELABELTYPE :arg {VAR :varno 1 :varattno 1 :vartype 1043 :vartypmod 68 :varlevelsup 0 :varnoold 1 :varoattno 1 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } {RELABELTYPE :arg {PARAM :paramkind 0 :paramid 1 :paramtype 1043 } :resulttype 25 :resulttypmod -1 :relabelformat 2 } ) } } :targetList ( {TARGETENTRY :expr {VAR :varno
Re: [GENERAL] Server Backup: pg_dump vs pg_dumpall
APseudoUtopia wrote: I'm writing a backup script. Right now, I only have one database on my postgresql server. I'm deciding if I should use pg_dump or pg_dumpall when backing up the server. As far as I can tell, pg_dumpall cannot compress the dumps automatically and it only dumps data in the standard SQL text file format. This means that I would not be able to use pg_restore to selectively restore the database, correct? What is *NOT* included in a pg_dump that IS in pg_dumpall (Other than all the databases)? Things like user-defined functions and datatypes? Roles? Views? I was leaning towards pg_dumpall, but then I realized that it only dumps in the standard SQL text file format, and it cannot be compressed automatically. Things that span all databases in your cluster (like your roles) do not get saved with pg_dump. So yes, if you only have one database, pg_dumpall is probably what you should be using. You can always pipe the output of pg_dumpall to your compression program of choice. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] killing processes
On Mon, Jul 20, 2009 at 4:44 PM, David Kerr d...@mr-paradox.net wrote: What's the generally accepted method for killing processes that went 'all wacky' in postgres? I think I've seen in this group that kill -INT would be the way to go. I'm playing around with different options for a median function. This one got out of hand and was taking too long, so I wanted to kill it: test=# select array_median(array(select t1 from test2 order by 1)); ^CCancel request sent It just sits there; it's been trying to die for half an hour. At the OS level it's taking up 100% of the CPU. I tried kill -INT pid but that didn't help. It's not updating anything, and I'm the only one in the database. Fortunately it's not production, so I don't really care. But if it was production, what would be the method to kill it? (I know about kill -9; I'm assuming that == bad.) If this were production, I'd need to end the process, force a rollback (if necessary) and get my CPU back, so just waiting for it to die really isn't an option... (PostgreSQL 8.3.5, linux/SLES11) What's most likely happening is that it's stuck in a tight loop that doesn't check for interrupts, so it just keeps on going. You can kill -9 a process. It'll cause the postmaster to kill all backends and flush the buffers if I remember correctly. Yeah, not the ideal solution in production but on a non-prod machine it's an ok way to get out of these issues. And even in production, it's often much faster to kill -9 a single process than to wait for it to finish. I think there might be choices other than -9 here, but I can't recall them off the top of my head. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Server Backup: pg_dump vs pg_dumpall
APseudoUtopia wrote: Hey, I'm writing a backup script. Right now, I only have one database on my postgresql server. I'm deciding if I should use pg_dump or pg_dumpall when backing up the server. As far as I can tell, pg_dumpall cannot compress the dumps automatically and it only dumps data in the standard SQL text file format. This means that I would not be able to use pg_restore to selectively restore the database, correct? What is *NOT* included in a pg_dump that IS in pg_dumpall (Other than all the databases)? Things like user-defined functions and datatypes? Roles? Views? Views are part of a database, so they are included, and so are user-defined functions. What's not included is, primarily, roles/users. I was leaning towards pg_dumpall, but then I realized that it only dumps in the standard SQL text file format, and it cannot be compressed automatically. pg_dumpall | gzip > dumpfile.sql.gz -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
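The division of labor described in this thread can be sketched as a tiny backup script. The database name and output paths are placeholders; this needs a running server, so it is an illustration rather than something runnable here:

```shell
# Per-database dump in custom format: compressed automatically, and usable
# with pg_restore for selective restores (which plain SQL dumps are not).
pg_dump -Fc mydb -f mydb.dump

# Cluster-wide objects (roles/users) that pg_dump skips; pipe through gzip
# since pg_dumpall only emits plain SQL text.
pg_dumpall --globals-only | gzip > globals.sql.gz
```

Restoring is then pg_restore for the database and psql for the uncompressed globals script.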
Re: [GENERAL] killing processes
On Mon, Jul 20, 2009 at 6:48 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Mon, Jul 20, 2009 at 4:44 PM, David Kerr d...@mr-paradox.net wrote: What's the generally accepted method for killing processes that went 'all wacky' in postgres? I think I've seen in this group that kill -INT would be the way to go. I'm playing around with different options for a median function. This one got out of hand and was taking too long, so I wanted to kill it: test=# select array_median(array(select t1 from test2 order by 1)); ^CCancel request sent It just sits there; it's been trying to die for half an hour. At the OS level it's taking up 100% of the CPU. I tried kill -INT pid but that didn't help. It's not updating anything, and I'm the only one in the database. Fortunately it's not production, so I don't really care. But if it was production, what would be the method to kill it? (I know about kill -9; I'm assuming that == bad.) If this were production, I'd need to end the process, force a rollback (if necessary) and get my CPU back, so just waiting for it to die really isn't an option... (PostgreSQL 8.3.5, linux/SLES11) What's most likely happening is that it's stuck in a tight loop that doesn't check for interrupts, so it just keeps on going. You can kill -9 a process. It'll cause the postmaster to kill all backends and flush the buffers if I remember correctly. That's right -- and if there is any case where built-in functions, SQL, or plpgsql functions get stuck in such a way that a kill -9 is required, it should be reported so that it can be fixed. Trust me, you should avoid kill -9 if at all possible.
Backend C functions you write should check for interrupts at appropriate places, via CHECK_FOR_INTERRUPTS(), so you can respond to cancel requests. For a good example of how this is done, take a look at ITAGAKI's proposed enhancement of dblink and the attached patch here: http://www.nabble.com/query-cancel-issues-in-contrib-dblink-td24214133.html merlin
Re: [GENERAL] killing processes
Scott Marlowe scott.marl...@gmail.com writes: On Mon, Jul 20, 2009 at 4:44 PM, David Kerrd...@mr-paradox.net wrote: I'm playing around with different options for a median function. this one got out of hand and was taking too long, so i wanted to kill it: test=# select array_median(array(select t1 from test2 order by 1)); ^CCancel request sent It just sits there, it's been trying to die for 1/2 an hour. What's most likely happening is that it's stuck in a tight loop that doesn't check for interrupts, so it just keeps on going. Yeah, that was my first thought. What is array_median()? If it's custom C code, you probably forgot to put CHECK_FOR_INTERRUPTS in suitable place(s). You can kill -9 a process. It'll cause the postmaster to kill all backends and flush the buffers if I remember correctly. Yeah, not the ideal solution in production but on a non-prod machine it's an ok way to get out of these issues. And even in production, it's often much faster to kill -9 a single process than to wait for it to finish. I think there might be choices other than -9 here, but I can't recall them off the top of my head. Kill -9 on the postmaster is disrecommended for a number of reasons, but kill -9 on an individual backend is quite predictable. What it will do is cause *all* of your open sessions to get aborted. If you can tolerate that then do it. If you can't tolerate that, you can try a SIGTERM, but I suspect that if the thing is too hosed to notice a cancel (a/k/a SIGINT) then it won't notice SIGTERM either. SIGTERM on a single backend isn't as safe as the other options anyhow. There used to be bugs in that code path (still ARE known bugs, if you're not running a pretty recent PG release) and I wouldn't trust it totally. One thing you should definitely not try is SIGQUIT. That will kill the process all right (unless it's wedged a lot worse than I think) but it will not do anything about cleaning up its shared memory state, and that almost certainly will cause problems later. 
Better to find a time when it's okay to use SIGKILL. regards, tom lane
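[Editor's note: the signals discussed above can also be sent from SQL, without shell access to the server, which maps onto the same advice. A sketch using the standard admin functions (pg_cancel_backend sends SIGINT; pg_terminate_backend, added in 8.4, sends SIGTERM); the PID 12345 is a placeholder, and on 8.3 the activity view columns are procpid/current_query:

-- Find the stuck backend's PID
SELECT procpid, current_query FROM pg_stat_activity;

-- Equivalent of kill -INT <pid>: ask the backend to cancel its current query
SELECT pg_cancel_backend(12345);

-- Equivalent of kill -TERM <pid> (8.4 and later), subject to the same
-- caveats about the SIGTERM code path discussed above
SELECT pg_terminate_backend(12345);

Neither of these will help a backend stuck in a loop that never calls CHECK_FOR_INTERRUPTS(), which is exactly the failure mode in this thread.]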
Re: [GENERAL] killing processes
On Mon, Jul 20, 2009 at 07:18:07PM -0400, Merlin Moncure wrote: - On Mon, Jul 20, 2009 at 6:48 PM, Scott Marlowe scott.marl...@gmail.com wrote: - On Mon, Jul 20, 2009 at 4:44 PM, David Kerr d...@mr-paradox.net wrote: - What's most likely happening is that it's stuck in a tight loop that - doesn't check for interrupts, so it just keeps on going. - - You can kill -9 a process. It'll cause the postmaster to kill all - backends and flush the buffers if I remember correctly. - - That's right -- and if there is any case where built-in functions, - sql, or plpgsql functions get stuck in such a way that a kill -9 is - required, it should be reported so that it can be fixed. Trust me, - you should avoid kill -9 if at all possible. Backend C functions you - write should check interrupts at appropriate places so you can respond - to cancels appropriately with CHECK_FOR_INTERRUPTS(); - - For a good example of how this is done, take a look at ITAGAKI's proposed - enhancement of dblink and the attached patch here: - - http://www.nabble.com/query-cancel-issues-in-contrib-dblink-td24214133.html Thanks guys. I wonder if this is a case for a bug; of course, it's probably not reproducible. But I don't see any coded loop or way for me to insert a signal check.
(I'm not much of a programmer) the function was just:

CREATE OR REPLACE FUNCTION array_median(anyarray)
  RETURNS anyelement AS
$$
  SELECT CASE
           WHEN array_upper($1,1) = 0 THEN null
           WHEN mod(array_upper($1,1),2) = 1 THEN
             asorted[ceiling(array_upper(asorted,1)/2.0)]
           ELSE
             ((asorted[ceiling(array_upper(asorted,1)/2.0)] +
               asorted[ceiling(array_upper(asorted,1)/2.0)+1])/2.0)
         END
  FROM (SELECT ARRAY(SELECT ($1)[n]
                     FROM generate_series(1, array_upper($1, 1)) AS n
                     WHERE ($1)[n] IS NOT NULL
                     ORDER BY ($1)[n]) AS asorted) AS foo;
$$ LANGUAGE 'sql' IMMUTABLE;

(got it from here - thanks to the author =) ) http://www.postgresonline.com/journal/index.php?/archives/67-Build-Median-Aggregate-Function-in-SQL.html Thanks Dave
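[Editor's note: one way to narrow down where the runtime goes is to time the function on synthetic arrays of increasing size, rather than on the real table. A sketch using psql's \timing and generate_series, assuming only the array_median function above:

\timing on
-- Time the median on synthetic arrays of growing size. If runtime grows
-- much faster than linearly with array length, the per-element work inside
-- the function (repeated subscripting and sorting) is the likely culprit.
SELECT array_median(ARRAY(SELECT generate_series(1, 1000)));
SELECT array_median(ARRAY(SELECT generate_series(1, 10000)));
SELECT array_median(ARRAY(SELECT generate_series(1, 100000)));

This also gives a reproducible test case to attach to a bug report if a size is found where cancel requests stop being honored.]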
Re: [GENERAL] Server Backup: pg_dump vs pg_dumpall
John R Pierce wrote: APseudoUtopia wrote: ... I was leaning towards pg_dumpall, but then I realized that it only dumps in the standard SQL text file format, and it cannot be compressed automatically. pg_dumpall | gzip > dumpfile.sql.gz That deals with compression. But if you want to use the new parallel-restore feature in 8.4's pg_restore, which can be *way* faster if you have multiple cores available, then you will need to back up using pg_dump with the custom format (-Fc). Cheers, Steve
Re: Help needed for reading postgres log : RE: [GENERAL] Concurrency issue under very heavy loads
Raji Sridar (raji) wrote: I would like some help in reading the postgres logs. Here is a snippet of the log. Auto commit seems to be set to false, but the logs still show CommitTransactionCommand in debug mode. The same order number is given to multiple clients. Please see CommitTransactionCommand below for both the "select ... for update" and "update ..." SQLs and let me know if I am reading correctly that auto commit is actually in effect. CommitTransactionCommand is an internal function that has nothing to do with a SQL-level COMMIT. If there were a true transaction commit you'd see a debug entry saying CommitTransaction. You seem to be barking up the wrong tree here. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
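[Editor's note: on the underlying symptom (the same order number handed to multiple clients), the usual fix is to make the read-increment-write sequence an explicit transaction, so the row lock taken by FOR UPDATE is held until COMMIT. A minimal sketch with a hypothetical counter table -- the table and column names here are invented for illustration:

-- Hypothetical schema: one row per named counter
-- CREATE TABLE order_counter (name text PRIMARY KEY, next_value integer NOT NULL);

BEGIN;
-- The row lock is held until COMMIT, so concurrent clients serialize here
SELECT next_value FROM order_counter WHERE name = 'order' FOR UPDATE;
UPDATE order_counter SET next_value = next_value + 1 WHERE name = 'order';
COMMIT;

If the client driver is actually autocommitting each statement, the FOR UPDATE lock is released as soon as the SELECT's implicit transaction ends, and two clients can read the same value -- which would reproduce the duplicate-number behavior described above.]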
Re: [GENERAL] First query very slow. Solutions: memory, or settings, or SQL?
On Mon, Jul 20, 2009 at 2:04 PM, Peter Eisentraut pete...@gmx.net wrote: On Monday 20 July 2009 06:45:40 Phoenix Kiula wrote: explain analyze select * from sites where user_id = 'phoenix' order by id desc limit 10;

QUERY PLAN
------------------------------------------------------------------
Limit (cost=344.85..344.87 rows=10 width=262) (actual time=5879.069..5879.167 rows=10 loops=1)
  -> Sort (cost=344.85..345.66 rows=323 width=262) (actual time=5879.060..5879.093 rows=10 loops=1)
       Sort Key: id
       -> Index Scan using new_idx_sites_userid on sites (cost=0.00..331.39 rows=323 width=262) (actual time=44.408..5867.557 rows=2178 loops=1)
            Index Cond: ((user_id)::text = 'phoenix'::text)
Total runtime: 5879.414 ms
(6 rows)

The row estimate for the index scan is off. Try reanalyzing, or increase the statistics target. How did you tell that the row estimate is off? I mean, which numbers? Also, my statistics are already set pretty high. On USER_ID they are at 100. Which columns should I increase the stats on, those in the WHERE clause or those in the SELECT bit? The USER_ID and the USER_KNOWN, which are in the WHERE clause, both have statistics of 100! And USER_KNOWN is just a binary value (0 or 1), so I wonder what purpose stats would serve?
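[Editor's note: "the row estimate is off" refers to the index scan line, where the planner estimated rows=323 but the actual count was rows=2178 -- nearly 7x low. The per-column knob Peter mentions is set with ALTER TABLE and takes effect at the next ANALYZE; a sketch using the table and column names from the plan above:

-- Raise the statistics target for the column in the WHERE clause;
-- the planner's row estimate for the index scan comes from user_id's stats.
ALTER TABLE sites ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE sites;

The columns in the SELECT list don't matter for this; only columns used in WHERE, JOIN, and ORDER BY conditions feed the estimates.]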
Re: [GENERAL] Best practices for moving UTF8 databases
0x80 makes me think of the following: The data originate from a Windows system, where 0x80 is a Euro sign. Somehow these were imported into PostgreSQL without the appropriate translation into UTF-8 (how, I do not know). I wonder: why do you spend so much time complaining instead of simply locating the buggy data and fixing them? I'd love to fix them. But if I do a search for SELECT * FROM xyz WHERE col like '%0x80%' it doesn't work. How should I search for these characters? Thanks much
[GENERAL] commercial adaptation of postgres
I once talked to a company that made a custom version of Postgres. It split tables up by columns and also by rows, and had some other custom features. It was enormously faster, from what I gathered. I could have sworn the name began with the letter 'T', but maybe not. I don't see anything like that on the commercial page of the postgres site. Does anyone know what is out there in enterprise commercially modified postgres servers? (on 64-bit machines, preferably) Dennis Gearon
Re: [GENERAL] commercial adaptation of postgres
On Jul 20, 2009, at 6:56 PM, Dennis Gearon wrote: I once talked to a company that made a custom version of Postgres. It split tables up by columns and also by rows, and had some other custom features. It was enormously faster, from what I gathered. I could have sworn the name began with the letter 'T', but maybe not. I don't see anything like that on the commercial page of the postgres site. Truviso? http://truviso.com/
Re: [GENERAL] commercial adaptation of postgres
On Jul 20, 2009, at 8:56 PM, Dennis Gearon wrote: I once talked to a company that made a custom version of Postgres. It split tables up by columns and also by rows, and had some other custom features. It was enormously faster, from what I gathered. I could have sworn the name began with the letter 'T', but maybe not. I don't see anything like that on the commercial page of the postgres site. Does anyone know what is out there in enterprise commercially modified postgres servers? (on 64-bit machines, preferably) Yahoo did that; they were planning to open source it. That's all I know.
Re: [GENERAL] killing processes
David Kerr d...@mr-paradox.net writes: But, I don't see any coded loop or way for me to insert a signal check. (I'm not much of a programmer) the function was just:

CREATE OR REPLACE FUNCTION array_median(anyarray)
  RETURNS anyelement AS
$$
  SELECT CASE
           WHEN array_upper($1,1) = 0 THEN null
           WHEN mod(array_upper($1,1),2) = 1 THEN
             asorted[ceiling(array_upper(asorted,1)/2.0)]
           ELSE
             ((asorted[ceiling(array_upper(asorted,1)/2.0)] +
               asorted[ceiling(array_upper(asorted,1)/2.0)+1])/2.0)
         END
  FROM (SELECT ARRAY(SELECT ($1)[n]
                     FROM generate_series(1, array_upper($1, 1)) AS n
                     WHERE ($1)[n] IS NOT NULL
                     ORDER BY ($1)[n]) AS asorted) AS foo;
$$ LANGUAGE 'sql' IMMUTABLE;

Huh. How big an array were you trying to invoke it on? regards, tom lane
Re: [GENERAL] memory leak occur when disconnect database
Because of a three-day break, my response is late. Valgrind is a great tool, but you must learn how to identify false positives and tell the difference between a leak that matters (say 1kb allocated and not freed in a loop that runs once per second) and a leak that doesn't. I see the memory leak not only in Valgrind, but also in the output of the top command. At first I thought, based on Valgrind, that the leak occurred when disconnecting from the database, so I wrote a test sample that just connects to and disconnects from the database in an infinite loop to check it. After running for 1 hour, 7 MB of memory had leaked.
Re: [GENERAL] memory leak occur when disconnect database
Because of a three-day break, my response is late. 8.1.8 is pretty old. Also you'll have better luck getting help if you actually include the output from Valgrind. The output from Valgrind was not saved; I will run the test again and capture the result. PS: the memory leak shows up not only in Valgrind, but also in the periodic output of the top command.