Re: [GENERAL] tcp_keepalives_idle ignored

2008-01-13 Thread henry


On Sat, January 12, 2008 6:50 pm, Tom Lane wrote:
 henry [EMAIL PROTECTED] writes:
 I have tried setting tcp_keepalives_idle = 120 (eg), then restarting PG,
 but SHOW ALL; shows tcp_keepalives_idle=0 (ignoring my setting).

 Just FYI, this is the expected behavior on platforms where the kernel
 doesn't allow adjustment of the TCP keepalive parameters from
 userspace.  You didn't say what you are running the server on, but
 the reference to /proc/sys makes me think it's Linux ... which seems
 odd, because modern Linuxen do allow these things to be adjusted.

Yes, this is on Linux - adjusting the param under /proc/sys works as
expected, but doesn't seem to have an effect on my prob.

 [ thinks... ]  Maybe you were inspecting the value in a Unix-socket
 connection instead of a TCP connection?

See below (not sure I understand what you mean by 'inspecting').

 This is all irrelevant to your real problem, to judge by the rest of
 the thread, but I'm curious.

I did in fact find a leak in long-lived procs (some of which can run for
days) - but squashing that did not make my problem go away.  In fact,
these procs are connecting to TCP port 5432 - not a socket
(/tmp/.s.PGSQL.5432); TCP connections to 5432 come and go nicely in sync
with the number of active procs.

The number of /tmp/.s.PGSQL.5432 connections just keeps growing...  I have
no idea what's causing it.

lsof doesn't tell me what's talking to PG through /tmp/.s.PGSQL.5432
either.  Maybe I'm not understanding exactly how /tmp/.s.PGSQL.5432 is
used - what would connect to PG via a domain socket?  procs which don't
explicitly use -p5432, or some other mechanism which I'm ignorant of?

If I could just figure out what the hell's using /tmp/.s.PGSQL.5432 then I
could get a handle on the problem.

Increasing max_connections to cope without understanding what's happening
is an irritation (although that could be what's required in the final
analysis anyway).  It's currently set to 2048, which gets masticated in
several hours (a result which doesn't make sense considering the number of
active procs) - and I haven't really started using the cluster/s to the
full potential yet.

Any suggestions/pointers are greatly appreciated.

Regards
Henry


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] 8.2.4 serious slowdown

2008-01-13 Thread Clodoaldo
2008/1/13, Sim Zacks [EMAIL PROTECTED]:
 How would you rewrite something like:
   WHERE (COALESCE(b.quantity, 0) - COALESCE(b.deliveredsum, 0)) > 0;
 I could write:
 where case when b.quantity is null then 0 else b.quantity end - case when
 b.deliveredsum is null then 0 else b.deliveredsum end > 0

 It is butt ugly, but is that the most efficient way to write it in 8.2.4?

I don't know if the plan would be the same but this is a bit clearer:

WHERE COALESCE(b.quantity, 0) > COALESCE(b.deliveredsum, 0)

Regards, Clodoaldo Pinto Neto

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] 8.2.4 serious slowdown

2008-01-13 Thread Sim Zacks

 I don't know if the plan would be the same but this is a bit clearer:

 WHERE COALESCE(b.quantity, 0) > COALESCE(b.deliveredsum, 0)

That should be true, but sometimes we get deliveries of greater quantity than
we ordered.
I just want to know the times when I haven't gotten the complete order yet.
If we get more than we ordered, I don't want it to be in this query.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] tcp_keepalives_idle ignored

2008-01-13 Thread Martijn van Oosterhout
On Sun, Jan 13, 2008 at 08:40:34AM +0200, henry wrote:
  This is all irrelevant to your real problem, to judge by the rest of
  the thread, but I'm curious.
 
 I did in fact find a leak in long-lived procs (some of which can run for
 days) - but squashing that did not make my problem go away.  In fact,
 these procs are connecting to port TCP 5432 - not a socket
 (/tmp/.s.PGSQL.5432), TCP connections to 5432 come and go nicely in sync
 with the number of active procs.
 
 The number of /tmp/.s.PGSQL.5432 connections just keep growing...  I have
 no idea what's causing it.

Do you have any kind of logging? At the very least pg_stat_activity
should tell you if they're doing anything.
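
For example, a query along these lines shows what each backend is doing (a
sketch against the 8.x pg_stat_activity view; current_query only shows real
query text if command-string collection is enabled, and client_addr is null
for Unix-socket connections, which is itself a useful hint here):

SELECT procpid, usename, datname, client_addr, query_start, current_query
FROM pg_stat_activity
ORDER BY query_start;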

 lsof doesn't tell me what's talking to PG through /tmp/.s.PGSQL.5432
 either.  Maybe I'm not understanding exactly how /tmp/.s.PGSQL.5432 is
 used - what would connect to PG via a domain socket?  procs which don't
 explicitly use -p5432, or some other mechanism which I'm ignorant of?

Maintenance programs? lsof will tell you what's doing it. Try (as
root):

lsof |grep '.s.PGSQL'

That will list a lot of postgres processes, you're looking for the
other ones.

Connecting to the Unix-domain socket happens if you don't specify a host.
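
From the client side the difference is easy to see with psql (a sketch;
"mydb" is just a placeholder, and any libpq client behaves the same way):

psql mydb                  # no -h: goes through /tmp/.s.PGSQL.5432
psql -h localhost mydb     # TCP connection to localhost:5432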

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Those who make peaceful revolution impossible will make violent revolution 
 inevitable.
  -- John F Kennedy


signature.asc
Description: Digital signature


Re: [GENERAL] How to safely compare transaction id?

2008-01-13 Thread alphax

Marko Kreen wroted:

By the way, can I think of the value of the system column ctid of a
record as the logical current version of that record, to be used for
comparison with the txid returned by FUNCTIONS-TXID-SNAPSHOT?



No, it is just the physical location of the row.  

Thanks, I just did some simple tests:

-- initialize
create table test (last_update_tx_id bigint);
insert into test values (txid_current());



-- Every time I do an update, I find that last_update_tx_id is equal to
-- the xmin:

begin;
update test set last_update_tx_id = txid_current();
commit;

select *, test.xmin from test;

So, it seems the system column cmin is the logical current version 
of that record, isn't it?



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] ECPG problem with 8.3

2008-01-13 Thread Michael Meskes
On Fri, Jan 11, 2008 at 11:51:08PM +, Peter Wilson wrote:
 I've just tried compiling our project against the 8.3RC1 code. This is 
 the first time I've tried any release of 8.3.
 ...
 crbembsql.pgC:254: error: invalid conversion from `int' to 
 `ECPG_statement_type'
 crbembsql.pgC:254: error:   initializing argument 6 of `bool ECPGdo(int, 
 int, int, const char*, char, ECPG_statement_type, const char*, ...)'

It seems that some compilers don't like int/enum aliasing here. I
changed this in CVS; could you please re-try? 

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] ECPG problem with 8.3

2008-01-13 Thread Peter Wilson

Michael Meskes wrote:

On Fri, Jan 11, 2008 at 11:51:08PM +, Peter Wilson wrote:
I've just tried compiling our project against the 8.3RC1 code. This is 
the first time I've tried any release of 8.3.

...
crbembsql.pgC:254: error: invalid conversion from `int' to `ECPG_statement_type'
crbembsql.pgC:254: error:   initializing argument 6 of `bool ECPGdo(int, 
int, int, const char*, char, ECPG_statement_type, const char*, ...)'


It seems that some compilers don't like int/enum aliasing here. I
changed this in CVS could you please re-try? 


Michael

Thank you Michael,
that fixes that problem. My build now gets further, but I get an error and a 
seg-fault later in the build.


I have a file that contains the following line :


EXEC SQL FETCH ABSOLUTE :count SEARCHCURSOR INTO
   :db.contact_id, :db.uname, :db.type, :db.parent,
   :db.name, :db.phone, :db.fax, :db.email, :db.description,
   :db.custom_data, :db.value, :db.relevance,
   :db.parentName :vl_parentName,
   :db.keywords :vl_keywords,
   :membOfRecordCount;

this has worked in every version of ECPG since 7.4 (when we started using 
Postgres). I now get the following error :


$ /usr/local/pgsql/bin/ecpg -t -o contactRecord.cxx -I 
/usr/local/pgsql/pgsql/include contactRecord.pgC


Starting program: /usr/local/pgsql.8.3.rc1.patch/bin/ecpg -t -o 
contactRecord.cxx -I /usr/local/pgsql/include contactRecord.pgC

contactRecord.pgC:1338: ERROR: fetch/move count must not be a variable.
gmake[1]: *** [contactRecord.cxx] Segmentation fault
gmake[1]: *** Deleting file `contactRecord.cxx'
gmake[1]: Leaving directory 
`/var/build/whitebeam/templates/pgsql/contacts-pgsql'
gmake: *** [all] Error 2

-
Running under GDB gives a stack trace as :
Program received signal SIGSEGV, Segmentation fault.
0x00bd0da3 in strlen () from /lib/tls/libc.so.6
(gdb) i s 5
#0  0x00bd0da3 in strlen () from /lib/tls/libc.so.6
#1  0x080494b1 in cat2_str (str1=0x969bae0 "fetch", str2=0x0) at preproc.y:105
#2  0x0804955e in cat_str (count=4) at preproc.y:128
#3  0x0805027e in base_yyparse () at preproc.y:2299
#4  0x08067f12 in main (argc=7, argv=0xfef93284) at ecpg.c:457
(gdb) i s
#0  0x00bd0da3 in strlen () from /lib/tls/libc.so.6
#1  0x080494b1 in cat2_str (str1=0x969bae0 "fetch", str2=0x0) at preproc.y:105
#2  0x0804955e in cat_str (count=4) at preproc.y:128
#3  0x0805027e in base_yyparse () at preproc.y:2299
#4  0x08067f12 in main (argc=7, argv=0xfef93284) at ecpg.c:457

---
Apart from the seg-fault, is there any particular reason I can't use a variable 
in the FETCH anymore? It's always worked in the past and would seem to be an 
important capability.


Pete

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] tcp_keepalives_idle ignored

2008-01-13 Thread Tom Lane
Martijn van Oosterhout [EMAIL PROTECTED] writes:
 On Sun, Jan 13, 2008 at 08:40:34AM +0200, henry wrote:
 lsof doesn't tell me what's talking to PG through /tmp/.s.PGSQL.5432
 either.  Maybe I'm not understanding exactly how /tmp/.s.PGSQL.5432 is
 used - what would connect to PG via a domain socket?

 Connecting to unix domain socket happens if you don't specify a host.

Specifically, a local connection goes through the socket file by
default, and via TCP only if you say -h localhost or equivalent.

Now that we know the problem connections are local ones, setting a TCP
timeout would be 100% useless anyway.  As Martijn says, the other ends
of the socket connections *must* be on that box someplace --- I'm
inclined to think that lsof didn't show them to you because you didn't
run it as root and they belong to some other userid.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] How to safely compare transaction id?

2008-01-13 Thread Tom Lane
alphax [EMAIL PROTECTED] writes:
 So, it seems the system column cmin is the logical current version 
 of that record, isn't it?

No.  Have you read
http://www.postgresql.org/docs/8.2/static/ddl-system-columns.html

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] tcp_keepalives_idle ignored

2008-01-13 Thread henry


On Sun, January 13, 2008 7:25 pm, Tom Lane wrote:
 Martijn van Oosterhout [EMAIL PROTECTED] writes:
 On Sun, Jan 13, 2008 at 08:40:34AM +0200, henry wrote:
 lsof doesn't tell me what's talking to PG through /tmp/.s.PGSQL.5432
 either.  Maybe I'm not understanding exactly how /tmp/.s.PGSQL.5432 is
 used - what would connect to PG via a domain socket?

 Connecting to unix domain socket happens if you don't specify a host.

 Specifically, a local connection goes through the socket file by
 default, and via TCP only if you say -h localhost or equivalent.

 Now that we know the problem connections are local ones, setting a TCP
 timeout would be 100% useless anyway.  As Martijn says, the other ends
 of the socket connections *must* be on that box someplace --- I'm
 inclined to think that lsof didn't show them to you because you didn't
 run it as root and they belong to some other userid.

Yes, it was run as root, but all lsof tells you is that postgres has that
socket open - not the name/id of the other end of the connection.

A genuine thanks to Tom and Martijn for your helpful suggestions.  It's
exactly what I needed to identify what was going on.

The problem was the following (to ensure googlability):

To achieve DB replication from a master to quite a few slaves, in multiple
clusters, we're using PostgreSQL, Slony and PgPool.  By default, PgPool is
configured to connect to a local domain socket (backend_host_name = ''). 
This is fine, but it's also configured by default to never time out when
idle (connection_life_time = 0)...

Changing those two (localhost and [eg] 300) has resolved my problem of
ever-increasing /tmp/.s.PGSQL.5432 domain socket connections.
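
For googlability, the relevant pgpool.conf lines end up looking roughly like
this (a sketch; 300 is just the example lifetime mentioned above):

backend_host_name = 'localhost'   # was '' (Unix-domain socket)
connection_life_time = 300        # was 0 (idle connections never expire)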

Once again, thanks.  Now, back to the lounge to catch late-night Sunday TV.

Regards
Henry


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Satisfactory Query Time

2008-01-13 Thread x asasaxax
Hi,

   I have a query that takes 0.450 ms. It's an XML query. Is that a good time
for a query? If I have multiple connections to the database, will this time
make my db slow? How much time is good for an XML query?

Thanks


[GENERAL] tsearch2 install on Fedora Core 5 problems

2008-01-13 Thread Satch Jones
Hello - I can't get tsearch2 running in a long-functioning instance of
PostgreSQL 8.1.9 on Fedora Core 5, and could use some help.

 

When I place the tsearch2 source files under the contrib folder in a
standard Fedora Core 5 postgres install (/usr/share/pgsql/contrib) and run
make, I get the following

 

# make

Makefile:33: ../../src/Makefile.global: No such file or directory

Makefile:34: /contrib/contrib-global.mk: No such file or directory

make: *** No rule to make target `/contrib/contrib-global.mk'.  Stop.

 

I changed the top_builddir location in Makefile from ../.. to the standard
Fedora location for the Makefile.global file (/usr/lib/pgsql/pgxs), then ran
make again:

 

# make

Makefile:33: /usr/lib/pgsql/pgxs/contrib/contrib-global.mk: No such file or
directory

make: *** No rule to make target
`/usr/lib/pgsql/pgxs/contrib/contrib-global.mk'.  Stop.

 

I searched for 'contrib-global.mk' and it's nowhere on the system. I'm
stumped, and have spent half a day trying various things. I'd like to avoid
doing a fresh PostgreSQL install from source, with all the migration issues
that implies.

 

Come to find out, postgresql is installed in Fedora according to the
elaborate directory structure below, which varies from locations described
in PostgreSQL documentation, as described in:

 
http://pgfoundry.org/docman/view.php/148/1338/PostgreSQL-RPM-Installation-PGDG.pdf

 

any ideas welcome..

 



++

Fedora PostgreSQL locations:

Executables:            /usr/bin
Libraries:              /usr/lib
Documentation:          /usr/share/doc/postgresql-x.y.z, /usr/share/doc/postgresql-x.y.z/contrib
Contrib:                /usr/share/pgsql/contrib
Data:                   /var/lib/pgsql/data
Backup area:            /var/lib/pgsql/backup
Templates:              /usr/share/pgsql
Procedural Languages:   /usr/lib/pgsql
Development Headers:    /usr/include/pgsql
Other shared data:      /usr/share/pgsql
Regression tests:       /usr/lib/pgsql/test/regress (in the -test package)
Documentation SGML:     /usr/share/doc/postgresql-docs-x.y.z

 

 



Re: [GENERAL] tsearch2 install on Fedora Core 5 problems

2008-01-13 Thread Tom Lane
Satch Jones [EMAIL PROTECTED] writes:
 Hello - I can't get tsearch2 running in a long-functioning instance of
 PostgreSQL 8.1.9 on Fedora Core 5, and could use some help.

Rather than trying to compile it yourself, why don't you just install
the postgresql-contrib RPM that goes with the postgresql version you're
using?

 When I place the tsearch2 source files under the contrib folder in a
 standard Fedora Core 5 postgres install (/usr/share/pgsql/contrib) and run
 make, I get the following

That's not going to work --- they expect to be in a configured source
tree.

[ thinks... ]  Actually, it looks like this would work in 8.1:
make USE_PGXS=1 all install
but I still wonder why you'd want to compile from source when you're
using an RPM distribution.
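
For the archives, the PGXS route would look roughly like this (a sketch; it
assumes the postgresql-devel RPM is installed so pg_config and the pgxs
makefiles are present, and "mydb" is a placeholder database name):

cd tsearch2                                   # the contrib/tsearch2 source directory
make USE_PGXS=1 all install
psql -d mydb -f /usr/share/pgsql/contrib/tsearch2.sql   # load the module into a database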

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] What pg_restore does to a non-empty target database

2008-01-13 Thread Ken Winter
I need to understand, in as much detail as possible, the results that will
occur when pg_restore restores from an archive file into a target database
that already contains some database objects.  I can't find any reference
that spells this out.  (The PG manual isn't specific enough.)

Instead of just asking questions, I decided to make my best guess about the
answers (below), and ask you to confirm, refute, and correct my guesses
until this becomes a complete and accurate statement.  If I have left out
any conditions that would affect the results, please add them into the
account.  A definitive version of this story might be worth posting in some
more permanent and visible place than an e-mail thread.

In case it matters, I'm currently working with PostgreSQL 8.0.  I don't know
if the truth I'm seeking here is version-dependent.  Also, I'm assuming the
archive is in compressed format.  I don't know how different the story would
be if the archive were in a different format.

~ TIA
~ Ken



Given a pg_restore command (possibly filtered and reordered by a ToC file),
where:
 * A is the source archive file (as filtered and reordered by the ToC file, 
   if any)
 * T is the target database
 * O is a database object (table, function, etc) that exists in A 
   and/or in T

The following are the changes that the pg_restore will produce in T.

If object O exists in both A and T:
    If the command says --clean:
        T's version of O is dropped
        A's version of O is created
    Else:
        T's version of O is left unchanged
If object O exists in T but not in A:
    T's version of O is left unchanged
If object O exists in A but not in T:
    A's version is created

Suppose in addition that O is a data object (a table or sequence) that is
defined by the database schema and contains data in both A and T.

If the command says --data-only:
    T's schema definition of O is left unchanged
    T's O data are deleted
    A's O data are inserted
If the command says --schema-only:
    T's schema definition of O is dropped
    T's O data are deleted (as a side-product of the drop)
    A's schema definition of O is created
    No O data are inserted
If the command says --data-only and --schema-only:
    T's schema definition of O is left unchanged
    T's O data are left unchanged
    In other words, nothing changes
If the command says neither --data-only nor --schema-only:
    T's schema definition of O is dropped
    T's O data are deleted (as a side-product of the drop)
    A's schema definition of O is created
    A's O data are inserted
    In other words, A's version of O entirely replaces T's version

Suppose in addition that the command says --data-only, it doesn't say
--exit-on-error, and T's schema definition of O is different from A's.

If T's schema includes a column O.C that does not exist in A's schema:
    A's O data are inserted, and O.C is Null in all rows
If A's schema includes a column O.C that does not exist in T's schema:
    A's O data are inserted, but A's values of O.C are lost
If T's schema includes a constraint K that does not exist in A's schema:
    A's O data are inserted, except for those that violate K
If A's schema includes a constraint K that does not exist in T's schema:
    A's O data are all inserted




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] How to safely compare transaction id?

2008-01-13 Thread Trevor Talbot
On 1/11/08, alphax [EMAIL PROTECTED] wrote:

 I want to determine whether a given record that is visible to the current
 transaction has been updated after some point in time, where that point in
 time is indicated by another transaction id that started and committed in
 the past.

I'm not sure I understand, but maybe this thread will help?
http://archives.postgresql.org/pgsql-general/2007-10/msg00503.php

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] What pg_restore does to a non-empty target database

2008-01-13 Thread Tom Lane
Ken Winter [EMAIL PROTECTED] writes:
 I need to understand, in as much detail as possible, the results that will
 occur when pg_restore restores from an archive file into a target database
 that already contains some database objects.  I can't find any reference
 that spells this out.  (The PG manual isn't specific enough.)

It's not nearly as clean as you are hoping, unless you use --clean.

When you don't, the restore will just try to issue the creation
commands.  Obviously the initial CREATE for a conflicting object will
fail due to a name collision, but in some cases the script involves a
CREATE followed by various ALTERs, which might or might not succeed
against the object that was there before.  And don't forget that any
data to be inserted into a table will be appended to what was there
before, assuming that the existing table is close enough to the schema
of the source table for this to succeed.

You could run the restore in exit-on-error mode, which would ensure
no damage to existing objects, but then any new objects following the
first error wouldn't get loaded.

We don't (and won't) specify either the order in which pg_dump dumps
unrelated objects or the exact breakdown of CREATE/ALTER operations,
so the details of what would happen are very subject to change across
releases.

Bottom line: best use --clean if you want at-all-predictable behavior
for this scenario.

 If the command says --data-only:
   T's schema definition of O is left unchanged
   T's O data are deleted
   A's O data are inserted

In no case (other than --clean) will the restore attempt to remove any
existing data.  None of the other fancy stuff you have invented out of
thin air will happen, either.

One control you have over what happens in a data-only restore is to dump
the data as COPY (one command per table) or as INSERTs (one command per
row).  If you use COPY, and there's any error in the incoming data for a
table, then none of it gets inserted; while with INSERTs each row
succeeds or fails independently.  I can see uses for each behavior.
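
The corresponding pg_dump/pg_restore invocations look roughly like this (a
sketch; "mydb" and the file name are placeholders):

pg_dump -Fc mydb > mydb.dump              # table data dumped as COPY (the default)
pg_dump -Fc --inserts mydb > mydb.dump    # table data dumped as one INSERT per row
pg_restore --clean -d mydb mydb.dump      # drop each object before recreating it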

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] query performance

2008-01-13 Thread pepone . onrez
I have this query on a table with 150 thousand tuples and it takes too long:

SELECT
    t_documentcontent._id AS _id
    FROM t_documentcontent LIMIT 50 OFFSET 80000

here is the explain output

Limit  (cost=100058762.30..100058799.02 rows=50 width=58) (actual time=19433.474..19433.680 rows=50 loops=1)
  ->  Seq Scan on t_documentcontent  (cost=100000000.00..100110772.07 rows=150807 width=58) (actual time=53.934..19402.030 rows=80050 loops=1)
Total runtime: 19433.748 ms

here the table structure

CREATE TABLE t_documentcontent(
_id varchar(60) NOT NULL,
_filesystem varchar(60) NOT NULL,
_parent varchar(60) NOT NULL,
_document varchar(60) NOT NULL,
_title varchar NOT NULL,
_resume varchar,
_content varchar,
_lang integer NOT NULL,
_creationdate timestamp NOT NULL DEFAULT now(),
_updatedate timestamp NOT NULL DEFAULT now(),
_indexeddate timestamp NOT NULL DEFAULT now(),
CONSTRAINT documentcontent_pkey PRIMARY KEY (_id),
CONSTRAINT documentcontent_filesystem_fkey
FOREIGN KEY (_filesystem) REFERENCES t_filesystem(_id) ON UPDATE
CASCADE ON DELETE NO ACTION,
CONSTRAINT documentcontent_parent_fkey FOREIGN KEY (_parent) REFERENCES
t_node(_id) ON DELETE NO ACTION,
CONSTRAINT documentcontent_document_fkey
FOREIGN KEY (_document) REFERENCES t_node(_id) ON UPDATE CASCADE ON
DELETE NO ACTION,
CONSTRAINT documentcontent_lang_lang_fkey FOREIGN KEY (_lang) REFERENCES
t_lang(_id)) WITHOUT OIDS;

Any ideas to improve this query's performance?


Re: [GENERAL] query performance

2008-01-13 Thread Alex Turner
If you haven't already, make sure you've done a vacuum full recently.  When
in doubt, pg_dump the db, and reload it, and see if that helps, but this
works for me:

create table overview as select _id from t_documentcontent;
alter table overview add constraint overview_pkey primary key (_id);

select * from t_documentcontent where _id in (select _id
FROM overview LIMIT 50 OFFSET 80000);

create a trigger on insert/delete/update to keep this table in sync, or if
you don't need to then just re-run the create every so often depending on
your needs (I'll be happy to demonstrate the required triggers if you need
it).
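
For reference, the sync triggers Alex offers to demonstrate could look
roughly like this (a sketch only, assuming PL/pgSQL is installed in the
database; the names are illustrative):

CREATE OR REPLACE FUNCTION overview_sync() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO overview (_id) VALUES (NEW._id);
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM overview WHERE _id = OLD._id;
    ELSIF TG_OP = 'UPDATE' AND NEW._id <> OLD._id THEN
        UPDATE overview SET _id = NEW._id WHERE _id = OLD._id;
    END IF;
    RETURN NULL;   -- AFTER row trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_documentcontent_overview_sync
    AFTER INSERT OR UPDATE OR DELETE ON t_documentcontent
    FOR EACH ROW EXECUTE PROCEDURE overview_sync();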

make sure that you have adequate RAM available for file cache; hitting the
disk every time you query will suck no matter what you do.

Alex

On Jan 13, 2008 9:58 PM, pepone. onrez [EMAIL PROTECTED] wrote:

 I have this query in a table with 150 thowsand tuples and it takes to long

 t_documentcontent._id AS _id
 FROM t_documentcontent LIMIT 50 OFFSET 8

 here is the explain output

 Limit  (cost= 100058762.30..100058799.02 rows=50 width=58) (actual time=
 19433.474..19433.680 rows=50 loops=1)
   -  Seq Scan on t_documentcontent  
 (cost=1.00..100110772.07rows=150807 width=58) (actual time=
 53.934..19402.030 rows=80050 loops=1)
 Total runtime: 19433.748 ms

 here the table structure

 CREATE TABLE t_documentcontent(
 _id varchar(60) NOT NULL,
 _filesystem varchar(60) NOT NULL,
 _parent varchar(60) NOT NULL,
 _document varchar(60) NOT NULL,
 _title varchar NOT NULL,
 _resume varchar,
 _content varchar,
 _lang integer NOT NULL,
 _creationdate timestamp NOT NULL DEFAULT now(),
 _updatedate timestamp NOT NULL DEFAULT now(),
 _indexeddate timestamp NOT NULL DEFAULT now(),
 CONSTRAINT documentcontent_pkey PRIMARY KEY (_id),
 CONSTRAINT documentcontent_filesystem_fkey
 FOREIGN KEY (_filesystem) REFERENCES t_filesystem(_id) ON UPDATE
 CASCADE ON DELETE NO ACTION,
 CONSTRAINT documentcontent_parent_fkey FOREIGN KEY (_parent)
 REFERENCES t_node(_id) ON DELETE NO ACTION,
 CONSTRAINT documentcontent_document_fkey
 FOREIGN KEY (_document) REFERENCES t_node(_id) ON UPDATE CASCADE
 ON DELETE NO ACTION,
 CONSTRAINT documentcontent_lang_lang_fkey FOREIGN KEY (_lang)
 REFERENCES t_lang(_id)) WITHOUT OIDS;

 Any ideas for improve this query performance.



Re: [GENERAL] query performance

2008-01-13 Thread Scott Marlowe
On Jan 13, 2008 8:58 PM, pepone. onrez [EMAIL PROTECTED] wrote:
 I have this query in a table with 150 thowsand tuples and it takes to long

 t_documentcontent._id AS _id
 FROM t_documentcontent LIMIT 50 OFFSET 8

 here is the explain output

 Limit  (cost= 100058762.30..100058799.02 rows=50 width=58) (actual
 time=19433.474..19433.680 rows=50 loops=1)
   -  Seq Scan on t_documentcontent  (cost=1.00..100110772.07
 rows=150807 width=58) (actual time=53.934..19402.030 rows=80050 loops=1)
 Total runtime: 19433.748 ms

looks like you've set enable_seqscan=off.  When looking for help on
queries it's a good idea to mention such things...

with no order by, and possibly no index on t_documentcontent._id,
there's no choice but a seq scan.

try adding both.
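
Concretely, something like this (a sketch; since _id is the primary key here,
documentcontent_pkey already provides the index, so the ORDER BY is the
missing half):

SELECT t_documentcontent._id AS _id
FROM t_documentcontent
ORDER BY _id
LIMIT 50 OFFSET 80000;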

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] query performance

2008-01-13 Thread Tom Lane
Scott Marlowe [EMAIL PROTECTED] writes:
 On Jan 13, 2008 8:58 PM, pepone. onrez [EMAIL PROTECTED] wrote:
 t_documentcontent._id AS _id
 FROM t_documentcontent LIMIT 50 OFFSET 80000

 with no order by, and possibly no index on t_documentcontent._id,
 there's no choice but a seq scan.

More to the point: a query with a huge OFFSET is *always* going to suck,
because there is no choice but to read through all those records before
getting to the ones you want.  You need to fundamentally rethink how you
are going about this.  I'm assuming that this is actually just one query
in a series that are intended to eventually fetch the whole table.

One solution is to set up a cursor and FETCH 50 rows at a time from it.
However that requires holding a transaction open, which might not work
well in your environment.

Another possibility, if you have a primary key on the table, is to do
something like

SELECT ... FROM ... WHERE id > ? ORDER BY id LIMIT 50

where you leave out the WHERE clause on the first call, and on
subsequent calls '?' is the last id value seen in the prior call.
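
Spelled out for the table in question, the two alternatives would look
roughly like this (a sketch; the '<last _id>' placeholder is whatever value
the previous page ended with):

-- cursor approach (requires holding the transaction open)
BEGIN;
DECLARE doc_cur CURSOR FOR SELECT _id FROM t_documentcontent ORDER BY _id;
FETCH 50 FROM doc_cur;   -- repeat until it returns no rows
CLOSE doc_cur;
COMMIT;

-- keyset approach (no open transaction needed)
SELECT _id FROM t_documentcontent
WHERE _id > '<last _id>'
ORDER BY _id
LIMIT 50;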

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] query performance

2008-01-13 Thread pepone . onrez
Sorry Alex, I forgot to mention that I had enable_seqscan off in my last test.

Now I have set seqscan on and indexscan on, and added an order by _id.

The table has an index in the _id field

CREATE INDEX i_documentcontent_document
  ON t_documentcontent
  USING btree
  (_document);

The database was recently vacuum analyzed, but not vacuum full.

Here is the explain of 2 different queries, when I put a large OFFSET:

 EXPLAIN ANALYZE SELECT
t_documentcontent._id AS _id
FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 50000

Limit  (cost=137068.24..137068.36 rows=50 width=58) (actual time=41119.702..41119.792 rows=50 loops=1)
  ->  Sort  (cost=136943.24..137320.26 rows=150807 width=58) (actual time=41064.802..41100.424 rows=50050 loops=1)
        Sort Key: _id
        ->  Seq Scan on t_documentcontent  (cost=0.00..110772.07 rows=150807 width=58) (actual time=106.679..33267.194 rows=150807 loops=1)
Total runtime: 41120.015 ms

 EXPLAIN ANALYZE SELECT
t_documentcontent._id AS _id
FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 10000

Limit  (cost=39839.37..40038.56 rows=50 width=58) (actual time=1172.969..1194.228 rows=50 loops=1)
  ->  Index Scan using i_documentcontent_id on t_documentcontent  (cost=0.00..600805.54 rows=150807 width=58) (actual time=0.077..1189.688 rows=10050 loops=1)
Total runtime: 1194.316 ms

Tom,
I'm using uuid for the _id field that is the primary key, so adding a
WHERE id > ? doesn't apply.
The cursor approach is also not suitable for some of my queries.

I use this query to paginate the contents of a filesystem with lots of
documents; avoiding OFFSET is not always possible.


Re: [GENERAL] query performance

2008-01-13 Thread Alex Turner
If you have to access the data this way (with no where clause at all - which
sometimes you do) then I have already provided a solution that will work
reasonably well.  If you create what is essentially a materialized view of
just the id field, the sequence scan will return much fewer pages than when
you do it on the main table.  Then you join it to the indexed main table,
and page in just the rows you need.  Voila - much faster result.  Of course
we haven't really talked about how that will affect insert and delete speed
if you trigger them up, but you haven't really talked about any
requirements there.

Alex

On Jan 13, 2008 11:27 PM, pepone. onrez [EMAIL PROTECTED] wrote:

 Sorry Alex i forget mention that i have setscan of in my last test.

 now I have set seqscan on  and indexscan on and added order by _id

 The table has an index in the _id field

 CREATE INDEX i_documentcontent_document
   ON t_documentcontent
   USING btree
   (_document);

 The database was rencently vacum analyze , but not vacun full

 here is the explain of 2 diferent queries , when i put a large OFFSET

  EXPLAIN ANALYZE SELECT
 t_documentcontent._id AS _id
 FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 5

 Limit  (cost=137068.24..137068.36 rows=50 width=58) (actual time=
 41119.702..41119.792 rows=50 loops=1)
   -  Sort  (cost=136943.24..137320.26 rows=150807 width=58) (actual
 time=41064.802..41100.424 rows=50050 loops=1)
 Sort Key: _id
 -  Seq Scan on t_documentcontent  (cost= 0.00..110772.07rows=150807 
 width=58) (actual time=
 106.679..33267.194 rows=150807 loops=1)
 Total runtime: 41120.015 ms

  EXPLAIN ANALYZE SELECT
 t_documentcontent._id AS _id
 FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 1

 Limit  (cost=39839.37..40038.56 rows=50 width=58) (actual time=
 1172.969..1194.228 rows=50 loops=1)
   -  Index Scan using i_documentcontent_id on t_documentcontent  (cost=
 0.00..600805.54 rows=150807 width=58) (actual time= 0.077..1189.688rows=10050 
 loops=1)
 Total runtime: 1194.316 ms

 Tom,
 I'm using uuid for the _id field that is the primary key, so adding a WHERE
 id > ? doesn't apply.
 The cursor approach is also not suitable for some of my queries.

 I use this query for paginate contents of a filesysstem with lots of
 documents avoid offset is not posible always


Re: [GENERAL] query performance

2008-01-13 Thread Alex Turner
Here is a table I threw together to demonstrate the approximate speed of a
materialized view in this case:

trend=# explain analyze select property_id from overview order by
property_id limit 50 offset 50000;
                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Limit  (cost=19112.75..19112.88 rows=50 width=8) (actual time=446.048..446.125 rows=50 loops=1)
   ->  Sort  (cost=18987.75..19400.49 rows=165094 width=8) (actual time=384.788..424.433 rows=50050 loops=1)
         Sort Key: property_id
         ->  Seq Scan on overview  (cost=0.00..2501.94 rows=165094 width=8) (actual time=0.012..88.691 rows=173409 loops=1)
 Total runtime: 447.578 ms
(5 rows)

trend=# select count(*) from overview;
 count
--------
 173409
(1 row)

trend=#

It's not great - but it's better than 47 seconds (The machine I'm running it
on is far from big iron, so these results should be fairly typical for any
modern x86 box - also this materialized view is almost certainly in RAM, and
therefore IO speed is irrelevant).

Tom lane has already suggested another approach, whereby you order your
results, then select the next 10 from the set where the id is greater than
the greatest of the last one:

select id from overview order by id limit 50;

x = get row['id']  // for row 1
do something
x=get row['id']  // for row 2
do something
...
x=get row['id']  // for row 50

select id from overview where id > x order by id limit 50.

The order by is relevant and in fact imperative, because you must order your
results somehow; otherwise your pagination will produce different results
each time you try it, as database updates will affect the order the rows come
back by default without an order by clause.

Let me say that again to be clear:  the order rows come back in, if you don't
specify an order by, can change!  So pulling rows without an order by is a
REALLY bad idea.  This will break your pagination if a database update
happens between someone viewing a page and hitting next to view the next
page.

Alex

On Jan 13, 2008 11:43 PM, Alex Turner [EMAIL PROTECTED] wrote:

 If you have to access the data this way (with no where clause at all -
 which sometimes you do) then I have already provided a solution that will
 work reasonably well.  If you create what is essentially a materialized view
 of just the id field, the sequence scan will return much fewer pages than
 when you do it on the main table.  Then you join it to the indexed main
 table, and page in just the rows you need.  Voila - much faster result.  Of
 course we haven't really talked about how that will affect insert speed and
 delete speed if you trigger then up, but you haven't really talked about any
 requirements there.

 Alex


 On Jan 13, 2008 11:27 PM, pepone. onrez [EMAIL PROTECTED] wrote:

  Sorry Alex i forget mention that i have setscan of in my last test.
 
  now I have set seqscan on  and indexscan on and added order by _id
 
  The table has an index in the _id field
 
  CREATE INDEX i_documentcontent_document
ON t_documentcontent
USING btree
(_document);
 
  The database was rencently vacum analyze , but not vacun full
 
  here is the explain of 2 diferent queries , when i put a large OFFSET
 
   EXPLAIN ANALYZE SELECT
  t_documentcontent._id AS _id
  FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 5
 
  Limit  (cost=137068.24..137068.36 rows=50 width=58) (actual time=
  41119.702..41119.792 rows=50 loops=1)
-  Sort  (cost=136943.24..137320.26 rows=150807 width=58) (actual
  time=41064.802..41100.424 rows=50050 loops=1)
  Sort Key: _id
  -  Seq Scan on t_documentcontent  (cost= 
  0.00..110772.07rows=150807 width=58) (actual time=
  106.679..33267.194 rows=150807 loops=1)
  Total runtime: 41120.015 ms
 
   EXPLAIN ANALYZE SELECT
  t_documentcontent._id AS _id
  FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 1
 
  Limit  (cost=39839.37..40038.56 rows=50 width=58) (actual time=
  1172.969..1194.228 rows=50 loops=1)
-  Index Scan using i_documentcontent_id on t_documentcontent
  (cost=0.00..600805.54 rows=150807 width=58) (actual time=
  0.077..1189.688 rows=10050 loops=1)
  Total runtime: 1194.316 ms
 
  Tom
   i using uuid for the _id field that is the primary key  add a WHERE id
   ?  don 't apply
  the cursor aproach is also not suitable for same of my queries
 
  I use this query for paginate contents of a filesysstem with lots of
  documents avoid offset is not posible always





Re: [GENERAL] 8.2.4 serious slowdown

2008-01-13 Thread Sim Zacks

Apparently I was suffering from brain freeze.
sim

Lew wrote:

(attribution restored)
Clodoaldo wrote:

  I don't know if the plan would be the same but this is a bit clearer:
 
  WHERE COALESCE(b.quantity, 0) > COALESCE(b.deliveredsum, 0)


Sim Zacks wrote:
That should be true, but sometimes we get deliveries of greater 
quantity than we ordered.
I just want to know the times when I haven't gotten the complete order 
yet.

If we get more than we ordered, I don't want it to be in this query.


Huh?

How does that relate to the suggestion?

The suggested expression is mathematically equivalent to and perfectly 
substitutable for the original.
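
A quick sanity check of that equivalence in psql (a sketch; the literals
stand in for b.quantity and b.deliveredsum):

SELECT (COALESCE(10, 0) - COALESCE(12, 0)) > 0       AS original_form,   -- false: over-delivered
       COALESCE(10, 0) > COALESCE(12, 0)             AS rewritten_form,  -- false: same answer
       (COALESCE(10, 0) - COALESCE(NULL::int, 0)) > 0,                   -- true: nothing delivered yet
       COALESCE(10, 0) > COALESCE(NULL::int, 0);                         -- true: same answer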




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] query performance

2008-01-13 Thread Alex Turner
Oh - if you do this then make sure that you have the primary key index on
overview too.

Alex

On Jan 14, 2008 12:53 AM, Alex Turner [EMAIL PROTECTED] wrote:

 If you combine it with Tom Lane's suggestion - it will go even better,
 something like:

 select * from t_documentcontent where _id in (select _id from overview
 where _id > x order by _id limit 50);

 Alex


 On Jan 13, 2008 11:59 PM, pepone. onrez [EMAIL PROTECTED] wrote:

 
  Thanks Alex
 
  I test your solution and is realy more faster.
 
  Nested Loop  (cost=1743.31..2044.58 rows=50 width=908) (actual time=
  101.695..106.178 rows=50 loops=1)
-  HashAggregate  (cost=1743.31..1743.31 rows=50 width=108) (actual
  time=101.509..101.567 rows=50 loops=1)
  -  Subquery Scan IN_subquery  (cost=1741.60..1743.19 rows=50
  width=108) (actual time=101.327..101.456 rows=50 loops=1)
-  Limit  (cost=1741.60..1742.69 rows=50 width=108)
  (actual time=101.313..101.383 rows=50 loops=1)
  -  Seq Scan on overview  
  (cost=0.00..3283.07rows=150807 width=108) (actual time=
  0.036..72.249 rows=80050 loops=1)
-  Index Scan using i_documentcontent_id on t_documentcontent  (cost=
  0.00..6.01 rows=1 width=908) (actual time=0.083..0.085 rows=1 loops=50)
  Index Cond: ((t_documentcontent._id)::text =
  (outer._id)::text)
  Total runtime: 106.323 ms
 
  I now need to see what trigers i need to add, and test the insertions.
  Thanks again
 
 
  On Jan 14, 2008 5:54 AM, Alex Turner  [EMAIL PROTECTED] wrote:
 
   Here is a table I threw together to demonstrate the approximate speed
   of a materialized view in this case:
  
   trend=# explain analyze select property_id from overview order by
   property_id limit 50 offset 5;
QUERY
   PLAN
  
   
Limit  (cost=19112.75..19112.88 rows=50 width=8) (actual time=
   446.048..446.125 rows=50 loops=1)
  -  Sort  (cost=18987.75..19400.49 rows=165094 width=8) (actual
   time=384.788..424.433 rows=50050 loops=1)
Sort Key: property_id
-  Seq Scan on overview  (cost=0.00..2501.94 rows=165094
   width=8) (actual time= 0.012..88.691 rows=173409 loops=1)
Total runtime: 447.578 ms
   (5 rows)
  
   trend=# select count(*) from overview;
count
   
173409
   (1 row)
  
   trend=#
  
   It's not great - but it's better than 47 seconds (The machine I'm
   running it on is far from big iron, so these results should be fairly
   typical for any modern x86 box - also this materialized view is almost
   certainly in RAM, and therefore IO speed is irrelevant).
  
   Tom lane has already suggested another approach, whereby you order
   your results, then select the next 10 from the set where the id is greater
   than the greatest of the last one:
  
   select id from overview order by id limit 50;
  
   x = get row['id']  // for row 1
   do something
   x=get row['id']  // for row 2
   do something
   ...
   x=get row['id']  // for row 50
  
    select id from overview where id > x order by id limit 50.
  
   The order by is relevant and infact imperative, because you must order
   your results somehow, otherwise your pagination will produce different
   results each time you try it as database updates will affect the order the
   rows come back by default without an order by clause.
  
   Let me say that again to be clear:  The order rows come back if you
   don't specify an order by can change!  so pulling rows without an order by
   is a REALLY bad idea.  This will break your pagination if a database 
   update
   happens between someone viewing a page and hitting next to view the next
   page.
  
   Alex
  
  
   On Jan 13, 2008 11:43 PM, Alex Turner [EMAIL PROTECTED]  wrote:
  
If you have to access the data this way (with no where clause at all
- which sometimes you do) then I have already provided a solution that 
will
work reasonably well.  If you create what is essentially a materialized 
view
of just the id field, the sequence scan will return much fewer pages 
than
when you do it on the main table.  Then you join it to the indexed main
table, and page in just the rows you need.  Voila - much faster result. 
 Of
course we haven't really talked about how that will affect insert speed 
and
delete speed if you trigger then up, but you haven't really talked 
about any
requirements there.
   
Alex
   
   
On Jan 13, 2008 11:27 PM, pepone. onrez [EMAIL PROTECTED] 
wrote:
   
 Sorry Alex i forget mention that i have setscan of in my last
 test.

 now I have set seqscan on  and indexscan on and added order by
 _id

 The table has an index in the _id field

 CREATE INDEX i_documentcontent_document
   ON t_documentcontent
   USING btree
   

Re: [GENERAL] query performance

2008-01-13 Thread Alex Turner
If you combine it with Tom Lane's suggestion - it will go even better,
something like:

select * from t_documentcontent where _id in (select _id from overview where
_id > x order by _id limit 50);

Alex

On Jan 13, 2008 11:59 PM, pepone. onrez [EMAIL PROTECTED] wrote:


 Thanks Alex

 I test your solution and is realy more faster.

 Nested Loop  (cost=1743.31..2044.58 rows=50 width=908) (actual time=
 101.695..106.178 rows=50 loops=1)
   -  HashAggregate  (cost=1743.31..1743.31 rows=50 width=108) (actual
 time=101.509..101.567 rows=50 loops=1)
 -  Subquery Scan IN_subquery  (cost=1741.60..1743.19 rows=50
 width=108) (actual time=101.327..101.456 rows=50 loops=1)
   -  Limit  (cost=1741.60..1742.69 rows=50 width=108) (actual
 time=101.313..101.383 rows=50 loops=1)
 -  Seq Scan on overview  (cost=0.00..3283.07rows=150807 
 width=108) (actual time=
 0.036..72.249 rows=80050 loops=1)
   -  Index Scan using i_documentcontent_id on t_documentcontent  (cost=
 0.00..6.01 rows=1 width=908) (actual time=0.083..0.085 rows=1 loops=50)
 Index Cond: ((t_documentcontent._id)::text = (outer._id)::text)
 Total runtime: 106.323 ms

 I now need to see what trigers i need to add, and test the insertions.
 Thanks again


 On Jan 14, 2008 5:54 AM, Alex Turner [EMAIL PROTECTED] wrote:

  Here is a table I threw together to demonstrate the approximate speed of
  a materialized view in this case:
 
  trend=# explain analyze select property_id from overview order by
  property_id limit 50 offset 5;
   QUERY
  PLAN
 
  
   Limit  (cost=19112.75..19112.88 rows=50 width=8) (actual time=
  446.048..446.125 rows=50 loops=1)
 -  Sort  (cost=18987.75..19400.49 rows=165094 width=8) (actual time=
  384.788..424.433 rows=50050 loops=1)
   Sort Key: property_id
   -  Seq Scan on overview  (cost=0.00..2501.94 rows=165094
  width=8) (actual time= 0.012..88.691 rows=173409 loops=1)
   Total runtime: 447.578 ms
  (5 rows)
 
  trend=# select count(*) from overview;
   count
  
   173409
  (1 row)
 
  trend=#
 
  It's not great - but it's better than 47 seconds (The machine I'm
  running it on is far from big iron, so these results should be fairly
  typical for any modern x86 box - also this materialized view is almost
  certainly in RAM, and therefore IO speed is irrelevant).
 
  Tom lane has already suggested another approach, whereby you order your
  results, then select the next 10 from the set where the id is greater than
  the greatest of the last one:
 
  select id from overview order by id limit 50;
 
  x = get row['id']  // for row 1
  do something
  x=get row['id']  // for row 2
  do something
  ...
  x=get row['id']  // for row 50
 
  select id from overview where id > x order by id limit 50.
 
  The order by is relevant and infact imperative, because you must order
  your results somehow, otherwise your pagination will produce different
  results each time you try it as database updates will affect the order the
  rows come back by default without an order by clause.
 
  Let me say that again to be clear:  The order rows come back if you
  don't specify an order by can change!  so pulling rows without an order by
  is a REALLY bad idea.  This will break your pagination if a database update
  happens between someone viewing a page and hitting next to view the next
  page.
 
  Alex
 
 
  On Jan 13, 2008 11:43 PM, Alex Turner [EMAIL PROTECTED]  wrote:
 
   If you have to access the data this way (with no where clause at all -
   which sometimes you do) then I have already provided a solution that will
   work reasonably well.  If you create what is essentially a materialized 
   view
   of just the id field, the sequence scan will return much fewer pages than
   when you do it on the main table.  Then you join it to the indexed main
   table, and page in just the rows you need.  Voila - much faster result.  
   Of
   course we haven't really talked about how that will affect insert speed 
   and
   delete speed if you trigger then up, but you haven't really talked about 
   any
   requirements there.
  
   Alex
  
  
   On Jan 13, 2008 11:27 PM, pepone. onrez [EMAIL PROTECTED] 
   wrote:
  
Sorry Alex i forget mention that i have setscan of in my last test.
   
now I have set seqscan on  and indexscan on and added order by _id
   
The table has an index in the _id field
   
CREATE INDEX i_documentcontent_document
  ON t_documentcontent
  USING btree
  (_document);
   
The database was rencently vacum analyze , but not vacun full
   
here is the explain of 2 diferent queries , when i put a large
OFFSET
   
 EXPLAIN ANALYZE SELECT
t_documentcontent._id AS _id
FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 5
   
Limit  

Re: [GENERAL] query performance

2008-01-13 Thread pepone . onrez
Thanks Alex,

I tested your solution and it is really much faster.

Nested Loop  (cost=1743.31..2044.58 rows=50 width=908) (actual time=101.695..106.178 rows=50 loops=1)
  ->  HashAggregate  (cost=1743.31..1743.31 rows=50 width=108) (actual time=101.509..101.567 rows=50 loops=1)
        ->  Subquery Scan IN_subquery  (cost=1741.60..1743.19 rows=50 width=108) (actual time=101.327..101.456 rows=50 loops=1)
              ->  Limit  (cost=1741.60..1742.69 rows=50 width=108) (actual time=101.313..101.383 rows=50 loops=1)
                    ->  Seq Scan on overview  (cost=0.00..3283.07 rows=150807 width=108) (actual time=0.036..72.249 rows=80050 loops=1)
  ->  Index Scan using i_documentcontent_id on t_documentcontent  (cost=0.00..6.01 rows=1 width=908) (actual time=0.083..0.085 rows=1 loops=50)
        Index Cond: ((t_documentcontent._id)::text = ("outer"._id)::text)
Total runtime: 106.323 ms

I now need to see what triggers I need to add, and test the insertions.
Thanks again

On Jan 14, 2008 5:54 AM, Alex Turner [EMAIL PROTECTED] wrote:

 Here is a table I threw together to demonstrate the approximate speed of a
 materialized view in this case:

 trend=# explain analyze select property_id from overview order by
 property_id limit 50 offset 5;
  QUERY
 PLAN

 
  Limit  (cost=19112.75..19112.88 rows=50 width=8) (actual time=
 446.048..446.125 rows=50 loops=1)
-  Sort  (cost=18987.75..19400.49 rows=165094 width=8) (actual time=
 384.788..424.433 rows=50050 loops=1)
  Sort Key: property_id
  -  Seq Scan on overview  (cost=0.00..2501.94 rows=165094
 width=8) (actual time= 0.012..88.691 rows=173409 loops=1)
  Total runtime: 447.578 ms
 (5 rows)

 trend=# select count(*) from overview;
  count
 
  173409
 (1 row)

 trend=#

 It's not great - but it's better than 47 seconds (The machine I'm running
 it on is far from big iron, so these results should be fairly typical for
 any modern x86 box - also this materialized view is almost certainly in RAM,
 and therefore IO speed is irrelevant).

 Tom lane has already suggested another approach, whereby you order your
 results, then select the next 10 from the set where the id is greater than
 the greatest of the last one:

 select id from overview order by id limit 50;

 x = get row['id']  // for row 1
 do something
 x=get row['id']  // for row 2
 do something
 ...
 x=get row['id']  // for row 50

  select id from overview where id > x order by id limit 50.

 The order by is relevant and infact imperative, because you must order
 your results somehow, otherwise your pagination will produce different
 results each time you try it as database updates will affect the order the
 rows come back by default without an order by clause.

 Let me say that again to be clear:  The order rows come back if you don't
 specify an order by can change!  so pulling rows without an order by is a
 REALLY bad idea.  This will break your pagination if a database update
 happens between someone viewing a page and hitting next to view the next
 page.

 Alex


 On Jan 13, 2008 11:43 PM, Alex Turner [EMAIL PROTECTED] wrote:

  If you have to access the data this way (with no where clause at all -
  which sometimes you do) then I have already provided a solution that will
  work reasonably well.  If you create what is essentially a materialized view
  of just the id field, the sequence scan will return much fewer pages than
  when you do it on the main table.  Then you join it to the indexed main
  table, and page in just the rows you need.  Voila - much faster result.  Of
  course we haven't really talked about how that will affect insert speed and
  delete speed if you trigger then up, but you haven't really talked about any
  requirements there.
 
  Alex
 
 
  On Jan 13, 2008 11:27 PM, pepone. onrez [EMAIL PROTECTED]  wrote:
 
   Sorry Alex i forget mention that i have setscan of in my last test.
  
   now I have set seqscan on  and indexscan on and added order by _id
  
   The table has an index in the _id field
  
   CREATE INDEX i_documentcontent_document
 ON t_documentcontent
 USING btree
 (_document);
  
   The database was rencently vacum analyze , but not vacun full
  
   here is the explain of 2 diferent queries , when i put a large OFFSET
  
EXPLAIN ANALYZE SELECT
   t_documentcontent._id AS _id
   FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 5
  
   Limit  (cost=137068.24..137068.36 rows=50 width=58) (actual time=
   41119.702..41119.792 rows=50 loops=1)
 -  Sort  (cost=136943.24..137320.26 rows=150807 width=58) (actual
   time=41064.802..41100.424 rows=50050 loops=1)
   Sort Key: _id
   -  Seq Scan on t_documentcontent  (cost= 
   0.00..110772.07rows=150807 width=58) (actual time=
   106.679..33267.194 rows=150807 loops=1)
   Total runtime: 

Re: [GENERAL] How to safely compare transaction id?

2008-01-13 Thread alphax
Tom Lane wrote:
 alphax [EMAIL PROTECTED] writes:
 So, it seems the system column cmin is the logical current version 
 of that record, isn't it?

 No.  Have you read
 http://www.postgresql.org/docs/8.2/static/ddl-system-columns.html

Yes, I have. But I don't understand the actual meaning.



Trevor Talbot wrote:
 On 1/11/08, alphax [EMAIL PROTECTED] wrote:

 I want to determines a given record which visible to current transaction
 whether or not be updated after some time point, that time point is
 indicated by aother transaction id started and committed in past time.

 I'm not sure I understand, but maybe this thread will help?
 http://archives.postgresql.org/pgsql-general/2007-10/msg00503.php

Thanks, I have read that message thread. Actually, my application is
similar to:
 XMIN changes when a (writing) transaction commits successfully.
 We don't care *how* it changes, just *that* it does so.

If I do:
type
TCompareResult = (PRECEDING, SAME, SUCCEEDING);
1) T1 BEGIN T1;
2) T1 READ table.xmin INTO :xmin_value
3) T1 COMMIT T1;
4) Other application(NOT Include VACUUM) do something
5) T2 BEGIN SERIALIZABLE T2;
6) T2 READ table.xmin INTO :xmin_value2
7) T2 TCompareResult compareResult := WRAPAROUND_SAFE_TID_COMPARE
(xmin_value, xmin_value2)
8) Other transaction(INCLUDE VACUUM) do some thing
9) T2 READ table.xmin INTO :xmin_value3
10)T2 COMMIT T2;

I must ensure:
A. At time point 7), if compareResult is SAME, the record in the table
has not been changed by another committed transaction; if compareResult is
PRECEDING, the record has been changed. This is what I meant by the
"current version" (value) of a record.

B. At time point 9), xmin_value3 must equal the previously read
xmin_value2. I have qualms about this because the documentation says xmax
"is the identity (transaction ID) of the deleting transaction, or zero for
an undeleted row version." I don't know under which conditions my
transaction can see a deleted record. Can someone tell me?

I must know which system column (xmin, I assume) can indicate the
changing of a logical row. If there is no such column, I need to
create a user column to do this job. I have taken note of what Tom
Lane and others say in the thread "XMIN semantic at peril"; my
understanding is that xmin is what I want. Is that right?

But if VACUUM participates in time point 4), as in the case below, the
record change detection would fail. That is a big problem for my
application.
4.1) ... (long time)
4.2) another transaction does something with the table (changing xmin to its
transaction id) and commits.
4.X) ... (long time)
4.X+1) VACUUM replaces table.xmin with FrozenXID.

Thanks for your help!



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly