Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread Ramasubramanian G
Hi,

Try setting this parameter in postgresql.conf and running the query again:

set enable_seqscan=off;
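A sketch of trying it per-session instead of server-wide (scoping it to the session is my suggestion, not part of the original advice; forcing the planner away from sequential scans is a diagnostic aid rather than a fix):

-- Disable sequential scans for the current session only,
-- run the problem query, then restore the planner's default.
SET enable_seqscan = off;
SELECT * FROM "LockerEvents" LIMIT 1 OFFSET 1099;
RESET enable_seqscan;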

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Scott
Marlowe
Sent: Tuesday, May 13, 2008 11:32 PM
To: idc danny
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Problem with 11 M records table

On Tue, May 13, 2008 at 10:57 AM, idc danny <[EMAIL PROTECTED]> wrote:
> Hi everybody,
>
> I'm fairly new to PostgreSQL and I have a problem with
> a query:
>
> SELECT * FROM "LockerEvents" LIMIT 1 OFFSET
> 1099
>
> The table LockerEvents has 11 million records in it
> and this query takes about 60 seconds to complete.
> Moreover, even after creating an index for each
> column in the table, EXPLAIN still shows a sequential
> scan instead of using the indexes.

Yep.  The way OFFSET/LIMIT works is that it first materializes the data
needed for OFFSET+LIMIT rows, then throws away the OFFSET rows' worth of data.
So it has to do a lot of retrieving.

You're better off using something like:

select * from table where indexfield between
1000 and 10001000 order by indexfield;

which can use an index on indexfield, as long as the amount of data is
small enough, etc...



Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread Scott Marlowe
On Tue, May 13, 2008 at 10:57 AM, idc danny <[EMAIL PROTECTED]> wrote:
> Hi everybody,
>
> I'm fairly new to PostgreSQL and I have a problem with
> a query:
>
> SELECT * FROM "LockerEvents" LIMIT 1 OFFSET
> 1099
>
> The table LockerEvents has 11 million records in it
> and this query takes about 60 seconds to complete.
> Moreover, even after creating an index for each
> column in the table, EXPLAIN still shows a sequential
> scan instead of using the indexes.

Yep.  The way OFFSET/LIMIT works is that it first materializes the data
needed for OFFSET+LIMIT rows, then throws away the OFFSET rows' worth of data.
So it has to do a lot of retrieving.

You're better off using something like:

select * from table where indexfield between
1000 and 10001000 order by indexfield;

which can use an index on indexfield, as long as the amount of data is
small enough, etc...
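
A concrete sketch of that range approach against the table from the original
post, assuming pages are walked by the indexed "ID" primary key (:last_id here
is a placeholder for the last "ID" of the previous page, not something from
the original mail):

-- Fetch the next 10k-row page after the last "ID" already shown;
-- this can walk idx_pk_lockerevents_id instead of scanning the table.
SELECT *
  FROM "LockerEvents"
 WHERE "ID" > :last_id
 ORDER BY "ID"
 LIMIT 10000;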



Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread Craig James

idc danny wrote:

Hi James,

Thank you for your response.

What I want to achieve is to give the application
user 10k rows at a time, where the records are consecutive
in the table. The application has a paginating GUI
("First page", "Previous page", "Next page", "Last
page" links plus a "Jump to page" combobox), and this
particular query runs when the user clicks
the "Last page" link.
The application receives the first 10k rows in under a
second when the user clicks the "First page" link, and
receives the last 10k rows in about 60 seconds when he
clicks the "Last page" link.


You need a sequence that automatically assigns an ascending "my_rownum" to each 
row as it is added to the table, and an index on that my_rownum column.  Then you select 
your page by (for example)

 select * from my_table where my_rownum >= 100 and my_rownum < 110;

That will do what you want, with consistently fast performance across your 
whole table.

If your table will have deletions, then you have to update the row numbering a lot, which will cause you 
terrible performance problems due to the nature of the UPDATE operation in Postgres.  If this is the case, 
then you should keep a separate table just for numbering the rows, which is joined to your main table when 
you want to retrieve a "page" of data.  When you delete data (which should be batched, since this 
will be expensive), then you truncate your rownum table, reset the sequence that generates your row numbers, 
then regenerate your row numbers with something like "insert into my_rownum_table (select id, 
nextval('my_rownum_seq') from my_big_table)".  To retrieve a page, just do "select ... from 
my_table join my_rownum_table on (...)", which will be really fast since you'll have indexes on both 
tables.

Note that this method requires that you have a primary key, or at least a 
unique column, on your main table, so that you have something to join with your 
row-number table.
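
Putting Craig's pieces together, a sketch of the whole scheme (my_big_table,
my_rownum_table and my_rownum_seq are the illustrative names from the snippets
above, not an existing schema):

-- One-time setup: a sequence plus a side table mapping keys to row numbers.
CREATE SEQUENCE my_rownum_seq;
CREATE TABLE my_rownum_table (id integer, rownum bigint);
CREATE INDEX idx_my_rownum ON my_rownum_table (rownum);

-- After a batch of deletes: wipe and regenerate the numbering.
TRUNCATE my_rownum_table;
ALTER SEQUENCE my_rownum_seq RESTART WITH 1;
INSERT INTO my_rownum_table
  SELECT id, nextval('my_rownum_seq') FROM my_big_table;

-- Retrieve one "page" by joining back to the main table.
SELECT t.*
  FROM my_big_table t
  JOIN my_rownum_table r ON (r.id = t.id)
 WHERE r.rownum >= 100 AND r.rownum < 110;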

Craig



Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread Craig James

idc danny wrote:

Hi everybody,

I'm fairly new to PostgreSQL and I have a problem with
a query:

SELECT * FROM "LockerEvents" LIMIT 1 OFFSET
1099

The table LockerEvents has 11 million records in it
and this query takes about 60 seconds to complete.


The OFFSET clause is almost always inefficient for anything but very small 
tables or small offsets.  In order for a relational database (not just 
Postgres) to figure out which row is the 1100th row, it has to actually 
retrieve the first 1099 rows and discard them.  There is no magical way 
to go directly to the 11-millionth row.  Even on a trivial query such as yours 
with no WHERE clause, the only way to determine which row is the 11-millionth 
is to scan the previous 1099.

There are better (faster) ways to achieve this, but it depends on why you are 
doing this query.  That is, do you just want this one block of data, or are you 
scanning the whole database in 10,000-row blocks?

Craig



Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread salman



idc danny wrote:

Hi everybody,

I'm fairly new to PostgreSQL and I have a problem with
a query:

SELECT * FROM "LockerEvents" LIMIT 1 OFFSET
1099

The table LockerEvents has 11 million records in it
and this query takes about 60 seconds to complete.
Moreover, even after creating an index for each
column in the table, EXPLAIN still shows a sequential
scan instead of using the indexes.

The EXPLAIN is:
"Limit  (cost=100245579.54..100245803.00 rows=1
width=60) (actual time=58414.753..58482.661 rows=1
loops=1)"
"  ->  Seq Scan on "LockerEvents" 
(cost=1.00..100245803.00 rows=1100

width=60) (actual time=12.620..45463.222 rows=1100
loops=1)"
"Total runtime: 58493.648 ms"

The table is:

CREATE TABLE "LockerEvents"
(
  "ID" serial NOT NULL,
  "IDMoneySymbol" integer NOT NULL,
  "IDLocker" integer NOT NULL,
  "IDUser" integer NOT NULL,
  "IDEventType" integer NOT NULL,
  "TimeBegin" timestamp(0) without time zone NOT NULL,
  "Notes" character varying(200),
  "Income" double precision NOT NULL DEFAULT 0,
  "IncomeWithRate" double precision NOT NULL DEFAULT
0,
  CONSTRAINT pk_lockerevents_id PRIMARY KEY ("ID"),
  CONSTRAINT fk_lockerevents_ideventtype_eventtypes_id
FOREIGN KEY ("IDEventType")
  REFERENCES "EventTypes" ("ID") MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fk_lockerevents_idlocker_lockers_id
FOREIGN KEY ("IDLocker")
  REFERENCES "Lockers" ("ID") MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT
fk_lockerevents_idmoneysymbol_moneysymbols_id FOREIGN
KEY ("IDMoneySymbol")
  REFERENCES "MoneySymbols" ("ID") MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fk_lockerevents_iduser_users_id FOREIGN
KEY ("IDUser")
  REFERENCES "Users" ("ID") MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (OIDS=FALSE);


CREATE INDEX idx_col_lockerevents_income
  ON "LockerEvents"
  USING btree
  ("Income");

CREATE INDEX idx_col_lockerevents_incomewithrate
  ON "LockerEvents"
  USING btree
  ("IncomeWithRate");

CREATE INDEX idx_col_lockerevents_notes
  ON "LockerEvents"
  USING btree
  ("Notes");

CREATE INDEX idx_col_lockerevents_timebegin
  ON "LockerEvents"
  USING btree
  ("TimeBegin");

CREATE INDEX
idx_fk_lockerevents_ideventtype_eventtypes_id
  ON "LockerEvents"
  USING btree
  ("IDEventType");

CREATE INDEX idx_fk_lockerevents_idlocker_lockers_id
  ON "LockerEvents"
  USING btree
  ("IDLocker");

CREATE INDEX
idx_fk_lockerevents_idmoneysymbol_moneysymbols_id
  ON "LockerEvents"
  USING btree
  ("IDMoneySymbol");

CREATE INDEX idx_fk_lockerevents_iduser_users_id
  ON "LockerEvents"
  USING btree
  ("IDUser");

CREATE UNIQUE INDEX idx_pk_lockerevents_id
  ON "LockerEvents"
  USING btree
  ("ID");


If I do the query:
SELECT * FROM "LockerEvents" LIMIT 1 OFFSET 0
then this query takes under a second to complete - I
believe this is because the sequential scan starts
from the beginning.

I need the query to complete under 10 seconds and I do
not know how to do it. 
Please help me!


Thank you,
Danny



I recall it being mentioned on one of these lists that with OFFSET, all 
the rows in between still have to be read.  So you may get better 
results if you use a 'where id > 1' clause in the query.


-salman




Re: [PERFORM] Problem with 11 M records table

2008-05-13 Thread Bill Moran
In response to idc danny <[EMAIL PROTECTED]>:

> Hi everybody,
> 
> I'm fairly new to PostgreSQL and I have a problem with
> a query:
> 
> SELECT * FROM "LockerEvents" LIMIT 1 OFFSET
> 1099

This query makes no sense, and I can't blame PostgreSQL for using a
seq scan, since you've given it no reason to do otherwise.  Without an
ORDER BY, this query gives you what amounts to a random sampling of
rows, and the LIMIT/OFFSET are simply junk that confuses the query
planner.  If you really do want a random sampling, you should
construct your query more to that effect.

I suspect that you don't really want a random sampling of rows, although
I can't imagine what you think you're going to get from that query.
Have you tried putting an ORDER BY clause in?
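
For instance (a sketch; the 10000-row page size matches the 10k-row pages
described elsewhere in this thread):

-- ORDER BY a unique indexed column makes the row order deterministic,
-- so LIMIT/OFFSET pages are at least stable; large offsets stay expensive.
SELECT * FROM "LockerEvents" ORDER BY "ID" LIMIT 10000 OFFSET 0;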

> 
> The table LockerEvents has 11 million records in it
> and this query takes about 60 seconds to complete.
> Moreover, even after creating an index for each
> column in the table, EXPLAIN still shows a sequential
> scan instead of using the indexes.
> 
> The EXPLAIN is:
> "Limit  (cost=100245579.54..100245803.00 rows=1
> width=60) (actual time=58414.753..58482.661 rows=1
> loops=1)"
> "  ->  Seq Scan on "LockerEvents" 
> (cost=1.00..100245803.00 rows=1100
> width=60) (actual time=12.620..45463.222 rows=1100
> loops=1)"
> "Total runtime: 58493.648 ms"
> 
> The table is:
> 
> CREATE TABLE "LockerEvents"
> (
>   "ID" serial NOT NULL,
>   "IDMoneySymbol" integer NOT NULL,
>   "IDLocker" integer NOT NULL,
>   "IDUser" integer NOT NULL,
>   "IDEventType" integer NOT NULL,
>   "TimeBegin" timestamp(0) without time zone NOT NULL,
>   "Notes" character varying(200),
>   "Income" double precision NOT NULL DEFAULT 0,
>   "IncomeWithRate" double precision NOT NULL DEFAULT
> 0,
>   CONSTRAINT pk_lockerevents_id PRIMARY KEY ("ID"),
>   CONSTRAINT fk_lockerevents_ideventtype_eventtypes_id
> FOREIGN KEY ("IDEventType")
>   REFERENCES "EventTypes" ("ID") MATCH SIMPLE
>   ON UPDATE NO ACTION ON DELETE NO ACTION,
>   CONSTRAINT fk_lockerevents_idlocker_lockers_id
> FOREIGN KEY ("IDLocker")
>   REFERENCES "Lockers" ("ID") MATCH SIMPLE
>   ON UPDATE NO ACTION ON DELETE NO ACTION,
>   CONSTRAINT
> fk_lockerevents_idmoneysymbol_moneysymbols_id FOREIGN
> KEY ("IDMoneySymbol")
>   REFERENCES "MoneySymbols" ("ID") MATCH SIMPLE
>   ON UPDATE NO ACTION ON DELETE NO ACTION,
>   CONSTRAINT fk_lockerevents_iduser_users_id FOREIGN
> KEY ("IDUser")
>   REFERENCES "Users" ("ID") MATCH SIMPLE
>   ON UPDATE NO ACTION ON DELETE NO ACTION
> )
> WITH (OIDS=FALSE);
> 
> 
> CREATE INDEX idx_col_lockerevents_income
>   ON "LockerEvents"
>   USING btree
>   ("Income");
> 
> CREATE INDEX idx_col_lockerevents_incomewithrate
>   ON "LockerEvents"
>   USING btree
>   ("IncomeWithRate");
> 
> CREATE INDEX idx_col_lockerevents_notes
>   ON "LockerEvents"
>   USING btree
>   ("Notes");
> 
> CREATE INDEX idx_col_lockerevents_timebegin
>   ON "LockerEvents"
>   USING btree
>   ("TimeBegin");
> 
> CREATE INDEX
> idx_fk_lockerevents_ideventtype_eventtypes_id
>   ON "LockerEvents"
>   USING btree
>   ("IDEventType");
> 
> CREATE INDEX idx_fk_lockerevents_idlocker_lockers_id
>   ON "LockerEvents"
>   USING btree
>   ("IDLocker");
> 
> CREATE INDEX
> idx_fk_lockerevents_idmoneysymbol_moneysymbols_id
>   ON "LockerEvents"
>   USING btree
>   ("IDMoneySymbol");
> 
> CREATE INDEX idx_fk_lockerevents_iduser_users_id
>   ON "LockerEvents"
>   USING btree
>   ("IDUser");
> 
> CREATE UNIQUE INDEX idx_pk_lockerevents_id
>   ON "LockerEvents"
>   USING btree
>   ("ID");
> 
> 
> If I do the query :
> SELECT * FROM "LockerEvents" LIMIT 1 OFFSET 0
> then this query takes under a second to complete - I
> believe this is because the sequential scan starts
> from beginning.
> 
> I need the query to complete under 10 seconds and I do
> not know how to do it. 
> Please help me!
> 
> Thank you,
> Danny


-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023

