Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Andrew Sullivan
On Thu, Apr 03, 2014 at 10:34:32AM -0300, Leonardo M. Ramé wrote:
 
 What strategy for showing the total number of records returned do you
 recommend?.

The best answer for this I've ever seen is to limit the number of rows
you're counting (at least at first) to some reasonably small number --
say 5000.  This is usually reasonably fast for a well-indexed query,
and your pagination can say something like First n of at least 5000
results, unless you have fewer than 5000 results, in which case you
know the number (and the count returned quickly anyway).  As you're
displaying those first 5000 results, you can work in the background
getting a more accurate number.  This is more work for your
application, but it provides a much better user experience (and you
can delay getting the detailed number until the user pages through to
the second page of results, so you don't count everything needlessly
in case the user just uses the first page, which IME happens a lot).
Note that even Google doesn't give you an accurate number -- they just
say about ten trillion or whatever.

Hope that's useful,

A

-- 
Andrew Sullivan
a...@crankycanuck.ca


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Alban Hertroys
On 3 April 2014 15:34, Leonardo M. Ramé l.r...@griensu.com wrote:
 Hi, in one of our systems, we added a kind of pagination feature, that
 shows N records of Total records.

 To do this, we added a count(*) over() as Total field in our queries
 in replacement of doing two queries, one for fetching the records, and
 other for getting the count. This improved the performance, but we are't
 happy with the results yet, by removing the count, the query takes
 200ms vs 2000ms with it.

 We are thinking of removing the count, but if we do that, the system
 will lack an interesting feature.

 What strategy for showing the total number of records returned do you
 recommend?.

Assuming your results are unique (and what would be the point of
showing duplicate results?) in a specific order, it should be possible
to (uniquely) identify the last record shown on a previous page and
display n results from there on.

To add a result counter for displaying purposes to that, since you
need to remember the last displayed result anyway, you might as well
store the relative record number of that result with it and count from
there on.

This works well if you only have a next/previous results link, but not
so well when people can pick arbitrary page numbers. It can work
relative to the current page regardless of which page the user chose
to navigate to next, but you'll have to go through all the results in
between that page and the current page... That should still be faster
than always counting from the start though (and you can be smart about
from which end you start counting).

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Dorian Hoxha
Cache the total ?


On Thu, Apr 3, 2014 at 3:34 PM, Leonardo M. Ramé l.r...@griensu.com wrote:

 Hi, in one of our systems, we added a kind of pagination feature, that
 shows N records of Total records.

 To do this, we added a count(*) over() as Total field in our queries
 in replacement of doing two queries, one for fetching the records, and
 other for getting the count. This improved the performance, but we are't
 happy with the results yet, by removing the count, the query takes
 200ms vs 2000ms with it.

 We are thinking of removing the count, but if we do that, the system
 will lack an interesting feature.

 What strategy for showing the total number of records returned do you
 recommend?.

 Regards,
 --
 Leonardo M. Ramé
 Medical IT - Griensu S.A.
 Av. Colón 636 - Piso 8 Of. A
 X5000EPT -- Córdoba
 Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
 Cel.: +54 9 (011) 40871877



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Torsten Förtsch
On 03/04/14 15:34, Leonardo M. Ramé wrote:
 Hi, in one of our systems, we added a kind of pagination feature, that
 shows N records of Total records.
 
 To do this, we added a count(*) over() as Total field in our queries
 in replacement of doing two queries, one for fetching the records, and
 other for getting the count. This improved the performance, but we are't
 happy with the results yet, by removing the count, the query takes
 200ms vs 2000ms with it.
 
 We are thinking of removing the count, but if we do that, the system
 will lack an interesting feature.
 
 What strategy for showing the total number of records returned do you
 recommend?.

If you need only an estimated number and if your planner statistics are
up to date, you can use the planner.

Here is my implementation of the explain function. The COMMENT below
shows how to use it:

CREATE OR REPLACE FUNCTION explain(VARIADIC TEXT[])
RETURNS JSON AS $$
DECLARE
tmp TEXT;
BEGIN
EXECUTE 'EXPLAIN ('
 || array_to_string(array_append($1[2:array_upper($1, 1)],
'FORMAT JSON'), ', ')
 || ') '
 || $1[1] INTO tmp;
RETURN tmp::JSON;
END;
$$ LANGUAGE plpgsql VOLATILE;

COMMENT ON FUNCTION explain(VARIADIC TEXT[])
IS $def$
This function is a SQL interface to the planner. It returns the plan
(result of EXPLAIN) of the query passed as TEXT string as the first
parameter as JSON object.

The remaining parameters are EXPLAIN-modifiers, like ANALYZE or
BUFFERS.

The function can be used to store plans in the database.

Another interesting usage is when you need only an estimated row
count for a query. You can use

  SELECT count(*) ...

This gives you an exact number but is usually slow. If your planner
statistics are up to date and the query is not too complicated, the
planner usually gives a good estimate and is much faster.

  SELECT explain('SELECT 1 FROM tb WHERE id8000')
   -0-'Plan'-'Plan Rows';
$def$;


Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Leonardo M . Ramé
On 2014-04-03 10:00:18 -0400, Andrew Sullivan wrote:
 On Thu, Apr 03, 2014 at 10:34:32AM -0300, Leonardo M. Ramé wrote:
  
  What strategy for showing the total number of records returned do you
  recommend?.
 
 The best answer for this I've ever seen is to limit the number of rows
 you're counting (at least at first) to some reasonably small number --
 say 5000.  This is usually reasonably fast for a well-indexed query,
 and your pagination can say something like First n of at least 5000
 results, unless you have fewer than 5000 results, in which case you
 know the number (and the count returned quickly anyway).  As you're
 displaying those first 5000 results, you can work in the background
 getting a more accurate number.  This is more work for your
 application, but it provides a much better user experience (and you
 can delay getting the detailed number until the user pages through to
 the second page of results, so you don't count everything needlessly
 in case the user just uses the first page, which IME happens a lot).
 Note that even Google doesn't give you an accurate number -- they just
 say about ten trillion or whatever.
 
 Hope that's useful,
 
 A
 

Sounds nice, is it possible to modify my count(*) over() to what you
suggest?.

-- 
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Leonardo M . Ramé
On 2014-04-03 17:19:56 +0200, Torsten Förtsch wrote:
 On 03/04/14 15:34, Leonardo M. Ramé wrote:
  Hi, in one of our systems, we added a kind of pagination feature, that
  shows N records of Total records.
  
  To do this, we added a count(*) over() as Total field in our queries
  in replacement of doing two queries, one for fetching the records, and
  other for getting the count. This improved the performance, but we are't
  happy with the results yet, by removing the count, the query takes
  200ms vs 2000ms with it.
  
  We are thinking of removing the count, but if we do that, the system
  will lack an interesting feature.
  
  What strategy for showing the total number of records returned do you
  recommend?.
 
 If you need only an estimated number and if your planner statistics are
 up to date, you can use the planner.
 
 Here is my implementation of the explain function. The COMMENT below
 shows how to use it:
 
 CREATE OR REPLACE FUNCTION explain(VARIADIC TEXT[])
 RETURNS JSON AS $$
 DECLARE
 tmp TEXT;
 BEGIN
 EXECUTE 'EXPLAIN ('
  || array_to_string(array_append($1[2:array_upper($1, 1)],
 'FORMAT JSON'), ', ')
  || ') '
  || $1[1] INTO tmp;
 RETURN tmp::JSON;
 END;
 $$ LANGUAGE plpgsql VOLATILE;
 
 COMMENT ON FUNCTION explain(VARIADIC TEXT[])
 IS $def$
 This function is a SQL interface to the planner. It returns the plan
 (result of EXPLAIN) of the query passed as TEXT string as the first
 parameter as JSON object.
 
 The remaining parameters are EXPLAIN-modifiers, like ANALYZE or
 BUFFERS.
 
 The function can be used to store plans in the database.
 
 Another interesting usage is when you need only an estimated row
 count for a query. You can use
 
   SELECT count(*) ...
 
 This gives you an exact number but is usually slow. If your planner
 statistics are up to date and the query is not too complicated, the
 planner usually gives a good estimate and is much faster.
 
   SELECT explain('SELECT 1 FROM tb WHERE id8000')
-0-'Plan'-'Plan Rows';
 $def$;
 
 
 Torsten

Nice!, do you know if this will work on 8.4?.

-- 
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Andrew Sullivan
On Thu, Apr 03, 2014 at 12:44:23PM -0300, Leonardo M. Ramé wrote:
 Sounds nice, is it possible to modify my count(*) over() to what you
 suggest?.

I think the window_definition inside over() can contain a LIMIT, can't
it?  I didn't check just now, but I can't think any reason why not.
(ISTR when I did this in the past we didn't have window functions, so
I simulated it another way.)

A

-- 
Andrew Sullivan
a...@crankycanuck.ca


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Torsten Förtsch
On 03/04/14 17:44, Leonardo M. Ramé wrote:
 Nice!, do you know if this will work on 8.4?.
 
no way


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general