Re: [GENERAL] Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr

2014-08-06 Thread David Johnston
On Wed, Aug 6, 2014 at 10:08 AM, Jeff Janes  wrote:

> On Tue, Aug 5, 2014 at 4:30 PM, David G Johnston <
> david.g.johns...@gmail.com> wrote:
>
>>
>> Anyway, you should probably experiment with creating a multi-column index
>> instead of allowing PostgreSQL to BitmapAnd them together.  Likely the
>> timestamp will have higher cardinality and so should be listed first in
>> the
>> index.
>
>
> No, the timestamp should almost certainly come second because it is used
> with inequality operators.
>
>
​Wouldn't that only matter if a typical inequality was expected to return
more rows than a given equality on the other field?  Depending on the
cardinality of the ID field I would expect a very large range of dates to
be required before digging down into ID becomes more effective.  My
instinct say there are relatively few IDs in play but that they are
continually adding new rows.

What statistics would the OP have to provide in order to actually make a
fact-based determination?

David J​.


Re: [GENERAL] Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr

2014-08-06 Thread Jeff Janes
On Tue, Aug 5, 2014 at 4:30 PM, David G Johnston  wrote:

>
> NOTE: I am confused by this line:
> ->  BitmapAnd  (cost=291564.31..291564.31 rows=28273 width=0) (actual
> time=23843.870..23843.870 rows=0 loops=1)
>
> How did actual match zero rows?  It should be something like 2.2M
>

The accounting for bitmap operations seems to be a bit of a mess.  In some
cases, it reports the number of rows represented in the bitmap.  Sometimes
it counts a bitmap itself as a row, and so there is just one of them no
matter how many rows it represents.  In this case, it seems to consider a
bitmap not to be a row at all.   The problem with counting the number of
rows represented by the bitmap is that that value is unknown if either if
the input bitmaps has gone lossy.



> Anyway, you should probably experiment with creating a multi-column index
> instead of allowing PostgreSQL to BitmapAnd them together.  Likely the
> timestamp will have higher cardinality and so should be listed first in the
> index.


No, the timestamp should almost certainly come second because it is used
with inequality operators.

Cheers,

Jeff