The GROUP BY was the fastest method.
Thanks for the suggestions,
Keaton
On 2/15/08 3:12 PM, "Gregory Stark" <[EMAIL PROTECTED]> wrote:
> "Keaton Adams" <[EMAIL PROTECTED]> writes:
>
>> Version: Postgres 8.1.4
>> Platform: RHEL
>>
>> Given this scenario with the indexes in place, when I ask for the distinct
>> field1_id values, why does the optimizer choose a sequential scan instead of
>> just reading from the kda_log_fid_cre_20080123_idx index? The time it takes
>> to perform the sequential scan against 20+ million records is way too slow.
>
> Try (temporarily) doing:
>
> SET enable_seqscan = off;
>
>> keaton=# explain select distinct field1_id into temp kda_temp from
>> kda_log_20080213;
>
> If the database is right, that will be even slower. Using a full index scan
> requires a lot of random-access seeks; generally, the larger the table, the
> *more* likely it is that a sequential scan and sort will beat an index scan.
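
[A hedged sketch of the experiment being suggested, using the table and
column names from the original post (kda_log_20080213, field1_id); the
SET only affects the current session:]

```sql
-- Compare the plans with and without sequential scans allowed.
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT DISTINCT field1_id FROM kda_log_20080213;

SET enable_seqscan = on;   -- restore the default
EXPLAIN ANALYZE SELECT DISTINCT field1_id FROM kda_log_20080213;
```

Comparing the two EXPLAIN ANALYZE timings shows whether the planner's
preference for the sequential scan is actually justified on this data.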
>
> If it's wrong and the index scan is faster, then you have to consider whether
> it's only faster because you've already read the table into cache. Will it be
> in cache in production? If so, then you might try raising effective_cache_size
> or lowering random_page_cost.
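
[Another hedged sketch; the values below are purely illustrative and must
be tuned to the actual machine. Note that on 8.1, effective_cache_size is
expressed in 8 kB pages (262144 pages is roughly 2 GB):]

```sql
-- Session-level experiment with the two planner settings mentioned above.
SET effective_cache_size = 262144;  -- illustrative: assume ~2 GB of OS cache
SET random_page_cost = 2.0;         -- default is 4.0; lower values favor index scans

EXPLAIN ANALYZE SELECT DISTINCT field1_id FROM kda_log_20080213;
```

If the experiment produces a better plan, the same settings can be made
permanent in postgresql.conf.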
>
> Another thing to try is using GROUP BY instead of DISTINCT. This is one case
> where the Postgres optimizer doesn't handle two equivalent queries in
> exactly the same way, and there are some plans available to one form that
> aren't available to the other. That's only likely to help if you have
> relatively few distinct values of field1_id, but it's worth trying.
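
[A sketch of that rewrite, again using the names from the original post.
In 8.1, DISTINCT is implemented with a sort, while GROUP BY can use a
HashAggregate plan, which avoids sorting 20+ million rows when the number
of distinct values is small:]

```sql
-- The DISTINCT query rewritten as a GROUP BY.
EXPLAIN ANALYZE
SELECT field1_id
FROM kda_log_20080213
GROUP BY field1_id;
```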