On 12/10/2013 01:45, BladeOfLight16 wrote:
On Fri, Oct 11, 2013 at 9:32 AM, Jesse Long <j...@unknown.za.net
<mailto:j...@unknown.za.net>> wrote:
explain select * from archive_document_index where node_id = 29
and value = 'BSH70002152';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Index Scan using archive_document_index_node_id_value_idx on
archive_document_index (cost=0.57..36.13 rows=14 width=33)
Index Cond: ((node_id = 29) AND ((value)::text =
'BSH70002152'::text))
(2 rows)
I believe that this is what Tom is suggesting:
SELECT *
FROM ARCHIVE_DOCUMENT AS ad
WHERE ad.NODE_ID = 29
AND ad.ARCHIVE_DATE >= '2013-07-08 18:28:00'
AND EXISTS (SELECT *
FROM ARCHIVE_DOCUMENT_INDEX AS adi
WHERE adi.ARCHIVE_ID = ad.ID
AND adi.NODE_ID = ad.NODE_ID
AND (adi.VALUE = 'BSH70002152' OR adi.VALUE = 'TC212592')
)
ORDER BY ad.ARCHIVE_DATE DESC
LIMIT 10;
Forgive my changing of the aliases. I found names like r0 difficult to
interpret easily.
This is a much simpler query since it only requires one subselect, and
I believe Tom is suggesting that this query may be able to make use of
the index or at least find some more efficient plan. How does this
perform?
Indeed it does perform very much better. With the modified query the
plan and the execution time are excellent.
However, my previous question remains - in the original query plan,
there are two hash tables being populated using seqscan + filter. During
each seqscan, over 95 million records were wastefully read. I think that
this could have been dramatically improved by using an index scan. Am I
mistaken? Is it a optimisation that could be implemented but has not
been implemented yet? Is it a bug?
Thanks,
Jesse