On 12/10/2013 01:45, BladeOfLight16 wrote:
On Fri, Oct 11, 2013 at 9:32 AM, Jesse Long <j...@unknown.za.net> wrote:

    explain select * from archive_document_index where node_id = 29 and value = 'BSH70002152';
                                                            QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------
     Index Scan using archive_document_index_node_id_value_idx on archive_document_index  (cost=0.57..36.13 rows=14 width=33)
       Index Cond: ((node_id = 29) AND ((value)::text = 'BSH70002152'::text))
    (2 rows)


I believe that this is what Tom is suggesting:

SELECT *
FROM ARCHIVE_DOCUMENT AS ad
WHERE ad.NODE_ID = 29
  AND ad.ARCHIVE_DATE >= '2013-07-08 18:28:00'
  AND EXISTS (SELECT *
              FROM ARCHIVE_DOCUMENT_INDEX AS adi
              WHERE adi.ARCHIVE_ID = ad.ID
                AND adi.NODE_ID = ad.NODE_ID
                AND (adi.VALUE = 'BSH70002152' OR adi.VALUE = 'TC212592')
             )
ORDER BY ad.ARCHIVE_DATE DESC
LIMIT 10;

Forgive my changing of the aliases. I found names like r0 difficult to interpret easily.

This is a much simpler query since it only requires one subselect, and I believe Tom is suggesting that this query may be able to make use of the index or at least find some more efficient plan. How does this perform?
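
One way to check is to run the rewritten query under EXPLAIN with the ANALYZE and BUFFERS options, which reports the plan actually chosen along with real row counts, timings and buffer usage. This is only a sketch: it assumes PostgreSQL 9.0 or later (for the parenthesised option syntax) and the same table and column names as the query above.

```sql
-- Note: EXPLAIN ANALYZE really executes the query, so the timing is genuine.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM ARCHIVE_DOCUMENT AS ad
WHERE ad.NODE_ID = 29
  AND ad.ARCHIVE_DATE >= '2013-07-08 18:28:00'
  AND EXISTS (SELECT *
              FROM ARCHIVE_DOCUMENT_INDEX AS adi
              WHERE adi.ARCHIVE_ID = ad.ID
                AND adi.NODE_ID = ad.NODE_ID
                AND (adi.VALUE = 'BSH70002152' OR adi.VALUE = 'TC212592')
             )
ORDER BY ad.ARCHIVE_DATE DESC
LIMIT 10;
```

If the output shows an index scan or bitmap index scan on archive_document_index_node_id_value_idx instead of a seq scan on archive_document_index, the planner is making use of the index.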

Indeed it does perform very much better. With the modified query the plan and the execution time are excellent.

However, my previous question remains: in the original query plan, two hash tables are populated using a seqscan plus filter. Each seqscan wastefully read over 95 million records. I think this could have been dramatically improved by using an index scan instead. Am I mistaken? Is it an optimisation that could be implemented but has not been implemented yet? Is it a bug?

Thanks,
Jesse
