I should clarify that this by itself is not the improvement. The refinements correspond to other, more complex algorithms; at the moment I just want to understand which part of the PostgreSQL code does what. I have also implemented Replacement Selection (RS), so if I am able to integrate my RS I hope I will be able to integrate the others too.

Anyway, even in my RS implementation longer runs are created. The first M initialization elements will certainly form part of the current run; M is the memory size, so a run of at least size M is always produced. After initialization the elements are not output all at once: one heap element is output to the run for each element read from the stream. In other words, for each element read from the stream, the root of the heap is output and the input element takes the root's place in the heap. If that element is a "good record" I just heapify (since the element is placed at the now-free root). If the input element is a dead record, I swap it with the last leaf and reduce the heap size.
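To make this concrete, here is a minimal C sketch of the per-element step and the run boundary just described. This is not the PostgreSQL tuplesort code: the int payloads, the fixed workspace size M, and the emit() routine are hypothetical stand-ins for the real structures.

#include <stdio.h>

#define M 1024                  /* workspace capacity (the "memory size") */

static int heap[M];             /* heap[0..nheap-1] live, heap[nheap..M-1] dead */
static int nheap = M;           /* number of live heap entries */

static void emit(int run, int value)
{
    printf("run %d: %d\n", run, value);  /* stand-in for writing to a run */
}

static void sift_down(int i)
{
    for (;;)
    {
        int child = 2 * i + 1;

        if (child >= nheap)
            break;
        if (child + 1 < nheap && heap[child + 1] < heap[child])
            child++;
        if (heap[i] <= heap[child])
            break;
        int tmp = heap[i]; heap[i] = heap[child]; heap[child] = tmp;
        i = child;
    }
}

/*
 * Consume one stream element once the workspace is full: output the root,
 * then either replace it (good record) or park the newcomer as a dead
 * leaf and shrink the live heap (dead record).
 */
static void step(int run, int next)
{
    emit(run, heap[0]);
    if (next >= heap[0])
    {
        heap[0] = next;         /* good record: joins the current run */
        sift_down(0);
    }
    else
    {
        nheap--;                /* dead record: belongs to the next run */
        heap[0] = heap[nheap];  /* promote last live leaf to the root */
        heap[nheap] = next;     /* park the dead record past the heap end */
        sift_down(0);
    }
}

/* When nheap hits zero, the M dead records become the next run's heap. */
static void start_new_run(void)
{
    nheap = M;
    for (int i = M / 2 - 1; i >= 0; i--)
        sift_down(i);
}

int main(void)
{
    /* toy demo: fill the workspace, build the heap, stream two values */
    for (int i = 0; i < M; i++)
        heap[i] = M - i;
    start_new_run();            /* builds the initial heap of M elements */
    step(1, 500);               /* 500 >= root: good record, stays in run 1 */
    step(1, 1);                 /* 1 < root: parked as a dead record */
    return 0;
}

The re-heapify in start_new_run() at each run boundary is where the extra comparisons questioned in the reply quoted below come from.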

--------------------------------------------------
From: "Tom Lane" <[EMAIL PROTECTED]>
Sent: Monday, November 26, 2007 7:31 PM
To: <[EMAIL PROTECTED]>
Cc: <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Replacement Selection

<[EMAIL PROTECTED]> writes:
3) Start run generation. For this phase, I see the PostgreSQL code (like Knuth's algorithm) marks elements with the run they belong to, in order to know which run each one belongs to and when the heap has finished building the current run. I don't store that kind of info. I just output from the heap to the run all of the elements going into the current run. The elements supposed to go into the next run (I call them "dead records") stay in main memory, but as leaves of the heap. This implies reducing the heap size, so a smaller number of elements is heapified each time I get a dead record (there is no need to sort dead records). When the heap size reaches zero, a new run is created by heapifying all the dead records currently present in main memory.

Why would this be an improvement over Knuth?  AFAICS you can't generate
longer runs this way, and it's not saving any time --- in fact it's
costing time, because re-heapifying adds a lot of new comparisons.

regards, tom lane

