Re: [PATCHES] 2WRS [WIP]
We need more testing to show this is a good idea. --- Manolo _ wrote: > > HI. > > I send you the diff of my code against the current CVS TIP. > Please tell me if it's what you were asking for. > > Thanks. > > Regards, Manolo di Domenico > > > > Date: Wed, 6 Feb 2008 17:03:16 -0800 > > From: [EMAIL PROTECTED] > > To: [EMAIL PROTECTED] > > Subject: Re: [PATCHES] 2WRS [WIP] > > > > > > Go here and snoop around a bit. > > > > http://neilconway.org/talks/hacking > _ > Express yourself instantly with MSN Messenger! Download today it's FREE! > http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ [ Attachment, skipping... ] > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings -- Bruce Momjian <[EMAIL PROTECTED]>http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [HACKERS] [PATCHES] 2WRS [WIP]
Referring to tuplesort.c andtuplestore.c BACKGROUND: Starting from dumptuples() [ tuplesort.c ] write functions move the tuple from a buffer to another in order to finally write it in a logical tape. Is there a way (even the most inefficient way) to use current read/write functions provided by PostgreSQL in order to retrieve the first tuple of a certain run while performing External Sorting? NOTE: I need the first tuple in order to manipulate the whole corresponding run, tuple by tuple since they are written sequentially in a run. Thanks for your attention. Regards, Manolo. -- From: <[EMAIL PROTECTED]> Sent: Tuesday, February 26, 2008 4:10 PM To: "Jaime Casanova" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Cc: "Decibel!" <[EMAIL PROTECTED]>; "David Fetter" <[EMAIL PROTECTED]>; ; <[EMAIL PROTECTED]> Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] For the joy of all of you: that's the correct WIP patch. At the moment it only tries to create runs uding two heaps. Hope you can help me with writing those runs on tapes. I'd be very pleased to give you more details. Thenks for your time. Regards, Manolo. -- From: "Jaime Casanova" <[EMAIL PROTECTED]> Sent: Friday, February 22, 2008 5:30 AM To: <[EMAIL PROTECTED]> Cc: "Decibel!" <[EMAIL PROTECTED]>; "Manolo _" <[EMAIL PROTECTED]>; "David Fetter" <[EMAIL PROTECTED]>; ; <[EMAIL PROTECTED]> Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] On Thu, Feb 21, 2008 at 6:44 AM, <[EMAIL PROTECTED]> wrote: Hi. That's the last release and refers to 8.3.0 and not to 8.2.5 as before. Hope you can tell me if I created it correctly please. no, it doesn't... ! /* GUC variables */ #ifdef TRACE_SORT bool trace_sort = false; #endif - #ifdef DEBUG_BOUNDED_SORT - bool optimize_bounded_sort = true; - #endif it's seems you're removing something added in 8.3 -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] 2WRS [WIP]
For the joy of all of you: that's the correct WIP patch. At the moment it only tries to create runs uding two heaps. Hope you can help me with writing those runs on tapes. I'd be very pleased to give you more details. Thenks for your time. Regards, Manolo. -- From: "Jaime Casanova" <[EMAIL PROTECTED]> Sent: Friday, February 22, 2008 5:30 AM To: <[EMAIL PROTECTED]> Cc: "Decibel!" <[EMAIL PROTECTED]>; "Manolo _" <[EMAIL PROTECTED]>; "David Fetter" <[EMAIL PROTECTED]>; ; <[EMAIL PROTECTED]> Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] On Thu, Feb 21, 2008 at 6:44 AM, <[EMAIL PROTECTED]> wrote: Hi. That's the last release and refers to 8.3.0 and not to 8.2.5 as before. Hope you can tell me if I created it correctly please. no, it doesn't... ! /* GUC variables */ #ifdef TRACE_SORT bool trace_sort = false; #endif - #ifdef DEBUG_BOUNDED_SORT - bool optimize_bounded_sort = true; - #endif it's seems you're removing something added in 8.3 -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match tuplesort.patch Description: Binary data ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] 2WRS [WIP]
For the joy of all of you: that's the correct WIP patch. At the moment it only tries to create runs uding two heaps. Hope you can help me with writing those runs on tapes. I'd be very pleased to give you more details. Thenks for your time. Regards, Manolo. -- From: "Jaime Casanova" <[EMAIL PROTECTED]> Sent: Friday, February 22, 2008 5:30 AM To: <[EMAIL PROTECTED]> Cc: "Decibel!" <[EMAIL PROTECTED]>; "Manolo _" <[EMAIL PROTECTED]>; "David Fetter" <[EMAIL PROTECTED]>; ; <[EMAIL PROTECTED]> Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] On Thu, Feb 21, 2008 at 6:44 AM, <[EMAIL PROTECTED]> wrote: Hi. That's the last release and refers to 8.3.0 and not to 8.2.5 as before. Hope you can tell me if I created it correctly please. no, it doesn't... ! /* GUC variables */ #ifdef TRACE_SORT bool trace_sort = false; #endif - #ifdef DEBUG_BOUNDED_SORT - bool optimize_bounded_sort = true; - #endif it's seems you're removing something added in 8.3 -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match tuplesort.patch Description: Binary data ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] 2WRS [WIP]
On Thu, Feb 21, 2008 at 6:44 AM, <[EMAIL PROTECTED]> wrote: > Hi. > > That's the last release and refers to 8.3.0 and not to 8.2.5 as before. Hope > you can tell me if I created it correctly please. > no, it doesn't... > ! /* GUC variables */ > #ifdef TRACE_SORT > booltrace_sort = false; > #endif > - #ifdef DEBUG_BOUNDED_SORT > - booloptimize_bounded_sort = true; > - #endif it's seems you're removing something added in 8.3 -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] 2WRS [WIP]
On Fri, Feb 08, 2008 at 12:27:23AM -0500, Jaime Casanova wrote: > On Feb 7, 2008 6:04 AM, Manolo _ <[EMAIL PROTECTED]> wrote: > > > > HI. > > > > I send you the diff of my code against the current CVS TIP. > > Please tell me if it's what you were asking for. > > > > not actually, because your patch removes an improvement that was > included in 8.3... > what you will have to do (if someone has a better solution feel free > to comment on this) is to manually merge your 8.2's patch into the > 8.3's source and then generate a diff s/8.3/HEAD/ -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 pgpiovHuPRYnM.pgp Description: PGP signature
Re: [PATCHES] 2WRS [WIP]
On Feb 7, 2008 6:04 AM, Manolo _ <[EMAIL PROTECTED]> wrote: > > HI. > > I send you the diff of my code against the current CVS TIP. > Please tell me if it's what you were asking for. > not actually, because your patch removes an improvement that was included in 8.3... what you will have to do (if someone has a better solution feel free to comment on this) is to manually merge your 8.2's patch into the 8.3's source and then generate a diff another sugestion is to comment a little more your code. simply put a mark where you modify something is not a comment, specially if you can get that info from a simple cvs diff -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] 2WRS [WIP]
On Mon, Feb 04, 2008 at 07:10:10PM +0100, [EMAIL PROTECTED] wrote: > Hi to all. > > I'm implementing a refinement of the External Sorting (ES) algorithm > [postgresql-8.2.5/src/backend/utils/sort/tuplesort.c] . The patch is > still WIP. Patches for new features need to be on CVS TIP, as the project does not add new features to stable releases. How do you want to be named on this? Cheers, David. -- David Fetter <[EMAIL PROTECTED]> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: [EMAIL PROTECTED] Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PATCHES] 2WRS [WIP]
Hi to all. I'm implementing a refinement of the External Sorting (ES) algorithm [postgresql-8.2.5/src/backend/utils/sort/tuplesort.c] . The patch is still WIP. Here it follows the explanation of my refinement. Please ask ask ask for any kind of related question. At the moment I'm stuck on what I'll write on the "*** HOW TO ? ***" part of this mail. At least please jump directly to that part please. "Two Way Replacement Selection" (2WRS) Recall current ES is implemented through Replacement Selection (RS) [as for run formation] and Polyphase Merge (PM) to merge [when RS terminates] all of the runs built by RS. As for the run formation phase, this technique uses 2 heaps instead of just one as in the current PostgreSQL version of RS: that's why we call it "Two Way Replacement Selection" (2WRS). What I do is: 1) fill the 'memtuples' array 2) qsort 'memtuples' and divide it into two [suppose equal] part: HeapUP and HeapDOWN, each heap builds it's own "physical run" respectively a runUP and a runDOWN 3) a new input tuple exclusively goes into one of those two heaps: the corresponding root of the destination heap is written into the physical run associated to that heap 4) when one of the heap is over we stop building the corresponding run and possibly go on executing using only the other one, in case it isn't still over. When both heaps are over, we start building the next "logical run" 5) repeat steps from 2 to 4 until the input is over. That is the refinement to the phase of "run formation". More details. *** THE HEAPS *** After dividing memory we obtain 2 heaps arranged in a "sand clock" way. http://maxotek.net/images/base/sandClock.png In a way we could say HeapUP is an "upside down" heap, while HeapDOWN is an "ordinary heap". Both heaps have their own root towards the center of 'memtuples'. They are not communicating in any way: each of them builds its own physical run. *** THE LOGICAL RUN *** It's what should be passed to the merge phase. It would be obtained by reading backward the corresponding runUP and appending the corresponding runDOWN to it. Of course we should avoid making useless "extra work" joining those two runs, we could just take in account their arrangement while merging. *** STATE OF THE ART *** Any change to the current merge algorithm [Polyphase Merge] is not taken in account at the moment. Now I'd just like to test the technique supposing the Polyphase Merge gets correctly the "logical run" in the same way it got/wrote/red the ordinary runs. Recall the runUP should be red backward while merging and that actually 2WRS produces two physical runs instead of just one. That's why, at the moment, I prefer making that above "extra work" to physically join those two part [runUP and runDOWN associated to the same logical run] to prove that we do build logical runs longer than ordinary runs built by current RS. *** TODO *** Temporarily write runUP and runDOWN in a tape different from destTape and join them into destTape when we finish building them. *** HOW TO ? *** How to do that joining those two runs? I should know the position on disk (?) where the first and the last tuple of a certain run are stored. I tried to follow all of the function calls made starting from LogicalTapeWrite() but I get lost into the various buffers used before finally writing to disk. Is there any other way to get that info? Actually I suppose BufFile structure has something to do... but there's no documentation related so I ask you please give me any suggestion. Thanks for your attention. Regards, Manolo. tuplesort.patch Description: Binary data ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings