The patch has now been applied and committed. Please let me know of any issues should they arise.
Cheers mick > -----Original Message----- > From: Kenneth Marshall [mailto:[EMAIL PROTECTED] > Sent: Friday, February 01, 2008 2:53 PM > To: Mick Johnson > Cc: [EMAIL PROTECTED]; dspam- > [EMAIL PROTECTED] > Subject: Re: [dspam-dev] CVS updates > > Mick, > > I was reading my mail from most recent to less recent. If you > are the one who should receive patches, I have attached my > PostgreSQL 8+ performance patch for group performance. It is > based on the DSPAM 3.6.2 update to the driver. Please consider > applying it for the 3.8.1 release. I would be happy to answer > any questions and there is a longish thread in the mailing > list archives from when I submitted it originally. Thank you > for your consideration. > > Cheers, > Ken > > On Fri, Feb 01, 2008 at 01:47:25PM -0800, Mick Johnson wrote: > > All > > > > A few pending updates have been pushed up to CVS : > > > > * Allow users to select multiple rows by clicking > > on the initial row, holding shift, and clicking on the final row. > > * Adds a "select 200" button to the quarantine page. > > * Removed some junk from a previous merge. > > > > The Feature Request page has also been updated - the donations button > and > > dollar value components have been removed as we're no longer > accepting > > donations for this project. > > > > A few of the patches came in via this interface, this actually makes > it > > harder to patch as a) I don't know who submitted them and can't bring > up any > > suggestions or corrections directly, and b) they have less visibility > on > > this list. In the future, if you wish to submit patches (and I'm > always > > happy when people do) please do so here to ensure the community can > review. > > > > The latest CVS version seems to have been running stably for some > time now > > and I'm looking to push this out as a stable 3.8.1 this month unless > I hear > > otherwise. > > > > Finally, looking forward to a great 2008! > > > > Cheers > > Mick Johnson > > Sensory Networks > > > > > -----------previous patch posting to dspam-users/dspam-dev-------- > Dear dspam-users: > > I sent this patch update to the dspam-dev list, but I never > saw it. It should be useful to others in the DSPAM community > that are or would like to use PostgreSQL as the DB backend. > The patch is against version 3.6.2 but should apply fairly > easily to any of the 3.6.x series. > > Ken Marshall > > ----- Forwarded message from Kenneth Marshall <[EMAIL PROTECTED]> ---- > - > > Date: Sun, 26 Feb 2006 16:49:16 -0600 > From: Kenneth Marshall <[EMAIL PROTECTED]> > To: Rustam Aliyev <[EMAIL PROTECTED]> > Cc: Jonathan Zdziarski <[EMAIL PROTECTED]>, > dspam-dev@lists.nuclearelephant.com, > Subject: Re: [dspam-dev] PostgreSQL Performance Patch > > Jonathan and Rustam, > > We are still having problems with the delicateness of MySQL > as a backend DB. The problem is not with DSPAM at all but that > we are having problems scaling MySQL to the number of users and > still be able to provide a backup DB server, in case the primary > server has an outage. Also, the database cleaning and optimization > shutsdown the processing while it is running when using a MYISAM > DB. With the updated PostgreSQL driver in DSPAM and the release > of version 8.1, we are again testing the PostgreSQL backend. > > I had high expectations for the combo of DSPAM 3.6 and PostgreSQL > 8.1, but while it easily handled much more than 3.2/3.4 and 8.0 > it still took too long to process messages. The basic token select > took seconds to perform. I finally tracked the problem down to the > fact that the updated 8.x query is only used if you are not using > the merged group. Without the merged/global group the selects took > approximately 20ms with it they took 3000ms. > > I generated the attached patch which uses the same technique to > speed the selects for both the uid and gid. With this patch, my > test queries returned identical results, but in 22ms versus almost > 5900ms for the unpatched code. Could you please take a look at it > and consider it for inclusion in the next point release. > > Yours truly, > Ken Marshall > > On Wed, Aug 03, 2005 at 10:42:57AM +0500, Rustam Aliyev wrote: > > Here's the patch. Should work, tested. > > > > There's also alternative way (just for discussion): maybe it would be > > better to create two different "lookup_tokens()" functions for 7.x > and > > 8.x versions? > > > > > > Jonathan Zdziarski wrote: > > > > >so what do you think rustam about checking the version and adjusting > > >the query accordingly? > > > > > >Jonathan > > > > > > > --- pgsql_drv.c.jz Wed Aug 3 10:09:18 2005 > > +++ pgsql_drv.c Wed Aug 3 10:16:10 2005 > > @@ -498,8 +498,15 @@ > > "FROM dspam_token_data WHERE uid IN ('%d','%d') AND token > IN (", > > uid, gid); > > } else { > > - snprintf (scratch, sizeof (scratch), > > - "SELECT * FROM lookup_tokens(%d, '{", uid); > > + if (PQserverVersion(s->dbh) > 80000) { > > + snprintf (scratch, sizeof (scratch), > > + "SELECT * FROM lookup_tokens(%d, '{", uid); > > + } else { > > + snprintf (scratch, sizeof (scratch), > > + "SELECT uid, token, spam_hits, innocent_hits " > > + "FROM dspam_token_data WHERE uid = '%d' AND token IN > (", > > + uid); > > + } > > } > > > > buffer_cat (query, scratch); > > @@ -520,10 +527,14 @@ > > } > > ds_diction_close(ds_c); > > > > - if (gid != uid) > > + if (PQserverVersion(s->dbh) > 80000) { > > + if (gid != uid) > > + buffer_cat (query, ")"); > > + else > > + buffer_cat(query, "}')"); > > + } else { > > buffer_cat (query, ")"); > > - else > > - buffer_cat(query, "}')"); > > + } > > > > #ifdef VERBOSE > > LOGDEBUG ("pgsql query length: %ld\n", query->used); > > > --- pgsql_drv.c_362 2006-02-26 16:25:29.266084245 -0600 > +++ pgsql_drv.c 2006-02-24 14:17:24.717817917 -0600 > @@ -490,10 +490,15 @@ > } > > if (gid != uid) { > - snprintf (scratch, sizeof (scratch), > - "SELECT uid, token, spam_hits, innocent_hits " > - "FROM dspam_token_data WHERE uid IN ('%d','%d') AND > token IN (", > - uid, gid); > + if (s->pg_major_ver >= 8) { > + snprintf (scratch, sizeof (scratch), > + "SELECT * FROM lookup_tokens(%d, %d, '{", uid, gid); > + } else { > + snprintf (scratch, sizeof (scratch), > + "SELECT uid, token, spam_hits, innocent_hits " > + "FROM dspam_token_data WHERE uid IN ('%d','%d') AND > token IN (", > + uid, gid); > + } > } else { > if (s->pg_major_ver >= 8) { > snprintf (scratch, sizeof (scratch), > @@ -525,10 +530,7 @@ > ds_diction_close(ds_c); > > if (s->pg_major_ver >= 8) { > - if (gid != uid) > - buffer_cat (query, ")"); > - else > - buffer_cat(query, "}')"); > + buffer_cat(query, "}')"); > } else { > buffer_cat (query, ")"); > } > > --- pgsql_objects.sql_362 2006-02-26 16:25:52.636539923 -0600 > +++ pgsql_objects.sql 2006-02-24 12:56:51.523174334 -0600 > @@ -74,3 +74,29 @@ > return; > end;'; > > +create function lookup_tokens(integer,integer,bigint[]) > + returns setof dspam_token_data > + language plpgsql stable > + as ' > +declare > + v_rec record; > +begin > + for v_rec in select * from dspam_token_data > + where uid=$1 > + and token in (select $3[i] > + from > generate_series(array_lower($3,1), > + > array_upper($3,1)) s(i)) > + loop > + return next v_rec; > + end loop; > + for v_rec in select * from dspam_token_data > + where uid=$2 > + and token in (select $3[i] > + from > generate_series(array_lower($3,1), > + > array_upper($3,1)) s(i)) > + loop > + return next v_rec; > + end loop; > + return; > +end;'; > + > > > ----- End forwarded message -----