The patch has now been applied and committed. Please let me know of any
issues should they arise.

Cheers
mick

> -----Original Message-----
> From: Kenneth Marshall [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 01, 2008 2:53 PM
> To: Mick Johnson
> Cc: [EMAIL PROTECTED]; dspam-
> [EMAIL PROTECTED]
> Subject: Re: [dspam-dev] CVS updates
> 
> Mick,
> 
> I was reading my mail from most recent to less recent. If you
> are the one who should receive patches, I have attached my
> PostgreSQL 8+ performance patch for group performance. It is
> based on the DSPAM 3.6.2 update to the driver. Please consider
> applying it for the 3.8.1 release. I would be happy to answer
> any questions and there is a longish thread in the mailing
> list archives from when I submitted it originally. Thank you
> for your consideration.
> 
> Cheers,
> Ken
> 
> On Fri, Feb 01, 2008 at 01:47:25PM -0800, Mick Johnson wrote:
> > All
> >
> > A few pending updates have been pushed up to CVS :
> >
> > * Allow users to select multiple rows by clicking
> > on the initial row, holding shift, and clicking on the final row.
> > * Adds a "select 200" button to the quarantine page.
> > * Removed some junk from a previous merge.
> >
> > The Feature Request page has also been updated - the donations button
> and
> > dollar value components have been removed as we're no longer
> accepting
> > donations for this project.
> >
> > A few of the patches came in via this interface, this actually makes
> it
> > harder to patch as a) I don't know who submitted them and can't bring
> up any
> > suggestions or corrections directly, and b) they have less visibility
> on
> > this list. In the future, if you wish to submit patches (and I'm
> always
> > happy when people do) please do so here to ensure the community can
> review.
> >
> > The latest CVS version seems to have been running stably for some
> time now
> > and I'm looking to push this out as a stable 3.8.1 this month unless
> I hear
> > otherwise.
> >
> > Finally, looking forward to a great 2008!
> >
> > Cheers
> > Mick Johnson
> > Sensory Networks
> >
> >
> -----------previous patch posting to dspam-users/dspam-dev--------
> Dear dspam-users:
> 
> I sent this patch update to the dspam-dev list, but I never
> saw it. It should be useful to others in the DSPAM community
> that are or would like to use PostgreSQL as the DB backend.
> The patch is against version 3.6.2 but should apply fairly
> easily to any of the 3.6.x series.
> 
> Ken Marshall
> 
> ----- Forwarded message from Kenneth Marshall <[EMAIL PROTECTED]> ----
> -
> 
> Date: Sun, 26 Feb 2006 16:49:16 -0600
> From: Kenneth Marshall <[EMAIL PROTECTED]>
> To: Rustam Aliyev <[EMAIL PROTECTED]>
> Cc: Jonathan Zdziarski <[EMAIL PROTECTED]>,
>       dspam-dev@lists.nuclearelephant.com,
> Subject: Re: [dspam-dev] PostgreSQL Performance Patch
> 
> Jonathan and Rustam,
> 
> We are still having problems with the delicateness of MySQL
> as a backend DB. The problem is not with DSPAM at all but that
> we are having problems scaling MySQL to the number of users and
> still be able to provide a backup DB server, in case the primary
> server has an outage. Also, the database cleaning and optimization
> shutsdown the processing while it is running when using a MYISAM
> DB. With the updated PostgreSQL driver in DSPAM and the release
> of version 8.1, we are again testing the PostgreSQL backend.
> 
> I had high expectations for the combo of DSPAM 3.6 and PostgreSQL
> 8.1, but while it easily handled much more than 3.2/3.4 and 8.0
> it still took too long to process messages. The basic token select
> took seconds to perform. I finally tracked the problem down to the
> fact that the updated 8.x query is only used if you are not using
> the merged group. Without the merged/global group the selects took
> approximately 20ms with it they took 3000ms.
> 
> I generated the attached patch which uses the same technique to
> speed the selects for both the uid and gid. With this patch, my
> test queries returned identical results, but in 22ms versus almost
> 5900ms for the unpatched code. Could you please take a look at it
> and consider it for inclusion in the next point release.
> 
> Yours truly,
> Ken Marshall
> 
> On Wed, Aug 03, 2005 at 10:42:57AM +0500, Rustam Aliyev wrote:
> > Here's the patch. Should work, tested.
> >
> > There's also alternative way (just for discussion): maybe it would be
> > better to create two different "lookup_tokens()" functions for 7.x
> and
> > 8.x versions?
> >
> >
> > Jonathan Zdziarski wrote:
> >
> > >so what do you think rustam about checking the version and adjusting
> > >the query accordingly?
> > >
> > >Jonathan
> >
> >
> 
> > --- pgsql_drv.c.jz  Wed Aug  3 10:09:18 2005
> > +++ pgsql_drv.c     Wed Aug  3 10:16:10 2005
> > @@ -498,8 +498,15 @@
> >           "FROM dspam_token_data WHERE uid IN ('%d','%d') AND token
> IN (",
> >           uid, gid);
> >    } else {
> > -    snprintf (scratch, sizeof (scratch),
> > -              "SELECT * FROM lookup_tokens(%d, '{", uid);
> > +    if (PQserverVersion(s->dbh) > 80000) {
> > +      snprintf (scratch, sizeof (scratch),
> > +                "SELECT * FROM lookup_tokens(%d, '{", uid);
> > +    } else {
> > +      snprintf (scratch, sizeof (scratch),
> > +                "SELECT uid, token, spam_hits, innocent_hits "
> > +                "FROM dspam_token_data WHERE uid = '%d' AND token IN
> (",
> > +                uid);
> > +    }
> >    }
> >
> >    buffer_cat (query, scratch);
> > @@ -520,10 +527,14 @@
> >    }
> >    ds_diction_close(ds_c);
> >
> > -  if (gid != uid)
> > +  if (PQserverVersion(s->dbh) > 80000) {
> > +    if (gid != uid)
> > +      buffer_cat (query, ")");
> > +    else
> > +      buffer_cat(query, "}')");
> > +  } else {
> >      buffer_cat (query, ")");
> > -  else
> > -    buffer_cat(query, "}')");
> > +  }
> >
> >  #ifdef VERBOSE
> >    LOGDEBUG ("pgsql query length: %ld\n", query->used);
> 
> 
> --- pgsql_drv.c_362   2006-02-26 16:25:29.266084245 -0600
> +++ pgsql_drv.c       2006-02-24 14:17:24.717817917 -0600
> @@ -490,10 +490,15 @@
>    }
> 
>    if (gid != uid) {
> -    snprintf (scratch, sizeof (scratch),
> -              "SELECT uid, token, spam_hits, innocent_hits "
> -              "FROM dspam_token_data WHERE uid IN ('%d','%d') AND
> token IN (",
> -              uid, gid);
> +    if (s->pg_major_ver >= 8) {
> +      snprintf (scratch, sizeof (scratch),
> +                "SELECT * FROM lookup_tokens(%d, %d, '{", uid, gid);
> +    } else {
> +      snprintf (scratch, sizeof (scratch),
> +                "SELECT uid, token, spam_hits, innocent_hits "
> +                "FROM dspam_token_data WHERE uid IN ('%d','%d') AND
> token IN (",
> +                uid, gid);
> +    }
>    } else {
>      if (s->pg_major_ver >= 8) {
>        snprintf (scratch, sizeof (scratch),
> @@ -525,10 +530,7 @@
>    ds_diction_close(ds_c);
> 
>    if (s->pg_major_ver >= 8) {
> -    if (gid != uid)
> -      buffer_cat (query, ")");
> -    else
> -      buffer_cat(query, "}')");
> +    buffer_cat(query, "}')");
>    } else {
>      buffer_cat (query, ")");
>    }
> 
> --- pgsql_objects.sql_362     2006-02-26 16:25:52.636539923 -0600
> +++ pgsql_objects.sql 2006-02-24 12:56:51.523174334 -0600
> @@ -74,3 +74,29 @@
>    return;
>  end;';
> 
> +create function lookup_tokens(integer,integer,bigint[])
> +  returns setof dspam_token_data
> +  language plpgsql stable
> +  as '
> +declare
> +  v_rec record;
> +begin
> +  for v_rec in select * from dspam_token_data
> +                where uid=$1
> +                  and token in (select $3[i]
> +                                  from
> generate_series(array_lower($3,1),
> +
> array_upper($3,1)) s(i))
> +  loop
> +    return next v_rec;
> +  end loop;
> +  for v_rec in select * from dspam_token_data
> +                where uid=$2
> +                  and token in (select $3[i]
> +                                  from
> generate_series(array_lower($3,1),
> +
> array_upper($3,1)) s(i))
> +  loop
> +    return next v_rec;
> +  end loop;
> +  return;
> +end;';
> +
> 
> 
> ----- End forwarded message -----


Reply via email to