On 17.08.2010 12:40, Robert Collins wrote:
> On Tue, Aug 17, 2010 at 10:34 PM, Abel Deuring
> <[email protected]> wrote:
>> On 17.08.2010 12:16, Robert Collins wrote:
>>> On Tue, Aug 17, 2010 at 10:12 PM, Abel Deuring
>>> <[email protected]> wrote:
>>>> On 17.08.2010 11:45, Robert Collins wrote:
>>>>> So there is some conflation/confusion here I think.
>>>>>
>>>>> *subscribing* to a ft search - +1
>>>>>
>>>>> putting a tsearch vector in the *subscription* - I'm lost why that is
>>>>> useful.
>>>>
>>>> It's not a tsearch vector but a tsquery I want to store :)
>>>>
>>>> If you have a number of subscriptions to a full text search -- how else
>>>> would you remove the not matching searches in something like
>>>>
>>>>   SELECT whatever FROM bugsubscription
>>>>   WHERE bugsubscription.bug=our_current_bug_id
>>>>   AND there_is_a_match(
>>>>       (SELECT full_text FROM bug WHERE id=our_current_bug_id),
>>>>       bugsubscription.fulltext_search_words)
>>>>
>>>> With a canned tsquery you can use an WHERE expression like
>>>>
>>>>   bugsubscription.tsquery @@ bug.fti
>>>
>>> I'd _really_ like to see a performance test of that; if it behaves
>>> like some of the ts2 stuff we may be very disappointed.
>>
>> Admittedly, I don't expect such a query to be very fast. But remember:
>> We are not talking about web requests but about a script (or a job) that
>> should generate emails not-too-long after a bug has been filed, somebody
>> had commented on a bug, after bug status changes etc. I am all for
>> using/writing efficient code, and if we get millions of subscriptions,
>> performance is indeed an issue -- but if a script runs 10 or 30 seconds
>> for a few hundred or thousand subscriptions, does not really matter, I
>> think. (The same applies, BTW, for filtering on Python level.)
>>
>> Abel
>
> Respectfully, I have to disagree.
>
> Slow processing means high consumption of resources. If it takes 30
> seconds to process a single bug subscription notifications, and we
> have more than 1 bug filed every 30 seconds: we'll need 2 concurrent
> tasks doing nothing but that.

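The capacity argument quoted above can be sketched as a small calculation. This is only a rough illustration, not a measurement: the function name workers_needed and the sample intervals are assumptions made up for this example, and the estimate ignores queueing effects and variation in processing time.

```python
import math


def workers_needed(processing_seconds: float, seconds_between_bugs: float) -> int:
    """Estimate the concurrent tasks needed to keep up with incoming bugs.

    Hypothetical helper: it divides the time spent per bug by the average
    gap between bugs and rounds up to a whole number of tasks.
    """
    if processing_seconds < 0 or seconds_between_bugs <= 0:
        raise ValueError("processing time must be >= 0 and the interval > 0")
    # Offered load: seconds of work arriving per second of wall-clock time.
    load = processing_seconds / seconds_between_bugs
    # At least one task is always running, even when the load is tiny.
    return max(1, math.ceil(load))


# 30 s of work per bug, with bugs arriving slightly faster than every 30 s.
print(workers_needed(30, 29))  # 2
# 10 s of work per bug, one bug every 30 s: a single task keeps up.
print(workers_needed(10, 30))  # 1
```

As soon as bugs arrive more often than one task can finish them, the estimate moves from one task to two, which is the point made in the quoted paragraph.
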
Agreed, a processing time on the order of dozens of seconds would be an
issue. But I nevertheless think it is worth trying to allow full text
search for bug subscriptions. Even a sequential search should not take
very long, because we check just one bug/FTI vector. I suspect that
delivering bug mail, or "spamming" the mail server with bug mail, is
much more likely to cause congestion problems.

>
> I'm of the opinion that there are extremely few places in our system
> where performance does not matter.
>
> Its ok to say 'we will start with something that will be tolerable,
> and iterate to faster' - but we have to have done *something* to
> convince ourselves that tolerable will be the starting point.

OK, so let's try to use queries with WHERE expressions containing
"bug.fti @@ subscription.tsquery".

Abel

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp

