> On 31 May 2017, at 15:59, Vincent Massol <[email protected]> wrote:
> 
> Hi Guillaume,
> 
>> On 31 May 2017, at 12:15, Guillaume Delhumeau 
>> <[email protected]> wrote:
>> 
>> Help me to decide!
>> 
>> TL;DR:
>> 
>> * I need to know if performing a query on the database for each user who
>> wants to receive an email with all the notifications is a scalability issue
>> (in a job context).
> 
> Yes, whenever we do a lot of queries to the DB it’s a scalability issue. If we 
> have 100K users then it’s 100K queries, which is definitely a scalability issue.
> 
> We need to find a way to do a single query (or a small fixed number of 
> queries independent of the # of users). 
> 
> If not possible then we may need to either:
> A) Add some new table in our DB to help do that
> B) Use some tool other than the DB, e.g. SOLR, etc

PS: I forgot to mention that I hadn’t read the full message part before 
answering :)

Thanks
-Vincent

> 
> Thanks
> -Vincent
> 
>> * If it's not an issue, I can implement the "naïve" solution which requires
>> less development.
>> 
>> Full message:
>> 
>> Status:
>> * notifications are displayed on the top menu when you browse the wiki.
>> * notifications are displayed differently for each individual user
>> according to their preferences (filters on event type, on locations,
>> etc...).
>> * similar notifications are grouped together into "composite notifications".
>> * there are only a few notifications displayed (5 by default).
>> 
>> Objective:
>> * send an email periodically (every hour, every day, every week) according
>> to the user preferences with ALL events that happened during the last
>> period of time, but still according to the user preferences.
>> 
>> Inspiration:
>> * the watchlist gets ALL events that happened during the last period of time
>> * then, for each user, remove the events which the user is not interested in
>> * Benefit: only one query to get the events from the database for all users
>> 
>> Problems:
>> * in the notifications, I have introduced a NotificationFilter role that
>> makes it possible to inject some SQL in the query to get the events according
>> to the user preferences. I call this "pre-filters".
>> ** it means we generate a unique request for each individual user, so if we
>> send a mail to 1000 users, we will have 1000 requests to the database.
>> 
>> I wonder if it's a non-problem or a big scalability issue. Because even if
>> the whole job that sends emails takes ~10 minutes, it does not matter. It's
>> not a realtime thing.
>> 
>> For the record, NotificationFilter has "post-filters" too, which perform
>> checks on the event itself (for example checking the permissions, etc...).
>> 
>> Alternatives:
>> * just like the watchlist, perform a very generic query on the database to
>> get all the events that happened during the last period of time
>> * then for each user, use only the "post-filters" to remove events the user
>> doesn't care about
>> 
>> Problem:
>> * it means the pre-filters that make sense in the notification use-case
>> cannot be used for emails. Developers must be aware of this.
>> * it requires some refactoring of the code that groups similar notifications.
>> 
>> Question:
>> Should I go with the "naive" solution, i.e. for each user get all
>> notifications and send a mail, or should I go with the "only 1 query to the
>> database for all users" version?
>> 
>> Thanks,
>> 
>> -- 
>> Guillaume Delhumeau ([email protected])
>> Research & Development Engineer at XWiki SAS
>> Committer on the XWiki.org project
> 
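
For illustration, the watchlist-style alternative discussed in this thread (one generic query for the period, then per-user post-filters applied in memory) might be sketched as follows. This is only a minimal sketch under stated assumptions: `DigestSketch`, `Event` and `digestsFor` are hypothetical names, not XWiki APIs, and the post-filters are modelled as plain `java.util.function.Predicate` instances rather than the real NotificationFilter role. The hard-coded list in `main` stands in for the result of the single database query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class DigestSketch {

    /** Hypothetical event shape; the real event class is not shown in this thread. */
    public static final class Event {
        public final String type;
        public final String location;

        public Event(String type, String location) {
            this.type = type;
            this.location = location;
        }
    }

    /**
     * Applies each user's post-filter in memory to a list of events that was
     * loaded once for the whole period, so the number of database queries
     * does not depend on the number of users.
     */
    public static Map<String, List<Event>> digestsFor(
            List<Event> periodEvents, Map<String, Predicate<Event>> postFilterByUser) {
        Map<String, List<Event>> digests = new LinkedHashMap<>();
        for (Map.Entry<String, Predicate<Event>> entry : postFilterByUser.entrySet()) {
            List<Event> kept = new ArrayList<>();
            for (Event event : periodEvents) {
                if (entry.getValue().test(event)) {
                    kept.add(event);
                }
            }
            // Users with nothing to report get no entry, hence no mail.
            if (!kept.isEmpty()) {
                digests.put(entry.getKey(), kept);
            }
        }
        return digests;
    }

    public static void main(String[] args) {
        // Stands in for the single generic query over the last period.
        List<Event> periodEvents = Arrays.asList(
            new Event("update", "Main.WebHome"),
            new Event("comment", "Sandbox.Test"));

        // Assumed per-user preferences, expressed as in-memory predicates.
        Map<String, Predicate<Event>> filters = new LinkedHashMap<>();
        filters.put("alice", e -> e.type.equals("update"));
        filters.put("bob", e -> e.location.startsWith("Sandbox."));
        filters.put("carol", e -> false);

        Map<String, List<Event>> digests = digestsFor(periodEvents, filters);
        for (Map.Entry<String, List<Event>> entry : digests.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size() + " event(s)");
        }
    }
}
```

With this shape the per-user cost moves from one database query to one pass over an in-memory list. The trade-off is the one raised in the thread: SQL-based pre-filters cannot be reused here, so every preference has to be expressible as a post-filter.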
