> On 31 May 2017, at 15:59, Vincent Massol <[email protected]> wrote:
>
> Hi Guillaume,
>
>> On 31 May 2017, at 12:15, Guillaume Delhumeau
>> <[email protected]> wrote:
>>
>> Help me to decide!
>>
>> TL;DR:
>>
>> * I need to know if performing a query on the database for each user who
>> wants to receive an email with all the notifications is a scalability issue
>> (in a job context).
>
> Yes, whenever we do a lot of queries to the DB it’s a scalability issue. If we
> have 100K users then it’s 100K queries, so definitely a scalability issue.
>
> We need to find a way to do a single query (or a small fixed number of
> queries independent of the # of users).
>
> If not possible then we may need to either:
> A) Add some new table in our DB to help do that
> B) Use some tool other than the DB, e.g. SOLR, etc
PS: I forgot to mention that I haven’t read the full message part yet before answering :)

Thanks
-Vincent

>
> Thanks
> -Vincent
>
>> * If it's not an issue, I can implement the "naïve" solution which requires
>> less development.
>>
>> Full message:
>>
>> Status:
>> * notifications are displayed on the top menu when you browse the wiki.
>> * notifications are displayed differently for each individual user
>> according to their preferences (filters on event type, on locations,
>> etc...).
>> * similar notifications are grouped together into "composite notifications".
>> * there are only a few notifications displayed (5 by default).
>>
>> Objective:
>> * send an email periodically (every hour, every day, every week) according
>> to the user preferences with ALL events that happened during the last
>> period of time, but still according to the user preferences.
>>
>> Inspiration:
>> * the watchlist gets ALL events that happened during the last period of time
>> * then, for each user, remove the events which the user is not interested in
>> * Benefit: only one query to get the events from the database for all users
>>
>> Problems:
>> * in the notifications, I have introduced a NotificationFilter role that
>> makes it possible to inject some SQL in the query to get the events according
>> to the user preferences. I call this "pre-filters".
>> ** it means we generate a unique request for each individual user, so if we
>> send a mail to 1000 users, we will have 1000 requests to the database.
>>
>> I wonder if it's a non-problem or a big scalability issue. Because even if
>> the whole job that sends emails takes ~10 minutes, it does not matter. It's
>> not a realtime thing.
>>
>> For the record, NotificationFilter has "post-filters" too, which perform
>> checks on the event itself (for example checking the permissions, etc...).
>>
>> Alternatives:
>> * just like the watchlist, perform a very generic query on the database to
>> get all the events that happened during the last period of time
>> * then for each user, use only the "post-filters" to remove events the user
>> doesn't care about
>>
>> Problem:
>> * it means the pre-filters that make sense in the notification use-case
>> cannot be used for emails. Developers must be aware of this.
>> * it requires some refactoring of the code that groups similar notifications.
>>
>> Question:
>> Should I go with the "naive" solution, i.e. for each user get all
>> notifications and send a mail, or should I go with the "only 1 query to the
>> database for all users" version?
>>
>> Thanks,
>>
>> --
>> Guillaume Delhumeau ([email protected])
>> Research & Development Engineer at XWiki SAS
>> Committer on the XWiki.org project
>
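
To make the two options in the question above concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not XWiki code: the `events` table, the `PREFS` mapping and both function names are assumptions made up for illustration, and the only "filter" modelled is a per-user set of event types. The first function injects the user's preferences into the SQL ("pre-filter", one query per user); the second runs one generic query and filters in memory ("post-filter", one query in total).

```python
import sqlite3

# Hypothetical schema and data, for illustration only (not the XWiki event store).
def make_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, type TEXT, space TEXT, ts INTEGER)"
    )
    db.executemany(
        "INSERT INTO events (type, space, ts) VALUES (?, ?, ?)",
        [
            ("update", "Main", 10),
            ("create", "Blog", 20),
            ("update", "Blog", 30),
            ("delete", "Main", 40),
        ],
    )
    return db

# Hypothetical preferences: user -> event types they want in their digest.
PREFS = {
    "alice": {"update"},
    "bob": {"create", "delete"},
    "carol": {"update", "delete"},
}

def digests_per_user_query(db, prefs, since):
    """'Naive' option: one SQL query per user, preferences injected as a pre-filter."""
    digests, query_count = {}, 0
    for user, types in prefs.items():
        placeholders = ",".join("?" for _ in types)
        rows = db.execute(
            f"SELECT id FROM events WHERE ts > ? AND type IN ({placeholders}) ORDER BY id",
            (since, *sorted(types)),
        ).fetchall()
        query_count += 1
        digests[user] = [row[0] for row in rows]
    return digests, query_count

def digests_single_query(db, prefs, since):
    """Watchlist-style option: one generic query, then a post-filter per user in memory."""
    rows = db.execute(
        "SELECT id, type FROM events WHERE ts > ? ORDER BY id", (since,)
    ).fetchall()
    digests = {
        user: [event_id for event_id, event_type in rows if event_type in types]
        for user, types in prefs.items()
    }
    return digests, 1
```

Both functions return the same digests for the same input. What differs is the number of round trips to the database: it grows with the number of users in the first case and stays at one in the second, at the cost of loading every event of the period into memory and giving up SQL-level pre-filters.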

