Thanks for your explanation.

It was an excellent and interesting work

alara


2013/5/2 porker2008 <zhangxion...@gmail.com>

> Very interesting.
>
> Good Job!
>
> Wish everything working fine on Round1B
>
> On Friday, May 3, 2013 3:30:53 AM UTC+8, Bartholomew Furrow wrote:
> > All right, I think I've tracked down the bug!
> >
> >
> > First, my apologies to everyone who got an email saying he or she had
> been disqualified.  That's a nasty surprise.  I'd also like to apologize to
> the entirely different set of people who then got an email saying "Don't
> worry, you haven't been disqualified!", which was probably also quite
> confusing.  I sent an email to all contestants after that, because without
> knowing what had gone wrong, there was no other way to make sure I reached
> everyone who had gotten the erroneous email.
> >
> >
> >
> >
> > Since this was a bug that affected our users, it seems only fair that I
> tell you about it.  If you don't want to read a story about tracking down a
> difficult-to-reproduce bug, this is a good time to stop reading.
> >
> >
> >
> >
> > The backgroundCode Jam runs on App Engine.  The email-sending service is
> something like a hand-rolled MapReduce, implemented one summer by an
> excellent intern who should feel free to chime in and take credit.  It's
> supposed to work this way:
> >
> >
> > - An admin sets up a "notification" object, which gets written to the
> App Engine datastore and to memcache.  It has a time at which it will
> automatically start sending.
> > - Emails will be sent to the set of all users who were eligible to
> participate in a given contest, but has a list of "filters" that restrict
> that set, which are stored in the notification's "unprocessed_filters"
> field.  For example, a filter could require that the field "rule_breaker"
> appears in a user's scoreboard row, or it could require that user's rank be
> <= 1000.
> >
> >
> > - When that time arrives, the first thing that happens is that filters
> are "processed".  "Rank <= 1000" becomes "(Points, Time) >= (36, 45:50)".
> The other filter remains unchanged. Both are stored in the
> "processed_filters" field of the notification, and the modified
> "notification" object gets put back into memcache and datastore.
> >
> >
> > - Mappers go and fetch the "notification" object from memcache (or, if
> that fails, datastore) and send email accordingly.
> >
> >
> > One more piece of background: our library that stores objects in
> memcache+datastore uses a local cache, which is supposed to be cleared at
> the start of every request.
> >
> >
> >
> >
> > What happened
> > The library that stores objects in memcache+datastore wasn't cleared at
> the start of the Mapper jobs.  On App Engine, a series of requests
> automatically gets sent to various machines.  So if the following series of
> events takes place:
> >
> >
> > - Machine A sets up the notification, and caches it locally.
> > - Machine B does the filter processing.
> > - Machine A acts as a Mapper, without clearing the local cache.
> >
> >
> >
> >
> > ...then Machine A will look in its local cache, and find the old version
> of the notification: the one that has unprocessed filters, but no processed
> filters.  It will think, "Should I send this email to user X?  Well, I
> don't have any filters, so I should."
> >
> >
> >
> >
> > This bug was really tough to track down.  I disabled mail-sending, added
> a lot of debug logging, and then wrote a bunch of test emails, then sent
> them off one by one; it took me ten tries to get one that failed in this
> particular way.  Once I realized that the field was unset, I started
> wondering about whether App Engine's datastore or memcache could be giving
> old versions of the data; but then I remembered the local cache.  I checked
> the logs to see whether when multiple Mappers failed in the same MapReduce,
> it was always on the same machine; and it was.
> >
> >
> >
> >
> > The immediate fix
> > 1. Go through all request handlers and make sure our local cache is
> cleared properly.  Add a comment to those lines saying that they must
> appear in new handlers.  I'm fairly confident that will prevent this from
> happening again, because I expect people to do a certain amount of
> copy/pasting when they make new handlers.
> >
> >
> > 2. Add an assertion, before any filters are evaluated, to make sure
> len(notification["processed_filters"]) ==
> len(notification["unprocessed_filters"]).  Now even if another bug occurs
> that causes filters not to be propagated, this should catch it.
> >
> >
> >
> >
> > I hope this was interesting!  Expect the final emails from Round 1A
> today, and please let me know if you got the wrong one. :-)
> > - Bartholomew
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google Code Jam" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to google-code+unsubscr...@googlegroups.com.
> To post to this group, send email to google-code@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/google-code/-/qkvprNJTldUJ.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google Code Jam" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to google-code+unsubscr...@googlegroups.com.
To post to this group, send email to google-code@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to