Re: [google-appengine] Fan-in with materialized views: A sketch

2010-10-13 Thread Carles Gonzalez
Neat! I'm going to take a look at this code; hopefully I'll understand some of it :)
On Wednesday, October 13, 2010, Robert Kluin wrote:
> Hey Dmitry,
>    In case it might help, I pushed some code to bitbucket.  At the
> moment I would (personally) say the code is not too pretty, but it
> works well.  :)
>       http://bitbucket.org/thebobert/slagg
>
>   Sorry it does not really have good documentation at the moment, but
> I think the basic example I threw together will give you a good idea
> of how to use it.  I need to do another cleanup pass over the API to
> make a few more refinements.
>
>    I pulled this code out of one of my apps, and tried to quickly
> refactor it to be a bit more generic.  We are currently using
> basically the same code in three apps to do some really complex
> calculations.  As soon as I get time I will get an example up showing
> how to use it for neat stuff, like overall, yearly, monthly, and daily
> aggregates across multiple values (like total dollars and quantity).
> The cool thing is that you can do all of those aggregations across
> various groupings, like customer, company, contact, and sales-person,
> at once.  I'll get that code pushed out in the next few days.
>
>   Would love to get some feedback on it.
>
>
> Robert
>
>
>
>
>
> On Tue, Oct 12, 2010 at 17:26, Dmitry  wrote:
>> Ben, thanks for your code! I'm trying to understand all this stuff
>> too...
>> Robert, any success with your "library"? Maybe you've already done
>> all the stuff we are trying to implement...
>>
>> P.S. Where is Brett S.? :) I would like to hear his comments on this.
>>
>> On Sep 21, 1:49 pm, Ben  wrote:
>>> Thanks for your insights. I would love feedback on this implementation
>>> (Brett S. suggested we send in our code for this):
>>> http://pastebin.com/3pUhFdk8
>>>
>>> This implementation is for just one materialized view row at a time
>>> (e.g. a simple counter, no presence markers). Hopefully putting an ETA
>>> on the transactional task will relieve the write pressure, since
>>> usually it should be an old update with an out-of-date sequence number
>>> and be discarded (the update having already been completed in batch by
>>> the fork-join-queue).
>>>
>>> I'd love to generalize this to do more than one materialized view row
>>> but thought I'd get feedback first.
>>>
>>> Thanks,
>>> Ben
>>>
>>> On Sep 17, 7:30 am, Robert Kluin  wrote:
>>>
>>> > Responses inline.
>>>
>>> > On Thu, Sep 16, 2010 at 17:32, Ben  wrote:
>>> > > I have a question about Brett Slatkin's talk at I/O 2010 on data
>>> > > pipelines. The question is about slide #67 of his pdf, corresponding
>>> > > to minute 51:30 of his talk
>>> > > http://code.google.com/events/io/2010/sessions/high-throughput-data-p...
>>>
>>> > > I am wondering what is supposed to happen in the transactional task
>>> > > (bullet point 2c). Would these updates to the materialized view cause
>>> > > you to write too frequently to the entity group containing the
>>> > > materialized view?
>>>
>>> > I think there are really two different approaches you can use to
>>> > insert your work models.
>>> > 1)  The work models get added to the original entity's group.  So,
>>> > inside of the original transaction you do not write to the entity
>>> > group containing the materialized view -- so no contention on it.
>>> > Commit the transaction and proceed to step 3.
>>> > 2)  You kick off a transactional task to insert the work model, or
>>> > fan-out more tasks to create work models  :).   Then you proceed to
>>> > step 3.
>>>
>>> > You can use method 1 if you have only a few aggregates.  If you have
>>> > more aggregates use the second method.  I have a "library" I am almost
>>> > ready to open source that makes method 2 really easy, so you can have
>>> > lots of aggregates.  I'll post to this group when I release it.
>>>
>>> > > And a related question, what happens if there is a failure just after
>>> > > the transaction in bullet #2, but right before the named task gets
>>> > > inserted in bullet #3. In my current implementation I just left out
>>> > > the transactional task (bullet point 2c) but I think that causes me to
>>> > > lose the eventual consistency.
>>>
>>> > Failure between steps 2 and 3 just means _that_ particular update will
>>> > not try to kick off, i.e., insert, the fan-in (aggregation) task.  But it
>>> > might have already been inserted by the previous update, or it may be
>>> > inserted by the next update.  However, if nothing else kicks off the
>>> > fan-in task, you will need some periodic "cleanup" method to catch the
>>> > update and kick off the fan-in task.  Depending on exactly how you
>>> > implemented step 2, you may not need a transactional task.
>>>
>>> > Robert
>>>
>>> > > Thanks!
>>
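
[Editor's sketch] The flow discussed in the quoted thread (write a work
model, try to insert a named fan-in task, fall back to a periodic cleanup
if that insert never happened) can be illustrated with a small in-memory
simulation. This is not App Engine code and not the code from slagg or
the pastebin link; FakeQueue, FanIn, the "fan-in-%d" task names, and the
batch counter are all hypothetical names chosen for illustration. It
assumes a queue that rejects duplicate task names, which is what lets
many updates share one aggregation pass.

```python
class TaskAlreadyExists(Exception):
    """Raised when a task with the same name was already enqueued."""


class FakeQueue:
    """Hypothetical stand-in for a task queue that rejects duplicate names."""

    def __init__(self):
        self.names = set()   # every name ever used; names are not reusable
        self.pending = []    # task names waiting to run

    def add(self, name):
        if name in self.names:
            raise TaskAlreadyExists(name)
        self.names.add(name)
        self.pending.append(name)


class FanIn:
    """In-memory model of work models feeding one materialized view."""

    def __init__(self):
        self.queue = FakeQueue()
        self.work = []        # work models: (batch, key, delta)
        self.view = {}        # materialized view: key -> running total
        self.view_writes = 0  # how many times a view row was written
        self.batch = 0        # index of the batch currently accepting work

    def record_update(self, key, delta, crash_before_enqueue=False):
        # Step 2: store the work model; the view itself is not touched here.
        self.work.append((self.batch, key, delta))
        if crash_before_enqueue:
            return  # simulates a failure between steps 2 and 3
        # Step 3: try to insert the named fan-in task for this batch.
        self.kick_off(self.batch)

    def kick_off(self, batch):
        try:
            self.queue.add("fan-in-%d" % batch)
        except TaskAlreadyExists:
            pass  # another update already scheduled this batch

    def run_pending(self):
        while self.queue.pending:
            name = self.queue.pending.pop(0)
            batch = int(name.rsplit("-", 1)[1])
            if batch == self.batch:
                self.batch += 1  # later updates go into a fresh batch
            mine = [w for w in self.work if w[0] == batch]
            self.work = [w for w in self.work if w[0] != batch]
            totals = {}
            for _, key, delta in mine:
                totals[key] = totals.get(key, 0) + delta
            # One write per view row per batch, however many updates arrived.
            for key, total in totals.items():
                self.view[key] = self.view.get(key, 0) + total
                self.view_writes += 1

    def cleanup(self):
        # Periodic sweep: schedule fan-in for any batch with leftover work.
        for batch in sorted({w[0] for w in self.work}):
            self.kick_off(batch)
```

In this sketch, an update that "crashes" before step 3 is still picked up
if another update in the same batch inserted the task; only when no other
update did so is the cleanup() sweep needed.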


Re: [google-appengine] Fan-in with materialized views: A sketch

2010-10-14 Thread Carles Gonzalez
Robert, I took a brief look at your code and it seems very cool. It's exactly
what I was looking for for my report generation and such.

I'm looking forward to more examples, but it seems like a very valuable
addition to our toolbox.

Thanks a lot!
