I like the concept of MapReduce. However, I think it might be easier
to borrow a page from Apple's Grand Central Dispatch, released
in Snow Leopard. The hardest part would be implementing a usable tool /
framework in Java that many developers could leverage and understand,
especially since, in my experience, most developers do not write
thread-safe code by default, which is a fundamental tenet of both GCD
and MapReduce.
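
To make the thread-safety point concrete, here is a minimal sketch in
plain Java SE of the "split the job, dispatch the pieces, reassemble"
pattern that GCD and MapReduce both rely on. The class name ChunkedSum
and its parameters are made up for illustration, and the sketch uses
java.util.concurrent rather than any App Engine API; whether a thread
pool like this can be used inside GAE is not something it addresses.
Each task reads only its own slice and returns a private partial
result, so no locks or shared mutable state are needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: sums an array by dispatching independent chunks to a worker pool. */
final class ChunkedSum {

    private ChunkedSum() {}

    static long sum(final int[] values, int chunkSize, int workers)
            throws InterruptedException, ExecutionException {
        if (chunkSize <= 0 || workers <= 0) {
            throw new IllegalArgumentException("chunkSize and workers must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> partials = new ArrayList<Future<Long>>();
            int start = 0;
            while (start < values.length) {
                final int from = start;
                final int to = from + Math.min(chunkSize, values.length - from);
                // "Map" step: the task only reads values[from..to) and keeps
                // its running total in a local variable, so it is thread-safe.
                partials.add(pool.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        long partial = 0;
                        for (int i = from; i < to; i++) {
                            partial += values[i];
                        }
                        return partial;
                    }
                }));
                start = to;
            }
            // "Reduce" step: combine the partial results on the calling thread.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ChunkedSum.sum(data, 1000, 4) would split data into chunks of
1000 elements across four workers. The part a framework would have to
enforce is the discipline inside call(): if tasks wrote to a shared
counter instead of returning a value, the code would no longer be safe
without synchronization.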

Tim

On Nov 13, 4:55 pm, "Ikai L (Google)" <ika...@google.com> wrote:
> Thanks for the feedback, Tim. It sounds to me like what you are looking for
> is MapReduce support. There's a feature request in our issue tracker for this:
>
> http://code.google.com/p/googleappengine/issues/detail?id=112
>
> Map/Reduce would be a great fit for our model since the work could be
> transparently distributed among your application instances. App Engine
> definitely favors the approach you describe of breaking a big job into
> smaller pieces and reassembling the data, but currently this is up to the
> developer to manage and build.
>
> On Thu, Nov 12, 2009 at 8:26 AM, tsp...@tangiblesoftware.com
> <tsp...@tangiblesoftware.com> wrote:
> > Ikai,
> >        This is not really a relational data question. It is a summary data
> > question. As a brief overview, here is how my approach to summary
> > information has evolved over the past 20 years:
>
> >        1. Calculate the summary information on the fly per user request.
> > Very database intensive and potentially slow performance for the user.
> >        2. Create summary data tables which the application can read very
> > quickly, use database triggers to create/update the summary values.
> > Improved user experience, but has a penalty at write time and requires
> > developers to know two tools (database triggers and application
> > language).
> >        3.  Same approach as number 2, but create/update the summary values
> > in the application code. Reduces maintenance headaches by having a
> > single tool, makes the write performance a little worse because now
> > the transaction spans computers/servers. Since servers are cheap and
> > developers are not, this became the preferred approach.
> >        4. Avoid the possible create/search of steps two and three and assume
> > a summary record exists at time of write. Increases performance by
> > eliminating the check for a summary record at each write. The downside:
> > it needs an asynchronous process to pre-create all possible summary
> > records and prune ones that were never used after a reasonable time.
>
> > Depending on the requirements, I prefer the first or fourth choice
> > (the read-to-write ratio is mostly what matters). However, it is hard to
> > create a long-running process via the existing toolset and constraints
> > provided by GAE. Because of this, I was falling back to the third
> > option, which was the basis for my original question. (I am looking
> > into trying to break the process into many 30 seconds or less tasks,
> > but it is not looking like a practical solution yet. This is another
> > reason we need to get support for long running batch processes within
> > GAE.)
>
> > Tim
>
> > On Nov 10, 5:44 pm, "Ikai L (Google)" <ika...@google.com> wrote:
> > > Tim,
>
> > > It really depends on what you're doing. One of the challenges of developing
> > > on a distributed store like the App Engine data store is adjusting the way
> > > you approach persistence for objects. For instance, suppose you store
> > > favorite colors per application user. The canonical way of solving this
> > > problem in a relational environment is to normalize the color data and
> > > create a lock around inserting each individual new color. In App Engine's
> > > environment, we would likely recommend that you take advantage of data store
> > > list properties as a much more performant alternative to data normalization:
> > > App Engine will handle all the indexing for you.
>
> > > If you are working with objects in parent/child relationships and require
> > > transactional integrity, you should take a look at our documentation
> > > describing Entities and Entity Groups:
> > > http://code.google.com/appengine/docs/java/datastore/transactions.html.
>
> > > On Fri, Nov 6, 2009 at 12:12 PM, tsp...@tangiblesoftware.com
> > > <tsp...@tangiblesoftware.com> wrote:
>
> > > > Guys,
> > > >    In a normal relational database, I am used to using a combination
> > > > of singletons (single application server), synchronized objects in a
> > > > dedicated thread (single application server) or table locks (multiple
> > > > application servers) to manage the creation of summary data records
> > > > which could be created by multiple simultaneous requests.
> > > >    In GAE, none of these methods seems to be supported; what would be
> > > > the suggested method?
>
> > > >    I am using the JPA method of accessing the data store.
>
> > > > Thanks,
>
> > > > Tim
>
> > > --
> > > Ikai Lan
> > > Developer Programs Engineer, Google App Engine
>
> > --
>
> > You received this message because you are subscribed to the Google Groups
> > "Google App Engine for Java" group.
> > To post to this group, send email to
> > google-appengine-j...@googlegroups.com.
> > To unsubscribe from this group, send email to
> > google-appengine-java+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine-java?hl=.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
