Re: [appengine-java] Re: Why should app startup times be a problem.

2010-03-31 Thread Blake Caldwell
Yeah, that blog post was awesome. I started moving to Twig last night.
I understand and appreciate the concerns of others that frameworks are
our friends, and that it's not unreasonable to use them, but at the
same time, CPU and network optimization are even more important, so
long as the barrier to achieve them isn't too high.

From the few hours I've spent with Twig, it seems easier, not harder,
than JDO for everything I'm doing. If you want to use JPA or JDO
because you're already familiar with them, that's great and all, but
the mapping to BigTable isn't that straightforward, so cramming the
framework on top might not be the answer.

Create a data store tier, wrap all Twig/Objectify logic in there, and
while your app won't be as portable as if you were using JDO, it'll
perform better, use fewer resources, and not be terribly difficult to
port later on, if you have to.

My $0.02


---
Sent from my iPad

On Mar 31, 2010, at 6:23 AM, SRF srfar...@gmail.com wrote:

 David, Ikai:

 Thanks very much for those blog posts on reducing the cold start
 time.  I think 1 to 2 seconds is reasonable.  I'm definitely going to
 take a closer look at Objectify.

 ++Steve

 On Mar 30, 4:54 pm, Ikai L (Google) ika...@google.com wrote:
 David, that post mirrors many of the points made here:

 http://www.answercow.com/2010/03/google-app-engine-cold-start-guide-
 f...

 There's one or two more tips on that page.

 On Tue, Mar 30, 2010 at 12:47 PM, David Chandler
 turboman...@gmail.com wrote:

 In the meantime, here are some ideas for reducing startup times by
 shrinking our apps. I went from 8.1s to 2.5s mainly by eliminating
 Guice, and I would expect similar results with Spring. I can
 definitely live with 2.5s...

 http://turbomanage.wordpress.com/2010/03/26/appengine-cold-starts-con
 ...

 /dmc

 On Mar 30, 3:04 pm, Baz b...@thinkloop.com wrote:
 Great information, Ikai.

 I really feel that instances should be completely avoided in concept
 and language on the GAE. What if the feature was simply an
 enable/disable deal called Warm Scale. If it were enabled, then your
 *next* instance would always be warm, regardless of how many instances
 you already had. This would be most noticeable and suitable for low
 QPS production apps that are constantly going from 0 to 1 instances
 (as you mentioned), but it could still be important for others, say,
 for a super-high-profile site, or a situation where your QPS is right
 at the threshold of instances and oscillating back and forth between
 two instances. Whatever the situation, if the solution were
 generalized like that, and most importantly not tied to a SPECIFIC
 NUMBER of instances, it would be up to the user to decide how
 important it was for them and whether to enable it.

 Cheers,
 Baz

 --
 You received this message because you are subscribed to the Google
 Groups "Google App Engine for Java" group.
 To post to this group, send email to
 google-appengine-j...@googlegroups.com.
 To unsubscribe from this group, send email to
 google-appengine-java+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/google-appengine-java?hl=en.

 --
 Ikai Lan
 Developer Programs Engineer, Google App Engine
 http://googleappengine.blogspot.com | http://twitter.com/app_engine




[appengine-java] Re: Objectify - Twig - SimpleDS articles

2010-03-29 Thread Blake
+1

On Mar 29, 4:03 am, Andreas Borglin andreas.borg...@gmail.com wrote:
 Hi all.

 I recently decided to migrate away from JDO to one of the third party
 datastore frameworks. At first I had only heard about objectify, but
 after some further digging I  found out about 5 other frameworks as
 well (Twig, SimpleDS, siena, slim3, cloud2db).

 I was only interested in simple wrapper frameworks that acted as a
 convenience layer above the AppEngine low-level API. I _want_ the
 framework to expose the true nature of the datastore, but at the same
 time relieve the developer of the tedious tasks that are involved when
 working with the low-level API directly. It is much easier to work
 with the AppEngine datastore when its concepts, features, constraints
 and limitations are exposed directly. You can read more about the
 reasons for this in the article.

 This left me with objectify, Twig and SimpleDS. (siena and cloud2db
 are multi-platform and slim3 is more than just a datastore framework)

 I spent some time researching these when I got the idea to write an
 article about them. I contacted the authors for each framework and
 asked if they would be interested in participating. Passionate as they
 are, they agreed :-). Thanks to Jeff Schnitzer (objectify), John
 Patterson (Twig) and Ignacio Coloma (SimpleDS) for this.

 The goal is to publish two articles; one interview with the authors,
 and one where I solve some typical scenario with each framework.
 The interview article has now been published and can be found at
 http://borglin.net/gwt-project/?page_id=604.
 The code example article will be posted sometime in the upcoming two
 weeks.




[appengine-java] Re: performance of Task Queue Java API

2010-03-25 Thread Blake
I'm doing this too.  Turn that task into one that takes parameters - a
start and end index.  You set a threshold of the biggest number of
tasks that can be kicked off in one execution.  Let's say it's 50.
So, you're given 1 to 5000... So, you kick off a task that spawns off
new instances of itself from 1-50, 51-100, 101-150, etc.  This task
will do no logic but kick off tasks.

If your range is less than that threshold, then you can just process
them.

This strategy takes a lot of work, but it's totally worth it, because
you've just made your app scale better.  You can now handle much more
than 5,000 tasks - if your logic is clever enough, there's no limit to
how big that number can be.

I think I just keep dividing by 10 until I have fewer than 10, then
process those locally.
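
A minimal sketch of the fan-out described above, in plain Java with an
in-memory queue standing in for the App Engine task queue. The names
RangeFanOut, Range, split and run are made up for illustration, and the
threshold and fan-out of 10 are just the numbers from this post, not
anything the Task Queue API prescribes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RangeFanOut {
    /** An inclusive range of record indexes handed to one task. */
    static final class Range {
        final int start;
        final int end;
        Range(int start, int end) { this.start = start; this.end = end; }
        int size() { return end - start + 1; }
    }

    /** Splits a range into at most `parts` contiguous sub-ranges. */
    static List<Range> split(Range r, int parts) {
        List<Range> out = new ArrayList<>();
        int chunk = (r.size() + parts - 1) / parts; // ceiling division
        for (int s = r.start; s <= r.end; s += chunk) {
            out.add(new Range(s, Math.min(s + chunk - 1, r.end)));
        }
        return out;
    }

    /**
     * Simulates the queue: a range larger than `threshold` only enqueues
     * child ranges; a small enough range is "processed".
     * Returns how many records were processed in total.
     */
    static int run(Range whole, int threshold, int fanOut) {
        Deque<Range> queue = new ArrayDeque<>(); // stand-in for the task queue
        queue.add(whole);
        int processed = 0;
        while (!queue.isEmpty()) {
            Range r = queue.poll();
            if (r.size() <= threshold) {
                processed += r.size(); // the real work would happen here
            } else {
                queue.addAll(split(r, fanOut)); // no logic, just more tasks
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(new Range(1, 5000), 10, 10));
    }
}
```

With a range of 1 to 5000 this goes 5000 -> 10 x 500 -> 100 x 50 ->
1000 x 5 before any record is touched, so no single execution enqueues
more than 10 tasks.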

I'm excited about the 1.3.2 bulk add API - I hadn't heard of that
before.


On Mar 25, 10:45 am, Eugene Kuleshov ekules...@gmail.com wrote:
   My application needs to create a bunch of tasks to do some data
 processing. I've tried to prototype a small application that spawns
 5000 tasks from the process initiated by a cron job, but it seems like
 I am hitting some wall, because my test can't spawn more than 1000
 tasks and it is terminated by the GAE runtime after 30 seconds.

   Are there any known performance limitations of the Task Queue Java
 API or any best practices on how to spawn a large number of tasks?

   Thanks

   Eugene




[appengine-java] Re: elegant way of implementing sequence generator

2010-03-09 Thread Blake
You could also go with the sharded counter strategy.  The more shards
you have, the less the chance that you'll have a collision, and you'd
use @Version for optimistic locking on each shard.
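
Roughly what the sharded counter looks like, sketched in plain Java. An
AtomicLongArray stands in for the shard entities (in the datastore each
slot would be its own entity, guarded by @Version or a transaction as
mentioned above), and ShardedCounter and its methods are hypothetical
names. Note that this gives you a running total that can grow by any x,
not a strictly ordered, gap-free sequence.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLongArray;

class ShardedCounter {
    // Each slot stands in for one shard entity in the datastore.
    private final AtomicLongArray shards;
    private final Random random = new Random();

    ShardedCounter(int numShards) {
        shards = new AtomicLongArray(numShards);
    }

    /** Adds delta to one randomly chosen shard, so writers rarely collide. */
    void increment(long delta) {
        shards.addAndGet(random.nextInt(shards.length()), delta);
    }

    /** The counter's value is the sum over all shards. */
    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }
}
```

Reads get more expensive as you add shards (you have to sum them all),
so the shard count is a trade-off between write contention and read
cost.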

On Mar 9, 7:18 am, legendlink gregc...@gmail.com wrote:
 Thanks for the reply.

 If memcache is used, how do I implement it so that the counter would
 always be updated and not be deleted?

 On Mar 6, 4:35 am, Ikai L (Google) ika...@google.com wrote:

  Have you looked into Memcache's INCR?

 http://code.google.com/appengine/docs/java/javadoc/com/google/appengi...,
  long)

  This'll do it atomically, but you run the risk of it being volatile,
  so you'll have to account for that in your client code.

  On Tue, Mar 2, 2010 at 11:40 PM, legendlink gregc...@gmail.com wrote:
   Hi, I wanted to have a sequence generator that increments by x value
   every time it generates a value. If I create the sequence generator
   using the datastore, it is likely that data contention would occur
   under high access rates.

   I have looked into the sample code of Max Ross in the Google Code
   repository (SequenceExamplesJDO.java) and think this is limited to
   increment by 1 only and not increment by x value.

   If the sharding technique is used, my concern is that I might not get
   the right sequence.

   What is the best/elegant way of doing a sequence generator that
   increments by x value?


  --
  Ikai Lan
  Developer Programs Engineer, Google App Engine
  http://googleappengine.blogspot.com | http://twitter.com/app_engine




[appengine-java] Re: complex query

2010-02-23 Thread Blake
Hah, that's good to know, and I appreciate you throwing the "it's in
the docs" right back at me :)

On Feb 22, 3:26 pm, Karel Alvarez kalvar...@gmail.com wrote:
 Actually you can have more than one inequality condition, but they
 must all refer to the same field, it is in the docs... :-) that is not
 the problem I have, thanks for replying though...
 Karel

 On 2/22/10, Blake blakecaldw...@gmail.com wrote:

  You can only have one inequality filter per query.  So you can say:

  where A==b && B==c && D==e && e>0

  that last e>0 is your only allowed inequality filter.  It's in the
  docs :)

  On Feb 22, 1:39 am, ka2 kalvar...@gmail.com wrote:
  Hi
  I am trying to execute a query in the data store, the query would be
  something like this:

  select listingNumber from {class name here} where listPrice>=1 &&
  listPrice<=20 && listingStatusId IN (1) && houseTypeId IN
  (1,2,4,6) && zipCode IN ('33035')

  I know the In syntax is not standard, but I got the impression from
  some issue that I can no longer find that it could work

  I also tried using contains, with parameters and executeWithMap,
  declaring the parameters, and not, nothing seems to work, I can make
  simpler queries, by Id, by a field, delete all entities, all works
  fine, except this.

  Somebody has a complex query sample in java, query on one class only
  is fine, but would like something with = and more than one contains?,
  my code would translate to something like:

  Query query = pm.newQuery("select listingNumber from {class name here}");
  query.setFilter("listPrice>=:minPrice && listPrice<=:maxPrice"
      + " && :status.contains(listingStatusId)"
      + " && :houseType.contains(houseTypeId) && :zip.contains(zipCode)");
  //query.declareParameters(pars defined here);
  Map<String, Object> pars = new HashMap<String, Object>();
  pars.put("minPrice", minPrice);
  pars.put("maxPrice", maxPrice);
  pars.put("status", Arrays.asList(status));
  pars.put("houseType", Arrays.asList(houseType));
  pars.put("zip", Arrays.asList(zipCode));
  results = (List<String>) query.executeWithMap(pars);

  thanks
  Karel




[appengine-java] Re: complex query

2010-02-22 Thread Blake
You can only have one inequality filter per query.  So you can say:

where A==b && B==c && D==e && e>0

that last e>0 is your only allowed inequality filter.  It's in the
docs :)

On Feb 22, 1:39 am, ka2 kalvar...@gmail.com wrote:
 Hi
 I am trying to execute a query in the data store, the query would be
 something like this:

 select listingNumber from {class name here} where listPrice>=1 &&
 listPrice<=20 && listingStatusId IN (1) && houseTypeId IN
 (1,2,4,6) && zipCode IN ('33035')

 I know the In syntax is not standard, but I got the impression from
 some issue that I can no longer find that it could work

 I also tried using contains, with parameters and executeWithMap,
 declaring the parameters, and not, nothing seems to work, I can make
 simpler queries, by Id, by a field, delete all entities, all works
 fine, except this.

 Somebody has a complex query sample in java, query on one class only
 is fine, but would like something with = and more than one contains?,
 my code would translate to something like:

 Query query = pm.newQuery("select listingNumber from {class name here}");
 query.setFilter("listPrice>=:minPrice && listPrice<=:maxPrice"
     + " && :status.contains(listingStatusId)"
     + " && :houseType.contains(houseTypeId) && :zip.contains(zipCode)");
 //query.declareParameters(pars defined here);
 Map<String, Object> pars = new HashMap<String, Object>();
 pars.put("minPrice", minPrice);
 pars.put("maxPrice", maxPrice);
 pars.put("status", Arrays.asList(status));
 pars.put("houseType", Arrays.asList(houseType));
 pars.put("zip", Arrays.asList(zipCode));
 results = (List<String>) query.executeWithMap(pars);

 thanks
 Karel




[appengine-java] What CPU times should I expect in the logs?

2010-02-19 Thread Blake
I'm relatively new to GAE.  My queries so far have been for single entities
with owned relationships, queried by key, which has been great - no
orange or red screaming in my logs.  More recently, I've been working
on a simple page that does this:

1. get a list of 10 of the user's Following entities - each contains
a key to an object the user is following, and some extra metadata
about the relationship

2. get each of the ObjectBeingFollowed entities by key in the
Following entity

Easy enough to get working, but the logs are screaming at me with CPU
times up around 0.5-1.0 seconds.  Is this just what to expect?  Do you
guys have pages that always return red or orange log messages?

Thanks!




[appengine-java] Re: Parsing files uses too much cpu loading

2010-02-19 Thread Blake
Create a task (for the task queue) - we'll call it SomeTaskServlet
that imports a section of the file between two line numbers that are
passed into it.

In this task above, here's what you'd do:
1. count how many lines are in the file - let's say 105
2. split that into chunks of ten lines (make sure to handle the
   remainder!!)
3. kick off/queue one SomeTaskServlet task per chunk, 11 here:
   - lines 1-10
   - lines 11-20
   - lines 21-30
   ...
   - lines 101-105
4. Make sure that your queued task is idempotent
(http://en.wikipedia.org/wiki/Idempotence), and throw an exception if
there's a problem.  That way, the queue processor will retry it on
error, and you'll never have to worry about a thing.
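
To make step 4 concrete, here is a small sketch in plain Java. The
TreeMap is only a stand-in for the datastore, and IdempotentImport,
store and importChunk are hypothetical names. The point it tries to
show: if every write is keyed by something deterministic (here the line
number), running the same chunk twice leaves the same data behind, so a
retry after an exception is harmless.

```java
import java.util.Map;
import java.util.TreeMap;

class IdempotentImport {
    // Stand-in for the datastore. Keyed by line number, so writing a line
    // again overwrites the earlier copy instead of duplicating it.
    static final Map<Integer, String> store = new TreeMap<>();

    /** Imports lines first..last (1-based, inclusive). Safe to re-run. */
    static void importChunk(String[] lines, int first, int last) {
        for (int n = first; n <= last; n++) {
            String line = lines[n - 1];
            if (line.isEmpty()) {
                // Throwing marks the task as failed; a real queue would
                // run it again later.
                throw new IllegalStateException("bad line " + n);
            }
            store.put(n, line); // deterministic key keeps the write repeatable
        }
    }
}
```

If the key were auto-generated instead, a retry after a half-finished
chunk would insert the first few lines a second time.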

The one big gotcha is that you really should know how many records you
have to process up front, or you'll have a hard time knowing when to
stop chunking.  This is tough when you're dealing with databases in
App Engine, because (afaik), you can't SELECT COUNT(*), but you're
working with a file.

Simple!

If that file grows, and you wanna make sure you're scalable, then the
SomeTaskServlet handles a max number of lines - say 10.  If the range
that was passed into it is larger than 10, then split the work it was
given into 10 batches and queue each one back to another instance of
itself.  By the time you have a small enough batch, you'll have a chunk
of data that you can process in 1/10 second.  I'd recommend giving this
task its own queue so you can throttle it so that you don't eat up your
dynamic concurrent thread count (or whatever they call that).

Reply whether this makes sense.  I just did this to import 5,000
records from another system via REST.  The first several rounds keep
forking off more and more threads to chunk the data down into smaller
bits.  At the end, each of the hundreds of threads has SUCH a small
job to do, you can throttle it, and they retry themselves on error.

- Blake

On Feb 18, 4:31 pm, novarse stephenmwi...@gmail.com wrote:
 Hello,
 I'm trying to get data from csv files into my datastore tables. My app
 is showing cpu loadings of
  30356ms 20023cpu_ms 11480api_cpu_ms from the dash board and I was
 wondering if someone could see how I could improve this situation. I'm
 pretty new to Java.

 sample line from file:
 -470,16/12/2008 0:00:00,125

 this parses the file:
         private void processEvents(String fileName) {
                 try {
                         previousLineNumber = 0;
                         i = 1;
                         file = new File(fileName);
                         CSVParser shredder = new CSVParser(new FileInputStream(file));
                         while ((t = shredder.nextValue()) != null) {
                                 if (previousLineNumber != shredder.getLastLineNumber()) {
                                         if (previousLineNumber != 0) { // save event
                                                 saveData(jdoEvent);
                                         }
                                         previousLineNumber = shredder.getLastLineNumber();
                                         i = 1;
                                 } else
                                         i++;
                                 switch (i) {
                                 case 1:
                                         jdoEvent.setPKeyEventID(Long.parseLong(t));
                                         break;
                                 case 2:
                                         try {
                                                 Date d = processDate(t);
                                                 jdoEvent.setDate(d);
                                         } catch (ParseException e) {
                                                 System.out.println(e.getMessage());
                                         }
                                         break;
                                 case 3:
                                         jdoEvent.setFKeyRaceDescription(Long.parseLong(t));
                                         break;
                                 }
                         }

                         if (previousLineNumber != 0) {
                                 saveData(jdoEvent);
                         }
                 } catch (Exception e) {
                         System.err.println(e.getMessage());
                 }
         }

 this saves the object:
         private <J> void saveData(J jdoObject) {
                 PersistenceManager pm = PMF.get().getPersistenceManager();
                 try {
                         pm.makePersistent(jdoObject);
                 } finally {
                         pm.close();
                 }
         }

 this is my data object:

 package com.myproj.client;

 import java.util.Date;

 import javax.jdo.annotations.IdGeneratorStrategy;
 import javax.jdo.annotations.IdentityType;
 import javax.jdo.annotations.PersistenceCapable;
 import javax.jdo.annotations.Persistent;
 import

[appengine-java] Re: What CPU times should I expect in the logs?

2010-02-19 Thread Blake
Thanks John.  I did do that, but after reading the documentation, it
seems that if you query for 10 objects with || key=='abc' ||
key=='def' || key=='ghi', then it'll actually perform 10 queries
under the hood.

I'm noticing a lot of slow-down in my app from these GAE exceptions
that I think are due to my app starting up in new JVMs.  That might be
part of the problem.  The other part is that my entities were a little
screwy.  I flattened the children into the parent since there's always
three children, and each child only has a couple properties.  That
helped.

After that, I implemented caching, which makes it scream now.  I'm
caching the display DTOs rather than the entities, because just
building the DTOs was taking about 250 cpu ms.
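
Something like the following is what I mean by caching the DTO rather
than the entities. It's a plain-Java sketch: the HashMap is a stand-in
for memcache, and DtoCache, buildDto and getDto are hypothetical names,
with a string standing in for the real display DTO. Since memcache
entries can vanish at any time, the miss path has to be able to rebuild
the DTO from scratch.

```java
import java.util.HashMap;
import java.util.Map;

class DtoCache {
    // Stand-in for memcache; real entries may be evicted at any time.
    private final Map<String, String> cache = new HashMap<>();
    int builds = 0; // counts how often the expensive path runs

    /** Hypothetical expensive step: turning entities into a display DTO. */
    private String buildDto(String userId) {
        builds++;
        return "following-list-for-" + userId;
    }

    /** Returns the cached DTO, building and storing it on a miss. */
    String getDto(String userId) {
        String dto = cache.get(userId);
        if (dto == null) {
            dto = buildDto(userId);
            cache.put(userId, dto);
        }
        return dto;
    }
}
```

The second request for the same user never reaches buildDto, which is
where the CPU time was going.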

So, with *nothing* in cache, my original takes 2 cpu seconds, 1 api
second. With the supporting entities in cache, it comes down to 800ms/
500ms.  After cleaning up my query and the entities as mentioned
above, I brought that down to 500ms/200.  And then, after caching the
objects I'm querying for here, I'm down to 50-90 cpu ms with no API
ms.

Okay, as I was finishing this post, I hit the app a few more times...
It's funny/frustrating how sporadic the system is.  Sometimes the
optimized path still takes up to 7 seconds, just to pull from
cache!!!  A few more refreshes and it's back down to 55 cpu ms and no
api ms.

Oh well, I guess this is just the nature of the beast.  Some users are
going to have pages take a few seconds to load at times, but at least
I know that if I had to scale this up, it would still work, right? :)

On Feb 19, 1:36 pm, John Patterson jdpatter...@gmail.com wrote:
 You could try to batch the get of your ObjectBeingFollowed entities  
 so you only do two gets.  Is most of your cpu time api_cpu?

 On 19 Feb 2010, at 12:29, Blake wrote:



  I'm relatively new to GAE.  My queries so far have been for single entities
  with owned relationships, queried by key, which has been great - no
  orange or red screaming in my logs.  More recently, I've been working
  on a simple page that does this:

  1. get a list of 10 of the user's Following entities - each contains
  a key to an object the user is following, and some extra metadata
  about the relationship

  2. get each of the ObjectBeingFollowed entities by key in the
  Following entity

  Easy enough to get working, but the logs are screaming at me with CPU
  times up around 0.5-1.0 seconds.  Is this just what to expect?  Do you
  guys have pages that always return red or orange log messages?

  Thanks!
