[google-appengine] Re: GAE datastore insertion rate

2009-01-28 Thread iDavid

Thank you Sebastian, I see what you are saying, and now wonder why it
seems to work better with the devices as group entities.

I'm attempting to delete the Note entities from the datastore, as I've
reached 23%, and want to run another load test. I coded up a request
that deletes 512 Notes, then made that request repeatedly. That
presented me with this error:
The API call datastore_v3.Delete() required more quota than is available.
I slowed down the rate at which I make that request by sleeping a
couple of seconds between each POST to the App Server.  Same error.
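
One way to approach the repeated-delete problem described above is to delete in smaller batches and back off when a call is rejected. The following is only a minimal sketch in plain Python: `delete_batch` is a hypothetical callable standing in for whatever actually issues the datastore delete, `QuotaError` is a stand-in exception, and the batch size and delays are arbitrary assumptions rather than documented limits.

```python
import time


class QuotaError(Exception):
    """Stand-in for whatever over-quota exception the real call raises."""


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_all(keys, delete_batch, batch_size=100, max_retries=5,
               base_delay=1.0, sleep=time.sleep):
    """Delete `keys` in batches, backing off exponentially on QuotaError.

    `delete_batch` is a hypothetical callable that deletes one list of keys.
    Returns the number of keys deleted.
    """
    deleted = 0
    for batch in chunked(list(keys), batch_size):
        for attempt in range(max_retries + 1):
            try:
                delete_batch(batch)
                deleted += len(batch)
                break
            except QuotaError:
                if attempt == max_retries:
                    raise
                # Wait base_delay, then 2x, 4x, ... before retrying this batch.
                sleep(base_delay * (2 ** attempt))
    return deleted
```

The `sleep` parameter is injectable so the retry logic can be exercised without real delays; whether backing off is enough depends on which quota is actually being hit.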

Then, using the dashboard, I attempted to view the Note entities and got
a 500 Server error.  Now I can't view my Note entities.

Funny thing is, when I look at the quota details page, everything
under the column on the far right (Rate) states "Okay", with only 3 of
the Daily Quotas even showing a non-zero value:
Request:CPU Time = 4%
DataStore:Stored Data = 23%
DataStore:Datastore CPU Time = 3%

I'm trying some very simple things, and hope to get past these
obstacles.  Computing in the cloud is a "good thing" and I want to take
full advantage of it.

Help,
-David Story
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: GAE datastore insertion rate

2009-01-28 Thread Sebastian E. Ovide

Hi David,

On Wed, Jan 28, 2009 at 1:22 AM, iDavid  wrote:

>
> I got the contention errors, during my first run, using a
> ReferenceProperty to link a Note to a Device (not a parent).  Only
> after I watched Brett's presentation did I add the device group entity
> and parent=device code.
>

That is very weird, and we should investigate it further.


>
> Adding the device group entity and parent=device code, seems to have
> solved the "too much contention for these entities" problem.
>
> Quote From 'Keys and Entity Groups':
>
> http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
> "The more entity groups your application has—that is, the more root
> entities there are—the more efficiently the datastore can distribute
> the entity groups across datastore nodes."
>

That is true, and it is also true that if you do not specify a parent, the
entity is a root and therefore is an entity group by itself ("An
entity without a parent is a *root* entity.").

From the same official document it is clear that every entity without a
parent is its own entity group. What you can do is increase the "family"
size by adding entities to a group (using the parent keyword), but that is
useful ONLY if you need transactions...

"The more entity groups your application has—that is, the more ROOT ENTITIES
there are—the more efficiently the datastore can distribute the entity
groups across datastore nodes"

From the same document: "An entity without a parent is a *root* entity."



>
> Brett does a great job of describing this in his presentation:
>
> http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine
>

Yes. I think what he meant is that if you use entity groups (because
you need transactions), then you should use as many entity groups as
possible, since transactional operations on a group are done serially. So
keep your entity groups small and use as many as you can; more is
better. And if you do NOT use the "parent" keyword, then every entity IS an
entity group by itself (as it is a ROOT): that is the maximum number of
entity groups you can have!

We need to investigate why you were getting those errors even though every
entity was an entity group by itself.
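
To make the point about roots concrete, here is a small plain-Python model. It is only an illustration under simplifying assumptions: a key is modelled as a tuple of (kind, name) pairs, and `entity_group` is a hypothetical helper, not part of the App Engine API.

```python
def entity_group(key_path):
    """Return the root (kind, name) pair of a key path.

    `key_path` is a tuple of (kind, name) pairs from root to entity,
    a simplified model of a datastore key.
    """
    return key_path[0]


# Without a parent, each Note is its own root and its own entity group.
rootless_notes = [(("Note", "n%d" % i),) for i in range(4)]

# With parent=device, every Note of a device shares the device's group.
device = ("Device", "d1")
child_notes = [(device, ("Note", "n%d" % i)) for i in range(4)]

groups_without_parent = {entity_group(p) for p in rootless_notes}
groups_with_parent = {entity_group(p) for p in child_notes}

print(len(groups_without_parent))  # 4 entity groups
print(len(groups_with_parent))     # 1 entity group
```

In this model, four parentless Notes give four entity groups, while four Notes under one Device give a single group.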

regards




[google-appengine] Re: GAE datastore insertion rate

2009-01-27 Thread iDavid

I got the contention errors, during my first run, using a
ReferenceProperty to link a Note to a Device (not a parent).  Only
after I watched Brett's presentation did I add the device group entity
and parent=device code.

Adding the device group entity and parent=device code seems to have
solved the "too much contention for these entities" problem.

Quote From 'Keys and Entity Groups':
http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
"The more entity groups your application has—that is, the more root
entities there are—the more efficiently the datastore can distribute
the entity groups across datastore nodes."

Brett does a great job of describing this in his presentation:
http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

For me, having all the notes from a device grouped together on a
datastore node is great.  I process the notes in sequence, but I'd
like to process the devices in parallel.  This is where I'm looking
for scalability: across the devices.  I need to tell App Engine
that it's OK to process these (notes) in sequence, but to handle these
(devices) in parallel.  Making the device group entity seems to have
done that.
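
The ordering described above (notes in sequence within a device, devices in parallel) can be illustrated with a plain-Python sketch. This is only an analogy run outside App Engine, not a description of how the datastore schedules writes; `process_by_device` and `handle_note` are hypothetical names, and the (device_id, payload) input shape is an assumption.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def process_by_device(notes, handle_note, max_workers=8):
    """Process notes in arrival order within a device, devices concurrently.

    `notes` is an iterable of (device_id, payload) pairs and `handle_note`
    is a hypothetical callable; both are illustrative only.
    Returns {device_id: [results in arrival order]}.
    """
    by_device = defaultdict(list)
    for device_id, payload in notes:
        by_device[device_id].append(payload)

    def run_device(payloads):
        # One task per device, so its notes are handled strictly in order.
        return [handle_note(p) for p in payloads]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {d: pool.submit(run_device, ps)
                   for d, ps in by_device.items()}
        return {d: f.result() for d, f in futures.items()}
```

Each device gets exactly one task, so nothing needs to be locked across devices, and work on one device never waits on another.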

Thank you for your help,
-David Story



[google-appengine] Re: GAE datastore insertion rate

2009-01-27 Thread sebastian.ov...@gmail.com

Hi David,

According to
http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
"An entity without a parent is a root entity"...

so you don't need to associate each note with a parent...

I'm wondering... when you got these contention errors, were you
assigning a parent to the notes (entity groups)? If that is the case,
then you were using a single entity group!

On Jan 27, 12:03 am, iDavid  wrote:
> After watching Brett's video, the answer became clear.
> http://sites.google.com/site/io/building-scalable-web-applications-wi...
>
> Create a group entity for each device.  Then all devices can
> have their notes inserted in parallel.
>
> I will give this a try and report back.
>
> -David Story



[google-appengine] Re: GAE datastore insertion rate

2009-01-27 Thread iDavid

Thank you for your suggestions, I really appreciate them.

My load test is sending data exactly as the physical devices in the
field would do.  Each device will send 4 Notes per POST, once a minute
(a note every 15 seconds).

I did not know about the "batch-put"; I will make the necessary code
changes for that.  (Thank you Brett)

I'm grouping the notes with the devices they came from.  My first
implementation simply used the ReferenceProperty(), but quickly ran
into the "too much contention for these entities" problem on the Note
entity.  After watching Brett's presentation on "sharding", I thought
I'd get more parallelism by making the devices a group entity and
putting the notes in them, which seemed to work until I hit my
quota.  It's OK if the device group entity is 'locked' while inserting
notes associated with it.  Each physical device is unique, and it will
only make one POST at a time.  My goal is to be able to handle each
device in parallel.

If this works out, I'd like to run a 1 hour load test with 5,000
devices, to come up with a cost model for my company.  I expect the
number of devices to grow over the coming year, and need a scalable
system to support this load.
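
For a rough sense of scale, the figures above can be turned into request and write rates. This is back-of-the-envelope arithmetic only; it assumes the 5,000 devices are spread evenly across each minute rather than all posting at the same instant.

```python
DEVICES = 5000            # from the message above
NOTES_PER_POST = 4        # from the message above
POSTS_PER_MINUTE = 1      # one POST per device per minute
TEST_MINUTES = 60         # one-hour load test

posts_per_second = DEVICES * POSTS_PER_MINUTE / 60.0
notes_per_second = posts_per_second * NOTES_PER_POST
total_notes = DEVICES * NOTES_PER_POST * POSTS_PER_MINUTE * TEST_MINUTES

print("POSTs/second: %.1f" % posts_per_second)   # 83.3
print("Notes/second: %.1f" % notes_per_second)   # 333.3
print("Notes in test: %d" % total_notes)         # 1200000
```

So the one-hour test would average about 83 requests per second and write 1.2 million Note entities in total.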

Thank you,
-David Story




[google-appengine] Re: GAE datastore insertion rate

2009-01-27 Thread djidjadji

Why do you put all notes for a device in one entity group (same parent node)?
Entity groups should be kept small (in most cases) because you lock
all members of the group if you do a put() on one of them. Only use
entity groups if you need a transaction to update some of the group
members, or you need the parent / grandparent relation. All members of
an entity group are stored on the same Bigtable "disk" (no possibility
of distributed storage for members of an entity group).

Why don't you add a ReferenceProperty to the Note class? [1]
This property references the device object you now set as parent.
For a query you now filter on this ReferenceProperty instead of
selecting by parent.
This is what the back-reference property does behind the scenes:
it constructs a Query object that filters on the ReferenceProperty.

[1] 
http://code.google.com/appengine/docs/python/datastore/entitiesandmodels.html#References




[google-appengine] Re: GAE datastore insertion rate

2009-01-26 Thread Brett Slatkin

Hi David,

On Mon, Jan 26, 2009 at 9:54 PM, iDavid  wrote:
> I created a root group entity for each Device.
> I modified the Note entity to use its Device as its parent.
>
> device = Device.get_by_key_name(name)
> if device is not None:
>     for n in noteList:
>         # Assumes each n is a dict of Note property values.
>         note = Note(parent=device, **n)
>         note.put()

Have you considered using a batch-put here? Like this:

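from google.appengine.ext import db

device = Device.get_by_key_name(name)
if device:
  # One batch call; assumes each n is a dict of Note property values.
  notes = [Note(parent=device, **n) for n in noteList]
  db.put(notes)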

That should do the writes in parallel, which should increase your throughput.

> Now I'm seeing this error:
> The API call datastore_v3.Get() required more quota than is available.

Otherwise, it sounds like you're hitting quota limits, which is
probably related to the way you're running your load test. How many
requests per second were you issuing? Approximately how many Note
entities were you inserting per second? 10 CPU seconds per second
doesn't make this quite clear.

We have short-term quota limits in place for handling very large
bursts of usage. More detail is here:
http://code.google.com/appengine/docs/quotas.html#Burst_Limits

You may be hitting one of these burst quotas because of your load.
Another possibility is that you're using key_names for your Device and
Note instances that are sequential or lexically "close", which could
cause hotspots in your Bigtable access patterns. When you run the load
test, in what order are you writing Notes to Devices?
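
If sequential key names do turn out to be the issue, one common workaround is to prefix each name with a short hash so that neighbouring names no longer sort next to each other. The sketch below is an assumption-laden illustration, not something prescribed in this thread: `spread_key_name` is a hypothetical helper, the prefix length is arbitrary, and the leading letter is added only in case the key-name rules in use disallow a leading digit.

```python
import hashlib


def spread_key_name(name, prefix_len=4):
    """Prefix `name` with a short hash so neighbouring names sort apart."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Leading "k" keeps the result from starting with a digit.
    return "k%s-%s" % (digest[:prefix_len], name)


sequential = ["device-%04d" % i for i in range(5)]
spread = [spread_key_name(n) for n in sequential]
for original, hashed in zip(sequential, spread):
    print(original, "->", hashed)
```

The mapping is deterministic, so the same device name always yields the same key name and lookups by key still work. The trade-off is that range scans over the original name order are no longer possible on the key.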

-Brett




[google-appengine] Re: GAE datastore insertion rate

2009-01-26 Thread iDavid

I created a root group entity for each Device.
I modified the Note entity to use its Device as its parent.

device = Device.get_by_key_name(name)
if device is not None:
    for n in noteList:
        # Assumes each n is a dict of Note property values.
        note = Note(parent=device, **n)
        note.put()

Now I'm seeing this error:
The API call datastore_v3.Get() required more quota than is available.

According to my interpretation of Brett, in that video,
"get by key is very fast".

Again I ask, is this a limit of the beta, or a reality of the
datastore?

-David Story



[google-appengine] Re: GAE datastore insertion rate

2009-01-26 Thread iDavid

After watching Brett's video, the answer became clear.
http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

Create a group entity for each device.  Then all devices can
have their notes inserted in parallel.

I will give this a try and report back.

-David Story
