[google-appengine] Re: GAE datastore insertion rate
Thank you Sebastian, I see what you are saying, and now wonder why it seems to work better with the devices as group entities.

I'm attempting to delete the Note entities from the datastore, as I've reached 23% of my Stored Data quota and want to run another load test. I coded up a request that deletes 512 Notes, then made that request repeatedly. That presented me with this error:

The API call datastore_v3.Delete() required more quota than is available.

I slowed down the rate at which I make that request by sleeping a couple of seconds between each POST to the App Server. Same error. Then, using the dashboard, I attempted to view the Note entities and got a 500 Server error. Now I can't view my Note entities.

Funny thing is, when I look at the quota details page, everything under the column on the far right (Rate) states "Okay", with only 3 of the Daily Quotas even showing a non-zero value:

Request: CPU Time = 4%
DataStore: Stored Data = 23%
DataStore: Datastore CPU Time = 3%

I'm trying some very simple things, and hope to get past these obstacles. Computing in the cloud is a "good thing" and I want to take full advantage of it.

Help,
-David Story

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---
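Editor's sketch: one way to structure the repeated delete request described above is to delete in smaller batches and back off exponentially when the over-quota error appears, rather than sleeping a fixed couple of seconds. Everything named here (`QuotaError`, `delete_in_batches`, `fake_delete`, the batch size of 100) is hypothetical and stands in for the real datastore call; on App Engine the `delete_batch` callable would presumably wrap `db.delete(keys)`. Nothing in the thread confirms that this avoids the quota error.

```python
import time


class QuotaError(Exception):
    """Stand-in for the over-quota error raised by the datastore API."""


def delete_in_batches(keys, delete_batch, batch_size=100,
                      max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Delete keys in fixed-size batches, backing off when quota is exceeded.

    Returns the number of keys deleted. Re-raises QuotaError if one batch
    still fails after max_retries attempts.
    """
    deleted = 0
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                delete_batch(batch)
                break
            except QuotaError:
                if attempt == max_retries - 1:
                    raise
                # Wait 1x, 2x, 4x, ... the base delay before retrying.
                sleep(base_delay * (2 ** attempt))
        deleted += len(batch)
    return deleted


# Demo with a fake backend that rejects every third call.
calls = {"n": 0}
store = set(range(512))


def fake_delete(batch):
    calls["n"] += 1
    if calls["n"] % 3 == 0:
        raise QuotaError("simulated over-quota response")
    store.difference_update(batch)


# Record the requested waits instead of actually sleeping.
waits = []
count = delete_in_batches(list(range(512)), fake_delete, sleep=waits.append)
print(count, len(store), waits)
```

Passing `sleep` as a parameter keeps the demo instant; in a real request handler the default `time.sleep` would apply, subject to the request deadline.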
[google-appengine] Re: GAE datastore insertion rate
Hi David

On Wed, Jan 28, 2009 at 1:22 AM, iDavid wrote:
> I got the contention errors, during my first run, using a
> ReferenceProperty to link a Note to a Device (not a parent). Only
> after I watched Brett's presentation did I add the device group entity
> and parent=device code.

That is very weird and we should investigate it further.

> Adding the device group entity and parent=device code, seems to have
> solved the "too much contention for these entities" problem.
>
> Quote From 'Keys and Entity Groups':
> http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
> "The more entity groups your application has—that is, the more root
> entities there are—the more efficiently the datastore can distribute
> the entity groups across datastore nodes."

That is true, and it is also true that if you do not specify a parent, the entity is a root and is therefore an entity group by itself ("An entity without a parent is a *root* entity."). From the same official document it is clear that every entity without a parent forms its own entity group. What you can do is increase the "family" size by adding entities to a group (using the parent keyword), but that is useful ONLY if you need transactions.

"The more entity groups your application has—that is, the more ROOT ENTITIES there are—the more efficiently the datastore can distribute the entity groups across datastore nodes"

From the same document: "An entity without a parent is a *root* entity."

> Brett does a great job of describing this in his presentation:
> http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

Yes. I think what he meant is that if you use entity groups (because you need transactions), then you should use as many entity groups as possible, because the operations within one group are done serially. So keep your entity groups small and use as many as you can: the more the better.

And if you do NOT use the keyword "parent", then every entity IS an entity group by itself (as it is a ROOT): that is the maximum number of entity groups you can have!

We need to investigate why you were getting those errors even though every entity was an entity group by itself.

regards
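Editor's sketch: the point about root entities can be illustrated with a toy model in plain Python (this is not the App Engine API). It assumes only what the quoted documentation says: an entity without a parent is a root, and an entity's group is identified by the root of its key path. The device names and the `entity_group` helper are made up for illustration.

```python
from collections import Counter


def entity_group(path):
    """Return the root of a key path; the root identifies the entity group."""
    return path[0]


devices = ["dev-a", "dev-b", "dev-c"]

# Layout 1: Notes are root entities that only reference their Device.
# Each Note's key path has one element, so each Note is its own group.
root_notes = [(("Note", "%s-%d" % (d, i)),) for d in devices for i in range(4)]

# Layout 2: Notes are created with parent=device.
# Each Note's key path starts with its Device, so Notes share that group.
child_notes = [(("Device", d), ("Note", i)) for d in devices for i in range(4)]

groups_without_parent = Counter(entity_group(p) for p in root_notes)
groups_with_parent = Counter(entity_group(p) for p in child_notes)

print(len(groups_without_parent))         # 12 groups, one per Note
print(len(groups_with_parent))            # 3 groups, one per Device
print(max(groups_with_parent.values()))   # 4 Notes share each Device group
```

Under this model, dropping `parent` gives the largest possible number of groups, while `parent=device` trades that for one serialized group per device.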
[google-appengine] Re: GAE datastore insertion rate
I got the contention errors, during my first run, using a ReferenceProperty to link a Note to a Device (not a parent). Only after I watched Brett's presentation did I add the device group entity and parent=device code.

Adding the device group entity and parent=device code seems to have solved the "too much contention for these entities" problem.

Quote from 'Keys and Entity Groups':
http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
"The more entity groups your application has—that is, the more root entities there are—the more efficiently the datastore can distribute the entity groups across datastore nodes."

Brett does a great job of describing this in his presentation:
http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

For me, having all the notes from a device grouped together on a datastore node is great. I process the notes in sequence, but I'd like to process the devices in parallel. This is where I'm looking for scalability: across the devices.

I need to inform the App Engine that it's ok to process these (notes) in sequence, but handle these (devices) in parallel. Making the device group entity seems to have done that.

Thank you for your help,
-David Story
[google-appengine] Re: GAE datastore insertion rate
Hi David,

According to
http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths
"An entity without a parent is a root entity", so you don't need to associate each note with a parent.

I'm wondering: when you got these contention errors, were you assigning a parent to the notes (entity groups)? If that is the case, then you were using a single entity group!

On Jan 27, 12:03 am, iDavid wrote:
> After watching Brett's video, the answer became
> clear. http://sites.google.com/site/io/building-scalable-web-applications-wi...
>
> Create a group entity for each device. Then all devices can
> have their notes inserted in parallel.
>
> I will give this a try and report back.
>
> -David Story
[google-appengine] Re: GAE datastore insertion rate
Thank you for your suggestions, I really appreciate them.

My load test is sending data exactly as the physical devices in the field would do. Each device will send 4 Notes per POST, once a minute (a note every 15 seconds). I did not know about the "batch-put"; I will make the necessary code changes for that. (Thank you Brett)

I'm grouping the notes with the devices they came from. My first implementation simply used the ReferenceProperty(), but quickly ran into the "too much contention for these entities" problem on the Note entity. After watching Brett's presentation on "sharding", I thought I'd get more parallelism by making the devices a group entity and putting the notes in them. Which seemed to work, until I hit my quota.

It's ok if the device group entity is 'locked' while inserting notes associated with it. Each physical device is unique, and it will only make one POST at a time. My goal is to be able to handle each device in parallel.

If this works out, I'd like to run a 1 hour load test with 5,000 devices, to come up with a cost model for my company. I expect the number of devices to grow over the coming year, and need a scalable system to support this load.

Thank you,
-David Story
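Editor's sketch: the load figures in this message imply a request rate and write rate that can be worked out directly. The calculation below uses only the numbers stated above (5,000 devices, 4 Notes per POST, one POST per device per minute, a 1 hour test) and assumes the devices' POSTs are spread evenly across each minute, which the thread does not state.

```python
DEVICES = 5000
NOTES_PER_POST = 4
POSTS_PER_DEVICE_PER_MINUTE = 1
TEST_MINUTES = 60

# Average rates, assuming POSTs are evenly spread over each minute.
posts_per_second = DEVICES * POSTS_PER_DEVICE_PER_MINUTE / 60.0
notes_per_second = posts_per_second * NOTES_PER_POST

# Totals for the whole test run.
total_posts = DEVICES * POSTS_PER_DEVICE_PER_MINUTE * TEST_MINUTES
total_notes = total_posts * NOTES_PER_POST

print("POSTs per second: %.1f" % posts_per_second)
print("Notes per second: %.1f" % notes_per_second)
print("Total POSTs in test: %d" % total_posts)
print("Total Notes in test: %d" % total_notes)
```

That is roughly 83 requests and 333 Note writes per second on average, and 1.2 million Notes stored over the hour; bursts would be higher if devices report at the same moment.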
[google-appengine] Re: GAE datastore insertion rate
Why do you put all notes for a device in one entity group (same parent entity)?

Entity groups should be kept small (in most cases) because you lock all members of the group if you do a put() on one of them. Only use entity groups if you need a transaction to update some of the group members, or you need the parent / grandparent relation. All members of an entity group are stored on the same Bigtable "disk" (no possibility of distributed storage for members of an entity group).

Why don't you add a ReferenceProperty to the Note class? [1] This property references the device object you now set as parent. For a query you then filter on this ReferenceProperty instead of selecting by parent. This is what the back-reference property does behind the scenes: it constructs a Query object that filters on the ReferenceProperty.

[1] http://code.google.com/appengine/docs/python/datastore/entitiesandmodels.html#References
[google-appengine] Re: GAE datastore insertion rate
Hi David,

On Mon, Jan 26, 2009 at 9:54 PM, iDavid wrote:
> I created a root group entity for each Device.
> I modified the Note entity to use its Device as its parent.
>
> device = Device.get_by_key_name(name)
> if device != None:
>     for n in noteList:
>         note = Note(parent=device, n)
>         note.put()

Have you considered using a batch-put here? Like this:

    device = Device.get_by_key_name(name)
    if device:
        notes = [Note(parent=device, n) for n in noteList]
        db.put(notes)

That should do the writes in parallel, which should increase your throughput.

> Now I'm seeing this error:
> The API call datastore_v3.Get() required more quota than is available.

Otherwise, it sounds like you're hitting quota limits, which is probably related to the way you're running your load test. How many requests per second were you issuing? Approximately how many Note entities were you inserting per second? 10 CPU seconds per second doesn't make this quite clear.

We have short-term quota limits in place for handling very large bursts of usage. More detail is here:
http://code.google.com/appengine/docs/quotas.html#Burst_Limits

You may be hitting one of these burst quotas because of your load.

Another possibility is that you're using key_names for your Device and Note instances that are sequential or lexically "close", which could cause hotspots in your Bigtable access patterns. When you run the load test, in what order are you writing Notes to Devices?

-Brett
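Editor's sketch: the difference between the per-entity loop and the batch-put can be shown with a toy datastore that only counts round trips. `FakeDatastore` and `make_note` are hypothetical stand-ins, not the App Engine API; the real call is `db.put()`, which accepts a single entity or a list. Note that `Note(parent=device, n)` as written in the thread is not valid Python (a positional argument follows a keyword argument), so this sketch assumes each `n` is a dict of property values, which with the real SDK would presumably be passed as `Note(parent=device, **n)`.

```python
class FakeDatastore(object):
    """Toy stand-in that counts round trips; not the App Engine API."""

    def __init__(self):
        self.round_trips = 0
        self.entities = []

    def put(self, entity_or_list):
        # Like db.put(), accept either one entity or a list of entities.
        if isinstance(entity_or_list, list):
            batch = entity_or_list
        else:
            batch = [entity_or_list]
        self.round_trips += 1
        self.entities.extend(batch)


def make_note(device, fields):
    """Hypothetical constructor: builds a Note-like dict under a parent."""
    note = {"parent": device}
    note.update(fields)
    return note


# Four notes per POST, as described in the thread.
note_list = [{"text": "note %d" % i} for i in range(4)]

# Original approach: one put() call per Note.
one_by_one = FakeDatastore()
for n in note_list:
    one_by_one.put(make_note("device-1", n))

# Suggested approach: build all Notes first, then a single put().
batched = FakeDatastore()
batched.put([make_note("device-1", n) for n in note_list])

print(one_by_one.round_trips, batched.round_trips)
```

The toy only models the number of calls; how the real datastore parallelizes the writes inside one batch is not modeled here.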
[google-appengine] Re: GAE datastore insertion rate
I created a root group entity for each Device.
I modified the Note entity to use its Device as its parent.

    device = Device.get_by_key_name(name)
    if device != None:
        for n in noteList:
            note = Note(parent=device, n)
            note.put()

Now I'm seeing this error:
The API call datastore_v3.Get() required more quota than is available.

According to my interpretation of Brett, in that video, "get by key is very fast".

Again I ask, is this a limit of the beta, or a reality of the datastore?

-David Story
[google-appengine] Re: GAE datastore insertion rate
After watching Brett's video, the answer became clear.
http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

Create a group entity for each device. Then all devices can have their notes inserted in parallel.

I will give this a try and report back.

-David Story