[google-appengine] Re: GQL performance slow
Hi Ikai, see the Appstats output (http://i52.tinypic.com/28hndxk.jpg):

@905ms datastore_v3.RunQuery real=1090ms api=8225ms
@2504ms datastore_v3.Next real=339ms api=4711ms

My q.fetch(2000) call takes ~3.5 seconds.

On Mar 23, 12:42 am, "Ikai Lan (Google)" wrote:
> I think you should run AppStats and see what's going on. 2000 entities is
> never going to be particularly fast, but 3 seconds might be a bit on the
> high side. Run the tool to be sure.
>
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> Blog: http://googleappengine.blogspot.com
> Twitter: http://twitter.com/app_engine
> Reddit: http://www.reddit.com/r/appengine
[google-appengine] Re: GQL performance slow
Hi Ikai, is there a measure to say the entities are large? Is it about the number of columns, or blob or text properties? If so, my entities have only string and number properties.

Yes, I need all 2000 (sometimes a bit more) entities at once for reporting. Currently I'm not seeing a way to split the keys and fetch.

On Mar 21, 11:54 pm, "Ikai Lan (Google)" wrote:
> If that's the performance, then I'm inclined to say the entities are large
> and that's about the performance you can expect. One more thing to keep in
> mind: the cost of deserializing each entity is non-trivial.
>
> Is there a reason you need all 2000 entities? Perhaps there is a different
> way to accomplish what you want without querying a (relatively) large
> number of entities at once.
>
> One other possible solution that will at least make your user-facing code
> run faster is to split the query up by key, perform multiple async queries
> and join them at the end. It'll still consume the same (or more) CPU time,
> but the user ms should be lower since you are doing things in parallel.
> The tradeoff here, of course, is that you'd need to know how you can split
> up the keyspace.
[google-appengine] Re: GQL performance slow
q = db.Query(PrimaryData)
q.filter('SheetMetadataId', metadata.getSheetId())
return q.fetch(2000)

Hi Ikai, even the above query takes 3.3~3.6 seconds. The number of entities returned by the query is 1261. I'm using an Expando model; the total number of columns for this particular set of entities is 25.

Yes, I've created a composite index. Here is the index definition:

- kind: PrimaryData
  properties:
  - name: SheetMetadataId
  - name: InstanceAssignedTo

Please let me know if you need more information.

On Mar 17, 11:50 pm, "Ikai Lan (Google)" wrote:
> If you have a composite index, the performance of this query should be
> mostly linear (find the start location in the index and just return
> min(2000, number of entities that satisfy the query from there on out)
> keys and items. If you only have individual indexes for SheetMetadataId
> as well as InstanceAssignedTo, this uses a zigzag merge join, where
> performance can vary greatly depending on the shape of the data.
>
> Just out of curiosity, what is the performance of this query?
>
> q = db.Query(PrimaryData)
> q.filter('SheetMetadataId', metadata.getSheetId())
> return q.fetch(2000)
>
> If it's in a similar ballpark, my other guess is that the query takes that
> amount of time because the entities are large.
>
> You mentioned you created a composite index. Can you post that?
>
> I'll talk to the datastore team to see what we can do to provide something
> similar to EXPLAIN plans for queries.

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
[google-appengine] Re: GQL performance slow
Hi Robert, I've not used reference properties. It's a simple query, like this:

q = db.Query(PrimaryData)
q.filter('SheetMetadataId', metadata.getSheetId())
q.filter('InstanceAssignedTo IN',
         [u'User_c42e8919_448e_11e0_b87b_f58d20c6e2c3',
          u'User_1fd87ac5_073d_11e0_8ba1_c122a5867c4a'])
return q.fetch(2000)

I also replaced the IN filter with an '=' filter on a single value; it still takes the same time.

On Mar 17, 6:27 am, Robert Kluin wrote:
> Use Appstats. It may not be the query that is slow. If you are using
> reference properties, perhaps you are dereferencing them. If you
> show us the query and how you're using the results we might be able
> to give more suggestions.
>
> http://code.google.com/appengine/docs/python/tools/appstats.html
>
> Robert
>
> On Mon, Mar 14, 2011 at 10:54, adhi wrote:
> > Hi, I'm running a query in App Engine which returns just 1200
> > entities, but it's taking 3.5 seconds. The query doesn't contain
> > inequality filters, but I added an index for it anyway. Can anyone
> > tell me how to analyse this and improve the performance?
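For context on why swapping IN for '=' changes little here: the Python SDK expands an IN filter into one equality sub-query per value and merges the results client-side, so a two-value IN costs roughly two queries. A rough sketch of that expansion, with run_query as a hypothetical stand-in for a single equality query returning key-ordered entity dicts:

```python
def fetch_in(run_query, field, values, limit):
    """Emulate the SDK's IN expansion: one equality sub-query per value,
    results merged, de-duplicated by key, sorted, then truncated."""
    seen = set()
    merged = []
    for value in values:
        for entity in run_query(field, value, limit):
            key = entity['key']
            if key not in seen:  # an entity may match several values
                seen.add(key)
                merged.append(entity)
    merged.sort(key=lambda e: e['key'])
    return merged[:limit]
```

Since the time is unchanged with a single '=' filter, the cost is in fetching and deserializing the entities themselves, not in the IN expansion.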
[google-appengine] GQL performance slow
Hi, I'm running a query in App Engine which returns just 1200 entities, but it's taking 3.5 seconds. The query doesn't contain inequality filters, but I added an index for it anyway. Can anyone tell me how to analyse this and improve the performance?
[google-appengine] Re: BlobProperty limit
Hi Brian, are you using a single db.put() for all the entities?

Thanks,
Adhi

On Mar 10, 5:36 pm, bFlood wrote:
> hello
>
> maybe you should try 11 parts so each would be below the max 1MB api
> limit. You definitely need separate entities for each part
[google-appengine] Re: memcache not getting updated
Hi Nick,
Thanks, that was the problem. I'm trying to put around 60 objects in memcache using set_multi, and it was failing for all the objects. So in case of failure I'm now deleting the objects from memcache, then splitting the dict and updating, and it's working. I don't know why, but after posting the problem I couldn't see the post for the past few days; anyway, now I've got it, thanks :)

Adhi

On Mar 9, 9:27 pm, "Nick Johnson (Google)" wrote:
> Hi,
>
> Have you checked the return code of the memcache call? Unlike most of the
> APIs, memcache indicates errors in its return code, rather than throwing
> an exception.
>
> -Nick Johnson
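The split-and-retry workaround described above can be sketched generically. Here set_multi is passed in as a callable so the sketch is self-contained; in real code it would be google.appengine.api.memcache.set_multi, which by convention returns the list of keys that were NOT stored rather than raising an exception.

```python
def set_multi_with_split(set_multi, mapping, min_batch=1):
    """Try to store a mapping; on partial failure, split the failed keys
    into halves and retry, since a smaller batch may fit. Returns the
    keys that could not be stored even at min_batch size."""
    failed = set_multi(mapping)
    if not failed:
        return []
    if len(failed) <= min_batch:
        return list(failed)
    items = [(k, mapping[k]) for k in failed]
    mid = len(items) // 2
    left = set_multi_with_split(set_multi, dict(items[:mid]), min_batch)
    right = set_multi_with_split(set_multi, dict(items[mid:]), min_batch)
    return left + right
```

Checking the returned key list, as Nick suggests, is the crucial step; silently ignoring it is what made the stale-cache problem invisible.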
[google-appengine] Re: BlobProperty limit
Hi Barry, thanks for your reply. I tried it two ways. I split the data into 10 parts, and then:
1. created a separate entity for each part (as you suggested), and
2. stored the parts in a single entity with 10 dynamic properties (it's not always 10 parts; it may be fewer in some cases).

In both cases I'm getting "RequestTooLargeError: The request to API call datastore_v3.Put() was too large." when I try to put the entity(s) in a single db.put(). So do I have to use a separate db.put() for each entity?

Thank you,
Adhi

On Mar 2, 6:45 pm, Barry Hunter wrote:
> > Is it possible to increase the limit for BlobProperty size up to 10 MB?
>
> no.
>
> Split the Blob and store in 10 Separate entities.
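A minimal sketch of the splitting itself (entity kind and property names are up to you; the chunk size here is an assumption chosen to stay under the limit). As the RequestTooLargeError shows, a batch db.put() is still one API request bounded by the same 1 MB limit, so each chunk entity has to go in its own put() call.

```python
CHUNK_SIZE = 900 * 1024  # stay safely under the 1 MB API request limit

def split_blob(data, chunk_size=CHUNK_SIZE):
    """Split a byte string into chunks small enough for one put() each."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def join_blob(chunks):
    """Reassemble the original byte string from its ordered chunks."""
    return b''.join(chunks)
```

Each chunk would then be stored as its own entity (for example, a hypothetical BlobChunk kind carrying a sequence number and a shared parent key so the pieces can be queried back in order).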
[google-appengine] memcache not getting updated
Hi, is there any chance that updating memcache fails? In the same transaction I'm updating a particular record both in memcache and the datastore, but memcache still has the old data whereas the datastore has the updated record. And I have checked the URL in stats: memcache.set is getting called, followed by db.put().

But the same application does not show this behaviour on a different appspot (staging). Any suggestions or clues? The problem is happening in production.

Thanks,
Adhi
[google-appengine] BlobProperty limit
Hi, is it possible to increase the limit for BlobProperty size up to 10 MB? We are not able to use the Blobstore API as mentioned in the docs for sizes over 1 MB, because we want to do fine-grained operations on the blob. What we are trying to store is a JSON data structure (pickled), not something like video.
[google-appengine] Re: Timeout error
Hi Jeff, sorry for the delayed reply; we were seriously stuck with another GQL problem, which I already posted about at http://groups.google.com/group/google-appengine/browse_thread/thread/44f68399e1dea1c0/9b92b3a369f831d7?pli=1 so I couldn't respond.

My app ID is os-dev. It's not a deadline-exceeded exception; it's a datastore timeout. For the same request (query) and the same set of results, it is quite fast some of the time and throws a timeout error some of the time.

Adhi

On Nov 10, 12:32 am, "Jeff S (Google)" wrote:
> Hi adhi,
>
> Could you tell me the app ID for this application? When you mention
> timeout errors, are these datastore timeouts or are they overall
> deadline exceeded exceptions? It could be that fetching 1300 records
> in one request does take longer than the allowed time for a datastore
> operation.
>
> Thank you,
>
> Jeff
>
> On Nov 5, 3:28 am, Adhi wrote:
> > Hi,
> > We are running into a timeout problem very frequently, whereas the
> > same request works fine on consecutive attempts, taking 4-5 seconds.
> > The request involves only fetching data using a query; normally we
> > fetch around 100-1000 records in every request. Any inputs on this
> > abnormal behaviour?
> >
> > This is the biggest request as of now, in terms of fetched data (1300
> > records):
> > http://dev.orangescape.com/bulkexporter/export/primary?SheetMetadataI...
> >
> > It works normally for some time, and sometimes even the smaller
> > requests fail.
> >
> > Note: we are splitting the request into multiple requests and hitting
> > the server concurrently; the above URL is one of the splits.
> >
> > adhi
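Since the same request succeeds on consecutive attempts, one common mitigation (not suggested in this thread, just a standard pattern) is to retry the fetch on datastore timeouts with a short backoff. A generic sketch, using the builtin TimeoutError as a stand-in for the SDK's db.Timeout:

```python
import time

def with_retries(operation, attempts=3, base_delay=0.1,
                 retriable=(TimeoutError,)):
    """Retry a flaky operation with exponential backoff; intermittent
    datastore timeouts often succeed on a prompt retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except retriable:
            if attempt == attempts - 1:
                raise  # exhausted: surface the last timeout to the caller
            time.sleep(base_delay * (2 ** attempt))
```

In an App Engine handler you would keep attempts small, since retries eat into the overall request deadline.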
[google-appengine] Re: gql not giving full result set
Martin, thanks for the info. But I explicitly created all the necessary indexes. When debugging this issue I felt there might be a problem in building the indexes: if I just open and save a missing record, then that record is also included in the result set. So I think the record update has something to do with the query execution. Any clues?

Adhi

On Nov 9, 10:28 am, Jason Smith wrote:
> I have the same problem, which I wrote about on Stack Overflow but
> received no response.
>
> http://stackoverflow.com/questions/1691792/query-gqlquery-order-restr...
>
> My models require the property in question and I manually confirmed
> that they are all present, so it is not an issue of queries not
> returning entities with missing properties. I am stuck with this
> problem, and currently I am working around it by fetching all data and
> sorting in memory. Fortunately I can get away with that as it's a
> small data set and an infrequent query.
>
> On Nov 7, 12:53 am, Adhi wrote:
> > Yes, I've tried using order by also. But it's giving a different
> > result set. When using order by I got only 842 records, but without
> > order by I got 1251, whereas my actual record count should be >1260.
> > And when I change the fetch size I get a different count.
> >
> > Here is my code...
> >
> > def get_serialized_data(entityClass, params):
> >     query = entityClass.all()
> >     query.order('__key__')
> >     for filterColumn, filterValue in params.iteritems():
> >         query.filter(filterColumn, filterValue)
> >     limit = 400
> >     offset = 0
> >     totalLimit = 800
> >     lastRecordKey = None
> >     n = 0
> >     entities = query.fetch(limit, offset)
> >     while entities and offset <= (totalLimit - limit):
> >         lastRecordKey = entities[-1].key()
> >         n += len(entities)
> >         # My serialization code here
> >         offset += limit
> >         if len(entities) == limit:
> >             entities = query.fetch(limit, offset)
> >         else:
> >             entities = None
> >     entities = None
> >     return (n >= totalLimit, lastRecordKey)
> >
> > def download_data():
> >     params = {'ApplicationId': applicationId, 'Deleted': False,
> >               'SheetMetadataId': 'Sheet003'}
> >     (moreRecords, lastRecordKey) = get_serialized_data(PrimaryData, params)
> >     while moreRecords:
> >         params['__key__ >'] = lastRecordKey
> >         (moreRecords, lastRecordKey) = get_serialized_data(PrimaryData, params)
> >
> > download_data()
> >
> > Each batch will fetch 800 records; if I use q.fetch(800) it gives a
> > Timeout, so I've used an offset. As per the documentation at
> > http://code.google.com/appengine/articles/remote_api.html they haven't
> > specified adding order by for __key__, so I thought it was implicit.
> > That's why I initially tried without order by. Am I doing anything
> > wrong?
> >
> > Now I'm trying to delete and recreate the indexes because of this
> > problem, but they are still in the deleting state.
> >
> > Adhi
> >
> > On Nov 6, 7:13 pm, Eli Jones wrote:
> > > Always post a full code snippet.
> > >
> > > Aren't you supposed to use Order By when paging by key?
> > >
> > > On 11/6/09, Adhi wrote:
> > > > Hi,
> > > > Sometimes I am not getting the complete result set from GQL even
> > > > though the missing records satisfy the condition. I've proper
> > > > indexes. Total records for that query will be around 1300. So I'm
> > > > not fetching the records in a single fetch; I'm using
> > > > __key__ > last_record_key to get them in batches.
> > > >
> > > > Why is this anomaly? Anything I am missing here?
> > > >
> > > > Adhi
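The code quoted above mixes an offset within each batch with a '__key__ >' filter across batches, which can skip or double-count records when the two get out of step. A pure keyset approach drops the offset entirely and always continues from the last seen key. A sketch, with fetch_page as a hypothetical stand-in for query.filter('__key__ >', last_key).fetch(limit) on a key-ordered query:

```python
def fetch_all_by_key(fetch_page, batch_size=400):
    """Pure keyset pagination: each page is fetched with
    '__key__ > last_key' and no offset, so nothing is skipped.
    fetch_page(last_key, limit) returns entities ordered by key;
    last_key is None for the first page."""
    results = []
    last_key = None
    while True:
        page = fetch_page(last_key, batch_size)
        if not page:
            break
        results.extend(page)
        last_key = page[-1]['key']
        if len(page) < batch_size:
            break  # short page means we reached the end
    return results
```

This also avoids the cost of offsets, which make the datastore scan and discard the skipped rows on every fetch.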
[google-appengine] Index building for a while. Unable to reset
One of the indexes in my application has been stuck in the Building state for a while. Please mark it as Error so that I can update it again. My app ID is os-dev.

Is there a permanent solution for this?

Thanks,
adhi