[google-appengine] Re: Bloom filter index
I found a pure-python bloom filter implementation by Kevin Scott (http://www.coolsnap.net/kevin/?p=13), based on BitVector (http://pypi.python.org/pypi/BitVector/), that appears to work on GAE (python27). -- You received this message because you are subscribed to the Google Groups Google App Engine group. To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/ohYPbJPGw-sJ. To post to this group, send email to google-appengine@googlegroups.com. To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
Re: [google-appengine] Bloom filter index
Thanks for the lib, Tom, but I'm in python-land so unfortunately I can't use it.
[google-appengine] Bloom filter index
I wonder whether anybody has tried to build an in-memory bloom filter in front of an index to reduce datastore read operations? In my application, I have an exact-match query on a single field, and it commonly matches no results. However, I still have to pay for datastore read operations in this case. My idea is to build a bloom filter over every value of the field in my datastore. Given a query input, if the bloom filter says the value is a member of the set, I will query the datastore for it, which may or may not return results (if it doesn't, the filter gave a false positive). The bloom filter would be wrapped in an App Engine model and stored in the datastore and memcache. The write rate to the datastore for this index is rather low, so I plan to update the bloom filter transactionally and re-cache it on every write. The updates could also be done offline in a task queue. The goal is to reduce the cost of searches, especially in the no-matches case. I believe this change would reduce datastore read operations but increase CPU time, because each request would have to read and deserialize a potentially large bloom filter from memcache. Clearly, this tradeoff could be tuned to the needs of the app, as a larger bloom filter would produce fewer false positives and fewer wasted datastore reads. Thoughts?
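[Editor's sketch] The check-before-query idea above can be sketched with a tiny pure-python filter. The class name, sizes, and md5-based hashing scheme below are illustrative assumptions, not the API of any particular library:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: k hash positions per value,
    stored in a single long integer used as a bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit i is set iff some added value hashed to i

    def _positions(self, value):
        # derive num_hashes positions by salting the value
        for i in range(self.num_hashes):
            digest = hashlib.md5(("%d:%s" % (i, value)).encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= (1 << pos)

    def might_contain(self, value):
        # False means definitely absent; True means "probably present"
        return all(self.bits & (1 << pos) for pos in self._positions(value))

bf = BloomFilter()
bf.add("alice")
assert bf.might_contain("alice")  # members always match

# an unseen value will almost certainly report False,
# so the datastore query for it can be skipped
```

In the proposed setup, the serialized `bits` integer is what would live in the datastore entity and memcache, and `num_bits` is the knob that trades filter size against false-positive rate.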
[google-appengine] counting missed datastore read operations
I have an index on one field of my model which I use for exact-match lookups (i.e., no range queries). When a query matches, I expect to get the records with that value. When it doesn't match (a miss), I expect to get no records. I believe the missed cases count towards my datastore read operations quota, even though no datastore read operations were actually performed. My intuition is that the index and the datastore are two separate things, and a read of the index should be free. In other words, read operations which do not return any records should be free and not count towards the datastore read quota. Am I way off?
Re: [google-appengine] counting missed datastore read operations
I don't doubt you need a read in each case. My point is about *what* the read is for, and how the quotas are calculated. App Engine has quotas for:
- Datastore Write Operations
- Datastore Read Operations
- Datastore Index Write Ops
Now, I understand that the indexes may be part of the datastore, but if index write operations are separate from datastore write operations, why aren't the read operations also separated? Clearly they are separate things, yet they seem to be combined under one quota. My only question is why, and whether that makes sense. For instance, it should be much cheaper to check whether a query has results (an index read) than to return those results (a datastore read), so wouldn't it make more sense to have separate quotas?
Re: [google-appengine] counting missed datastore read operations
Thanks Brian! This is exactly what I was looking for.
[google-appengine] Re: Fetching a single result using query
You might have more luck posting in the Java forum, but in python, query.get would return the first result or None: http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query_get

On May 20, 1:29 am, Arjun arumugamar...@gmail.com wrote:

    public static Employee getEmp(String name) {
        List<Employee> lst = null;
        PersistenceManager pm = PMF.get().getPersistenceManager();
        try {
            Query query = pm.newQuery(Employee.class);
            query.setFilter("Name == NameParam");
            query.declareParameters("String NameParam");
            lst = (List<Employee>) query.execute(name);
            lst.size();
        } catch (JDOObjectNotFoundException e) {
            pm.close();
            return null;
        } finally {
            pm.close();
        }
        if (lst.isEmpty()) return null;
        return lst.get(0);
    }

In the above function, the employee with the name provided as the parameter is returned. The Name field is not a primary key, hence I used a Query to fetch the result. Since query.execute() returns a result of type List<Employee>, I have to write many lines of code. Is there any other way to fetch a single result, i.e. a result of type Employee rather than List<Employee>, using the query object?
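[Editor's sketch] The Query.get() the reply points to is just "first result or None". The same semantics in plain python, with a hypothetical in-memory employee list standing in for the datastore:

```python
def get_first(results):
    """Return the first item, or None if there are no results,
    mirroring the semantics of Query.get()."""
    return next(iter(results), None)

employees = [{"name": "Arjun"}, {"name": "Neal"}]

# a match: the first employee whose name equals the parameter
match = get_first(e for e in employees if e["name"] == "Arjun")
assert match == {"name": "Arjun"}

# a miss: no list handling needed, just a None check
miss = get_first(e for e in employees if e["name"] == "Zed")
assert miss is None
```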
[google-appengine] Connect users to Google profiles
Is there an automated or opt-in method to connect my logged-in users with their Google profiles? I would like to be able to discover, at least, the URL of their Google profile. It would be nice to have the URL of their avatar, too.
[google-appengine] memcache and api_cpu_ms
I recently attempted to improve the responsiveness of one of my app's more elementary handlers by using memcache to cache the datastore lookups. According to my logs, this has had a positive effect on my api_cpu_ms, reducing this time to 72 ms. However, cpu_ms has not seen a similar decrease, and hovers around 1000 ms. Do memcache gets count towards api_cpu_ms or cpu_ms? Do I need to worry about performance issues around deserializing model instances in memcache? My caching strategy looks like this:

    response = dict()  # (might not be empty)
    cached = memcache.get(__CACHE_KEY)
    if cached:
        response.update(cached)
        return
    else:
        # datastore calls
        foo = get_foo()
        bar = get_bar()
        # build cache object
        cached = dict(foo=foo, edits=bar)
        response.update(cached)
        # cache
        memcache.set(__CACHE_KEY, cached)
        return
[google-appengine] Re: memcache and api_cpu_ms
Tony, The update call is the standard dict.update [1], which should be plenty fast for my purposes. My data is actually under a kilobyte, so I am quite confused why it would take nearly 1000 ms of CPU. Here's an example of the data (in yaml format) with some personally identifying information stripped out: http://emend.appspot.com/?yaml The actual data being cached is slightly larger, but not by much. [1] http://docs.python.org/library/stdtypes.html#dict.update

On Jun 22, 11:06 am, Tony fatd...@gmail.com wrote: Without knowing more about your app, I can't say for sure, but it seems likely that whatever processing takes place in response.update(object) is using your cpu time, which is why you don't see much of a speedup via caching here. I would suggest profiling the operation to determine which function call(s) are taking the most resources. In my experience, you won't notice a large difference in cpu usage between serializing model instances to memcache vs. adding identifier information (like db keys) for fetching later. My entities are small, however; your mileage may vary. I find that the primary tradeoff in serializing large amounts of info to memcache is increased memory pressure, and thus a lower memcache hit rate and higher datastore access.

On Jun 22, 12:48 pm, John Tantalo john.tant...@gmail.com wrote: I recently attempted to improve the responsiveness of one of my app's more elementary handlers by using memcache to cache the datastore lookups. According to my logs, this has had a positive effect on my api_cpu_ms, reducing this time to 72 ms. However, cpu_ms has not seen a similar decrease, and hovers around 1000 ms. Do memcache gets count towards api_cpu_ms or cpu_ms? Do I need to worry about performance issues around deserializing model instances in memcache?
My caching strategy looks like this:

    response = dict()  # (might not be empty)
    cached = memcache.get(__CACHE_KEY)
    if cached:
        response.update(cached)
        return
    else:
        # datastore calls
        foo = get_foo()
        bar = get_bar()
        # build cache object
        cached = dict(foo=foo, edits=bar)
        response.update(cached)
        # cache
        memcache.set(__CACHE_KEY, cached)
        return
[google-appengine] Re: memcache and api_cpu_ms
Thanks, Tony. I'll try the profiling and post again if I discover anything interesting.

On Jun 22, 11:36 am, Tony fatd...@gmail.com wrote: I see, I didn't realize you were just calling the dict method. In that case, 1000 ms seems unusually high. Still, it seems unlikely that memcache usage is causing it. Your best bet is to profile requests to this handler (http://code.google.com/appengine/kb/commontasks.html#profiling) and see where the cpu time is being spent - you might have some large imports or something elsewhere that's causing a performance drop.

On Jun 22, 2:29 pm, John Tantalo john.tant...@gmail.com wrote: Tony, The update call is the standard dict.update [1], which should be plenty fast for my purposes. My data is actually under a kilobyte, so I am quite confused why it would take nearly 1000 ms of CPU. Here's an example of the data (in yaml format) with some personally identifying information stripped out: http://emend.appspot.com/?yaml The actual data being cached is slightly larger, but not by much. [1] http://docs.python.org/library/stdtypes.html#dict.update

On Jun 22, 11:06 am, Tony fatd...@gmail.com wrote: Without knowing more about your app, I can't say for sure, but it seems likely that whatever processing takes place in response.update(object) is using your cpu time, which is why you don't see much of a speedup via caching here. I would suggest profiling the operation to determine which function call(s) are taking the most resources. In my experience, you won't notice a large difference in cpu usage between serializing model instances to memcache vs. adding identifier information (like db keys) for fetching later. My entities are small, however; your mileage may vary. I find that the primary tradeoff in serializing large amounts of info to memcache is increased memory pressure, and thus a lower memcache hit rate and higher datastore access.
[google-appengine] Re: removing ACSID cookie expiration, ACSID-reset
Disregard the above. Without an 'expires=', this cookie will expire when the browser closes (not desired). I attempted to lengthen the ACSID cookie expiration, but I believe GAE specifically removes my headers before sending them to the client in this case.
[google-appengine] removing ACSID cookie expiration, ACSID-reset
I grew tired of constantly logging into my app, so I figured out a way to remove the 24-hour expiration from the ACSID cookie GAE uses to keep your users logged in. Behold:

    # remove expiration from ACSID cookie, set ACSID-reset=1
    if 'ACSID' in self.request.cookies and 'ACSID-reset' not in self.request.cookies:
        acsid = self.request.cookies['ACSID']
        self.response.headers.add_header('Set-Cookie', 'ACSID=%s' % acsid, path='/')
        self.response.headers.add_header('Set-Cookie', 'ACSID-reset=1', path='/')

Does anyone know an easier way to do this?
[google-appengine] Re: How to reference model property with a variable?
Try http://docs.python.org/library/functions.html#setattr

On Jun 6, 7:56 am, Ethan Post post.et...@gmail.com wrote: I want a method which takes a model object and a dictionary and adds a new record to the model using the dictionary. The problem is I don't know how to refer to model.property using a variable; model(property) does not work.

    class MyTable(db.Model):
        location = ...

    def Foo(p_model, p_dict):
        for i in p_dict:
            # Here is the tricky part; this does not work, but you get the idea.
            p_model(i) = p_dict[i]
        p_model.put()

    def CallFoo():
        m = MyTable()
        d = {'key_name': 'some_key_123', 'location': 'someplace'}
        Foo(m, d)
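[Editor's sketch] The setattr approach the reply suggests, using a plain class as a stand-in for a db.Model subclass (all names hypothetical):

```python
class FakeModel(object):
    """Stand-in for a datastore model; real code would subclass db.Model."""
    location = None

def apply_dict(model, props):
    # setattr(obj, name, value) performs obj.name = value,
    # where the attribute name is given as a string
    for name, value in props.items():
        setattr(model, name, value)
    return model

m = apply_dict(FakeModel(), {'location': 'someplace'})
assert m.location == 'someplace'
```

In the real handler, `apply_dict(model, props)` would be followed by `model.put()` to persist the populated entity.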
[google-appengine] Datastore queries fail after adding searchable text index
My datastore is currently refusing all queries for one of my entities after I added a new __searchable_text_index for the entity. The model is an instance of search.SearchableModel and I have had other searchable text indexes for this entity. As a result, any query except select * fails and returns 0 results, even queries with existing indexes. My index list shows all indexes as serving. Any idea what could cause this? How to fix it? Should I try to kill the new index?
[google-appengine] Re: Urgent help needed: A kind of model can not be accessed!
I had similar problems with my datastore. At one point, after I added a new index, all the indexes for my main model failed. I tried vacuuming the unused indexes in index.yaml. This restored some of the indexes on my main model, but others still failed, and indexes for other models were also failing. Next I tried vacuuming *all* my indexes, waiting until they were deleted, then adding them back. This finally worked, and I have full access to my datastore and indexes.

On Jun 4, 7:11 am, CNBorn cnb...@gmail.com wrote: I have played with the index.yaml file, and at one point I succeeded in accessing this model. But then I tried some new deploys and it stopped working again. I think it might be the indexes' fault, but I cannot figure it out. After several deployments my app now encounters a "Your application is exceeding a quota: Datastore Indices Count" error, so I can't try further index settings...

On Jun 4, 1:46 pm, CNBorn cnb...@gmail.com wrote: Testing with a brand new function which only does simple datastore queries: I found that queries without any conditions, for example get_by_id or the GQL query "SELECT * FROM tarsusaItem LIMIT 20", run smoothly. But once I run a query my site actually executes, like "SELECT * FROM tarsusaItem WHERE public = 'public' AND routine = 'none' AND done = False ORDER BY date DESC LIMIT 9", it doesn't work.

On Jun 4, 12:12 pm, CNBorn cnb...@gmail.com wrote: Hi Nick, thank you for your response. First, a presumption: could this problem be related to index changes? I found such an index change during my first deploy (and later found this kind, 'tarsusaItem', was unavailable): "Created 3 index(es) kinds=tarsusaItem". This is the only thing I can track from my admin log. I have tried appcfg.py vacuum_indexes, but it doesn't help. The problem is that all my queries against this model return nothing, and I didn't change any of them.
The cause may be a modified count function, which contains a while loop to traverse and count this model (named 'tarsusaItem') in case it has more than 1000 records. After I uploaded this code, the weird thing happened. I will post the count function here later.

On Jun 4, 4:15 am, Nick Johnson (Google) nick.john...@google.com wrote: Hi CNBorn, It's not clear from your post exactly what the problem is. Are you saying you have kinds for which entities are visible in the admin console datastore viewer, but do not appear in query results? If that is the case, you need to show us the query code you're using - more likely than not, it's a problem with the query that's causing it to return 0 results. -Nick Johnson

On Wed, Jun 3, 2009 at 10:45 AM, CNBorn cnb...@gmail.com wrote: Hi All, I need some help here: I suddenly found my application cannot find any data in my main model. You can access http://checknerds.appspot.com to check that. In the top right corner, the second number is the count for that model (978 or so), and there should be a brief list at the bottom, but there is not. After I logged in, everything suggests this model is EMPTY; it looks like I am a brand-new user. But I can still see the model with all its data in the Data Viewer. I have tried reverting my code and updating it again; it doesn't work. Before this disaster happened I was just trying some new ways to count this model; there were no write actions against it. Can anybody help me solve this problem? Thanks. Site: http://checknerds.appspot.com
[google-appengine] Re: Datastore queries fail after adding searchable text index
Nick, I've since solved my problem by deleting all my indexes and rebuilding them. My app id is emend and the entity type in question was Edit. I was definitely *unable* to perform simple queries with equality conditions. It seems that whatever took out my application's indexes also fubared the built-in indexes for this type. I can't be certain, but as far as I know the only change I made to the application before this problem occurred was the addition of an index with kind: Edit and a name: __searchable_text_index property. I can verify that I don't have any exploding indexes, because I don't use list properties.

On Jun 4, 10:29 am, Nick Johnson (Google) nick.john...@google.com wrote: Hi John, What is your app ID and entity type? Email me with them if you don't feel comfortable revealing them here. Are you sure about not getting any results for any queries with conditions? Queries for simple equality conditions, at least, should be satisfied by the built-in indexes, and building a new index should not be able to affect any other, existing indexes. -Nick Johnson

On Wed, Jun 3, 2009 at 4:58 PM, John Tantalo john.tant...@gmail.com wrote: My datastore is currently refusing all queries for one of my entities after I added a new __searchable_text_index for the entity. The model is an instance of search.SearchableModel and I have had other searchable text indexes for this entity. As a result, any query except select * fails and returns 0 results, even queries with existing indexes. My index list shows all indexes as serving. Any idea what could cause this? How to fix it? Should I try to kill the new index?
[google-appengine] Re: Can Python process a list/array of checkboxes from a web form?
Try http://code.google.com/appengine/docs/python/tools/webapp/requestclass.html#Request_get_all

On Jun 4, 5:57 pm, NealWalters nealwalt...@nealwalters.com wrote: Is it possible to pass a list or array of checkbox values from a webpage to Python? For example - give all fields the same name like this:

    English: <input type="checkbox" name="language" value="English" /><br/>
    Spanish: <input type="checkbox" name="language" value="Spanish" /><br/>
    Portuguese: <input type="checkbox" name="language" value="Portuguese" /><br/>

Or am I going to have to give every checkbox a different name and write more code? I'd like to do something like this:

    languages = self.request.get('language')
    self.response.out.write("size=" + str(len(languages)) + "<br/>")
    for language in languages:
        self.response.out.write("language=" + language + "<br/>")
    return

and even persist the languages in BigTable as a single list field. The above seems to return the first item with a value, then enumerate the letters of that language:

    size=10
    language=P
    language=o
    language=r
    language=t
    language=u
    language=g
    language=u
    language=e
    language=s
    language=e

What I would like to see is (if these two languages were checked):

    size=2
    language=English
    language=Portuguese

Thanks, Neal Walters
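[Editor's sketch] The linked Request.get_all is the fix: get('language') returns only a single value, while get_all('language') returns every checked value. The same single-vs-multi distinction can be demonstrated with the standard library's query-string parser, no GAE required:

```python
from urllib.parse import parse_qs

# simulated form submission with two boxes named "language" checked
query_string = "language=English&language=Portuguese"
params = parse_qs(query_string)

languages = params["language"]  # every value, like self.request.get_all('language')
assert languages == ["English", "Portuguese"]
assert len(languages) == 2      # size=2, as the poster wanted

first = languages[0]            # a single value, like self.request.get('language')
assert first == "English"
```

Iterating over `languages` then yields whole language names, not the letters of one string.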