Hi Karel,

Keep in mind that each time you modify a list and put it in memcache, the whole list is serialized, which is why you see that it is expensive. There is an efficient approach to merging queries that does not need memcache, which Brett Slatkin called the "zig zag" method:

http://www.scribd.com/doc/16952419/Building-scalable-complex-apps-on-App-Engine

It does not require the results to be in memory either, so it will work for large datasets. You just need to make sure all the queries are sorted by the same property, e.g. __key__.
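To give a feel for the idea, here is a minimal in-memory sketch. It is not the datastore's actual implementation: the class name ZigZagMerge and the use of plain sorted lists of Long keys are my own assumptions, and stepping a cursor forward stands in for what would be an index seek on a real query.

```java
import java.util.ArrayList;
import java.util.List;

final class ZigZagMerge {

    /**
     * Intersects several key lists, each sorted ascending. Only the matches
     * are collected; the inputs are read strictly forward, one cursor per list.
     */
    static List<Long> intersect(List<List<Long>> sortedLists) {
        List<Long> matches = new ArrayList<Long>();
        int n = sortedLists.size();
        if (n == 0) {
            return matches;
        }
        int[] pos = new int[n]; // one cursor per list
        while (true) {
            // The largest key under any cursor is the next possible match.
            long target = Long.MIN_VALUE;
            for (int i = 0; i < n; i++) {
                List<Long> list = sortedLists.get(i);
                if (pos[i] >= list.size()) {
                    return matches; // one list is exhausted, nothing more can match
                }
                long key = list.get(pos[i]);
                target = Math.max(target, key);
            }
            // Move every cursor forward to the first key >= target.
            // (Linear stepping here; a real index scan would seek instead.)
            boolean allEqual = true;
            for (int i = 0; i < n; i++) {
                List<Long> list = sortedLists.get(i);
                while (pos[i] < list.size() && list.get(pos[i]) < target) {
                    pos[i]++;
                }
                if (pos[i] >= list.size()) {
                    return matches;
                }
                long key = list.get(pos[i]);
                if (key != target) {
                    allEqual = false; // this list jumped past target, try again
                }
            }
            if (allEqual) {
                matches.add(target);
                for (int i = 0; i < n; i++) {
                    pos[i]++;
                }
            }
        }
    }
}
```

Each list here plays the role of one equality query sorted by __key__. The cursors only ever move forward, which is why the same sort order on every query matters.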

JD

On 2 Mar 2010, at 23:17, Karel Alvarez wrote:

Hi
Some time ago I asked how to use multiple contains in a query and got some responses. That was great, and I thank everybody for their help.

I am posting my findings and progress in the hope that they might be useful for somebody trying to do the same. I am trying to build a database of real estate listings in my area and build some searches on it. The search is likely to have many fields, and for several of the fields the user can select multiple values. I chose to handle the whole entity relationship myself instead of using the fancy features of GAE. I had my reasons for it, and frustration with GAE is part of them. Regardless, the approach to searching might still apply if you choose to use relationships from GAE.

I read somewhere in the docs that when you use contains in a query, it internally executes an equality sub-query for each of the values in the list (somebody care to confirm that?), so if you have several fields with contains you might bump into the 30 sub-query constraint pretty fast.
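As a rough illustration of how quickly that limit is reached, here is a small arithmetic sketch. It assumes that a query with several contains filters is expanded into one sub-query per combination of values, which is how I understand the documentation; the class and method names are made up for the example.

```java
final class SubQueryCount {

    // The 30 sub-query limit mentioned above.
    static final int MAX_SUB_QUERIES = 30;

    /** Number of sub-queries, assuming one per combination of selected values. */
    static long combinations(int... valuesPerContainsFilter) {
        long total = 1;
        for (int n : valuesPerContainsFilter) {
            total *= n;
        }
        return total;
    }

    static boolean exceedsLimit(int... valuesPerContainsFilter) {
        return combinations(valuesPerContainsFilter) > MAX_SUB_QUERIES;
    }
}
```

Under that assumption, 3 statuses, 4 house types and 5 zip codes would already need 3 * 4 * 5 = 60 sub-queries, twice the limit.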
So I chose to:
- execute the search for each one of the fields and each one of the selected values sequentially, getting only the ids. Each of these should hit only one index and be fast.
- add the ids from each result to a memcache instance, using increment, and collect the ids in a list for later (there is no way to get all the keys in the cache that I found).
- collect the counts for each id in the list I got and check each count. If the count is equal to the number of queries, it means that the entity returned true for each of the queries and it is an entity that I want to return. I collect all the ids that are good results and go to the datastore to collect the full entities to return.
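The counting step can be sketched on its own, outside App Engine. This version swaps the memcache counters for an in-process HashMap, so it is not the method shown further down, only the same idea in miniature. The class name CountingIntersection and the input shape (one set of ids per search field, already merged across that field's selected values) are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CountingIntersection {

    /**
     * Returns the ids that appear in every condition's result set.
     * idsPerCondition holds one set per search field; using a set per field
     * means an id is counted at most once for that field.
     */
    static List<String> matchAll(List<Set<String>> idsPerCondition) {
        Map<String, Integer> hits = new HashMap<String, Integer>();
        for (Set<String> ids : idsPerCondition) {
            for (String id : ids) {
                Integer seen = hits.get(id);
                hits.put(id, seen == null ? 1 : seen + 1);
            }
        }
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : hits.entrySet()) {
            // An id is a hit only if every condition returned it.
            if (e.getValue() == idsPerCondition.size()) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result); // HashMap order is unspecified, so sort for stable output
        return result;
    }
}
```

The whole map lives in memory for the duration of one search, so it has the same scaling concern as the memcache variant for very large result sets.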

This process is expensive, and I have still got to try it out with a big set, but it executes sufficiently fast for my test set. Of course, I cache the result until the user changes the search criteria (or it expires).

Here is the code for the search method:

private List<IndexEntry> buildResultsFor(SearchCriteria sc) {
                List<IndexEntry> result = new ArrayList<IndexEntry>();
                // Price parsing
                float minPrice = -1;
                float maxPrice = -1;
                if (sc.getMinPrice().length() > 0) {
                        minPrice = Float.parseFloat(sc.getMinPrice());
                }
                if (sc.getMaxPrice().length() > 0) {
                        maxPrice = Float.parseFloat(sc.getMaxPrice());
                }
                // Listing Status parsing
                Long[] statusIds = null;
                String[] statusNames = sc.getStatus();
                if (statusNames != null && statusNames.length > 0) {
                        statusIds = getListingStatusIdsByNames(statusNames);
                }
                // House Types parsing
                Long[] houseTypesIds = null;
                String[] houseTypeNames = sc.getHouseType();
                if (houseTypeNames != null && houseTypeNames.length > 0) {
                        houseTypesIds = getHouseTypeIdsByNames(houseTypeNames);
                }
                // THE search
                MemcacheService cache = MemcacheServiceFactory.getMemcacheService();
                Set<String> allIds = new HashSet<String>();
                int condCount = 0;
                Map<Object, Long> lastResults = null;
                Long one = Long.valueOf(1);
                // by price
                if (minPrice > 0 || maxPrice > 0) {
                        condCount++;
                        List<String> ids = indexService.getByPriceRange(minPrice, maxPrice);
                        allIds.addAll(ids);
                        lastResults = addToChache(cache, one, ids);
                }
                // by status
                if (statusIds != null) {
                        condCount++;
                        for (int i = 0; i < statusIds.length; i++) {
                                List<String> listingByStatus = indexService.getByListingStatus(statusIds[i]);
                                allIds.addAll(listingByStatus);
                                lastResults = addToChache(cache, one, listingByStatus);
                        }
                }
                // by house type
                if (houseTypesIds != null) {
                        condCount++;
                        for (int i = 0; i < houseTypesIds.length; i++) {
                                List<String> listingByHT = indexService.getByHouseType(houseTypesIds[i]);
                                allIds.addAll(listingByHT);
                                lastResults = addToChache(cache, one, listingByHT);
                        }
                }
                // by Zip Code
                String[] zipCodes = parseZipCode(sc.getZipCode());
                if (zipCodes != null && zipCodes.length > 0) {
                        condCount++;
                        for (int i = 0; i < zipCodes.length; i++) {
                                List<String> listingByZ = indexService.getByZipCode(zipCodes[i]);
                                allIds.addAll(listingByZ);
                                lastResults = addToChache(cache, one, listingByZ);
                        }
                }
                if (lastResults != null) {
                        Map<Object, Object> counters = cache.getAll(Arrays.asList(allIds.toArray()));
                        List<String> ids = new ArrayList<String>();
                        for (Object listingNumber : counters.keySet()) {
                                String sCount = (String) counters.get(listingNumber);
                                long count = Long.parseLong(sCount);
                                if (count > condCount) {
                                        ids.add(listingNumber.toString());
                                        if (ids.size()>500){
                                                break;
                                        }
                                }
                        }
                        cache.clearAll();
                        if (ids.size() > 0) {
                                result = indexService.getEntriesOn(ids);
                        }
                }
                return result;
        }



hope it helps somebody
thanks
Karel

--
You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group. To post to this group, send email to google-appengine-java@googlegroups.com . To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com . For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en .
