to query with n filters on one relation index entity *should be* to query with n filters on User with many UserSkill entity children
On Mar 16, 2:05 pm, Max <thebb...@gmail.com> wrote: > Hi John, > > Thanks for your reply. To be honest I am quite disappointed that GAE > can not perform such query directly. > > According do your suggestion, to query with n filters on one relation > index entity, we need to: > 1, perform n separate queries to obtain n cursors of UserSkill > entities by some sorting order > 2, use in-memory zig-zag method to obtain User keys match all query > 3, use a batch get to obtain all User entities > > Please correct me if my understanding is wrong. > > Best regards, > Max > > On Mar 16, 10:53 am, John Patterson <jdpatter...@gmail.com> wrote: > > > Hi Max, in that case definitely the model suggested by Jeff would be > > perfect. So something like: > > > class User > > { > > String name; > > > } > > > class UserSkill > > { > > String skill; > > int ability; > > > } > > > When you store them make sure the User is the parent of the UserSkill > > so they can be updated in a single transaction. > > > datastore.store(aUser); > > datastore.store(aSkill, aUser); // set the parent relationship > > > Twig has support for this "Relation Entity" pattern like so: > > > // return all high skilled java developers > > datastore.find().type(UserSkill.class) > > .addFilter("skill", EQUAL, "java") > > .addFilter("ability", GREATER_THAN, 4) > > .returnParentsNow(); > > > To find Users with Java > 4 and C++ > 2 you still need to do two > > queries. To avoid having to load all the results into memory make > > sure you sort by ability and key then "stream" the results using the > > "zig-zag" method. This will soon be built into Twig. Also do > > parallel queries to speed up the response. > > > datastore.find().type(UserSkill.class) > > .addSort("ability") > > .addSort(Entity.KEY_RESERVED_PROPERTY) > > .addFilter("skill", EQUAL, "java") > > .addFilter("ability", GREATER_THAN, 4) > > .returnParentsLater(); > > > If you are expecting a lot of results you should tune this query by > > setting fetchResultsBy(100) or something similar to reduce the number > > of trips to the datastore. The default is 20 results returned at a > > time. > > > Hope this helps! > > > John > > > On 16 Mar 2010, at 01:06, Max wrote: > > > > We have already done some testings on RDBMS and the performance is not > > > acceptable to us. (for the second query, that means self join a table > > > with 10 million records for n times). That's why we try GAE now. > > > > Thank you. > > > > On Mar 16, 1:54 am, Max <thebb...@gmail.com> wrote: > > >> Hi John, > > > >> I am designing a quite similar data model of cited *User-Skill* > > >> problem, but not exactly the same. > > > >> People may not be familiar with our domain. Basically it will be a > > >> track record system. User can perform different tasks and same task > > >> can be done by many users. According to historical data, more than > > >> 5000 different users will finish the same task, and it is possible > > >> for > > >> some users to finish more than 5000 tasks within the data archive > > >> period. Additionally, following queries will be performed frequently: > > >> 1, given 2 user, A and B, find out common tasks they have done (less > > >> important) > > >> 2, given several tasks, find out users who have done all these tasks > > >> (more important) > > > >> Translate our problem into user-skill scenario, in this case, one > > >> user > > >> can have more than 5000 skills and there could be more than 5000 > > >> users > > >> having the same skill. > > >> 1, given 2 user, A and B, how to find out their mutual skills > > >> 2, given n skills, how to find full list of users having all n skills > > > >> Best regards, > > >> Max > > > >> On Mar 15, 5:59 pm, John Patterson <jdpatter...@gmail.com> wrote: > > > >>> I was meaning just put the UserSkills of the two people into the > > >>> set. > > >>> Each person only has a small number of skills yeah? > > > >>> Perhaps I mis understood your last requirement "similarity between > > >>> user A and User B" > > > >>> On 15 Mar 2010, at 14:23, Max wrote: > > > >>>> Hi John, > > > >>>> Thanks for your reply. I need some time to study and test your > > >>>> codes. > > > >>>> For the last point, Sets.intersection() means we need to load all > > >>>> keys > > >>>> into memory and perform an in memory Sets.intersection(). Is that > > >>>> possible to do this by a query directly. In other words, is that > > >>>> possible to use more than one equality filter on a list property > > >>>> of a > > >>>> relation entity index for their parents? > > > >>>> Best regards, > > >>>> Max > > > >>>> On Mar 15, 2:45 pm, John Patterson <jdpatter...@gmail.com> wrote: > > >>>>> Hi Max, > > > >>>>> Regarding your original question, a more efficient solution > > >>>>> would be > > >>>>> to embed the UesrSkill in the User instance which would allow > > >>>>> you to > > >>>>> find all results in a single query. Th problem is that embedded > > >>>>> instances can only be queried on a single value. There would be > > >>>>> no > > >>>>> way to query skill and ability on the same UserSkill - just "java > > >>>>> and c > > >>>>> ++ with any skill over 3 and any skill over 5" > > > >>>>> To solve this you could create a combined property in UserSkill > > >>>>> for > > >>>>> querying "skillAbility" which would hold values such as "java: > > >>>>> 5", "c > > >>>>> ++: > > >>>>> 4". This will only work with skill from 0-9 because it depends on > > >>>>> lexical ordering (or e.g. 000 - 999) > > > >>>>> Both Twig and Objectify but not JDO support embedded collections > > >>>>> of > > >>>>> instances. > > > >>>>> In Twig it would be defined like this > > > >>>>> class User > > >>>>> { > > >>>>> @Embed List<UserSkill> skills; > > > >>>>> } > > > >>>>> class UserSkill > > >>>>> { > > >>>>> String skillAbility; > > >>>>> Skill skill; // direct reference to Skill instance > > >>>>> int ability; > > > >>>>> } > > > >>>>> Disclaimer: I have not tried any of this code - it is just off the > > >>>>> top > > >>>>> of my head > > > >>>>> You would then do a single range query to find "java-5", "java-6, > > >>>>> "java-7"... > > > >>>>> // find java developers with ability over 5 in a single query > > >>>>> datastore.find().type(User.class) > > >>>>> .addFilter("skillAbility", GREATER_THAN_EQUAL, "java: > > >>>>> 5") // range > > >>>>> start > > >>>>> .addFilter("skillAbility", LESS_THAN, "java-" + > > >>>>> Character.MAX_VALUE) // range end > > >>>>> .returnResultsNow(); > > > >>>>> But that doesn't fully answer your question which includes an > > >>>>> AND on > > >>>>> multiple property values which is not supported by the > > >>>>> datastore. To > > >>>>> do this you will need to perform two queries and merge the > > >>>>> results. > > > >>>>> Twig has support for merging only OR queries right now so you > > >>>>> can do: > > > >>>>> // find users with c++ ability > 2 OR java ability > 5 > > > >>>>> RootFindCommand or = datastore.find().type(User.class); // > > >>>>> default > > >>>>> (only) operator is OR > > > >>>>> or.addChildCommand() > > >>>>> .addFilter("skillAbility", GREATER_THAN_EQUAL, "java: > > >>>>> 5") // range > > >>>>> start > > >>>>> .addFilter("skillAbility", LESS_THAN, "java-" + > > >>>>> Character.MAX_VALUE); // range end > > > >>>>> or.addChildCommand() > > >>>>> .addFilter("skillAbility", GREATER_THAN_EQUAL, "java: > > >>>>> 5") // range > > >>>>> start > > >>>>> .addFilter("skillAbility", LESS_THAN, "java-" + > > >>>>> Character.MAX_VALUE); // end > > > >>>>> // merges results from both queries into a single iterator > > >>>>> Iterator<User> results = or.returnResultsNow(); > > > >>>>> Supporting AND merges is coming! Add a feature request if you > > >>>>> like. > > >>>>> But for now you will have to do two separate queries as in the > > >>>>> first > > >>>>> example and join the results in your own code. You should make > > >>>>> sure > > >>>>> both queries are sorted by key then you can "stream" the results > > >>>>> without loading them all into memory at once. > > > >>>>> // find java developers with ability over 5 > > >>>>> datastore.find().type(User.class) > > >>>>> .addSort("skillAbility") // first sort required to be > > >>>>> inequality filter > > >>>>> .addSort(Entity.KEY_RESERVED_PROPERTY) // ensure results > > >>>>> in same order > > >>>>> .addFilter("skillAbility", GREATER_THAN_EQUAL, "java: > > >>>>> 5") // range > > >>>>> start > > >>>>> .addFilter("skillAbility", LESS_THAN, "java-" + > > >>>>> Character.MAX_VALUE) // range end > > >>>>> .returnResultsNow(); > > > >>>>> // find c++ developers with ability over 2 > > >>>>> datastore.find().type(User.class) > > >>>>> .addSort("skillAbility") > > >>>>> .addSort(Entity.KEY_RESERVED_PROPERTY) > > >>>>> .addFilter("skillAbility", GREATER_THAN_EQUAL, "c++: > > >>>>> 2") // > > >>>>> range start > > >>>>> .addFilter("skillAbility", LESS_THAN, "c++:-" + > > >>>>> Character.MAX_VALUE) // range end > > >>>>> .returnResultsNow(); > > > >>>>> // now iterate through both results and only include those in both > > >>>>> iterators > > > >>>>> Again, I have not run this code so I might have made a mistake. > > >>>>> Let > > >>>>> me know how you get on! I'll be adding support for these merged > > >>>>> AND > > >>>>> queries on multiple property values soon - unless someone else > > >>>>> wants > > >>>>> to contribute it first ;) > > > >>>>> To find the similarity between two users is now simple now that > > >>>>> they > > >>>>> are just a property of the User? just do a Sets.intersection() of > > >>>>> the > > >>>>> skills. > > > >>>>> John > > > >>>>> On 15 Mar 2010, at 12:07, Max wrote: > > > >>>>>> Thanks John, > > > >>>>>> Bret Slatkins' talk is impressive. Let's say we have m skills > > >>>>>> with n > > >>>>>> levels. (i.e., m x n SkillLevel entities). Each SkillLevel entity > > >>>>>> consists of at least one SkillLevelIndex. > > > >>>>>> We define similarity between user A and User B as number of > > ... > > read more » -- You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group. To post to this group, send email to google-appengine-j...@googlegroups.com. To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.