to query with n filters on one relation index entity
*should be*
to query with n filters on User with many UserSkill entity children

On Mar 16, 2:05 pm, Max <thebb...@gmail.com> wrote:
> Hi John,
>
> Thanks for your reply. To be honest I am quite disappointed that GAE
> can not perform such query directly.
>
> According do your suggestion, to query with n filters on one relation
> index entity, we need to:
> 1, perform n separate queries to obtain n cursors of UserSkill
> entities by some sorting order
> 2, use in-memory zig-zag method to obtain User keys match all query
> 3, use a batch get to obtain all User entities
>
> Please correct me if my understanding is wrong.
>
> Best regards,
> Max
>
> On Mar 16, 10:53 am, John Patterson <jdpatter...@gmail.com> wrote:
>
> > Hi Max, in that case definitely the model suggested by Jeff would be  
> > perfect.  So something like:
>
> > class User
> > {
> >         String name;
>
> > }
>
> > class UserSkill
> > {
> >         String skill;
> >         int ability;
>
> > }
>
> > When you store them make sure the User is the parent of the UserSkill  
> > so they can be updated in a single transaction.
>
> > datastore.store(aUser);
> > datastore.store(aSkill, aUser);  // set the parent relationship
>
> > Twig has support for this "Relation Entity" pattern like so:
>
> > // return all high skilled java developers
> > datastore.find().type(UserSkill.class)
> >         .addFilter("skill", EQUAL, "java")
> >         .addFilter("ability", GREATER_THAN, 4)
> >         .returnParentsNow();
>
> > To find Users with Java > 4 and C++ > 2 you still need to do two  
> > queries.  To avoid having to load all the results into memory make  
> > sure you sort by ability and key then "stream" the results using the  
> > "zig-zag" method.  This will soon be built into Twig.  Also do  
> > parallel queries to speed up the response.
>
> > datastore.find().type(UserSkill.class)
> >         .addSort("ability")
> >         .addSort(Entity.KEY_RESERVED_PROPERTY)
> >         .addFilter("skill", EQUAL, "java")
> >         .addFilter("ability", GREATER_THAN, 4)
> >         .returnParentsLater();
>
> > If you are expecting a lot of results you should tune this query by  
> > setting fetchResultsBy(100)  or something similar to reduce the number  
> > of trips to the datastore.  The default is 20 results returned at a  
> > time.
>
> > Hope this helps!
>
> > John
>
> > On 16 Mar 2010, at 01:06, Max wrote:
>
> > > We have already done some testings on RDBMS and the performance is not
> > > acceptable to us. (for the second query, that means self join a table
> > > with 10 million records for n times). That's why we try GAE now.
>
> > > Thank you.
>
> > > On Mar 16, 1:54 am, Max <thebb...@gmail.com> wrote:
> > >> Hi John,
>
> > >> I am designing a quite similar data model of cited *User-Skill*
> > >> problem, but not exactly the same.
>
> > >> People may not be familiar with our domain. Basically it will be a
> > >> track record system. User can perform different tasks and same task
> > >> can be done by many users. According to historical data, more than
> > >> 5000 different users will finish the same task, and it is possible  
> > >> for
> > >> some users to finish more than 5000 tasks within the data archive
> > >> period. Additionally, following queries will be performed frequently:
> > >> 1, given 2 user, A and B, find out common tasks they have done  (less
> > >> important)
> > >> 2, given several tasks, find out users who have done all these tasks
> > >> (more important)
>
> > >> Translate our problem into user-skill scenario, in this case, one  
> > >> user
> > >> can have more than 5000 skills and there could be more than 5000  
> > >> users
> > >> having the same skill.
> > >> 1, given 2 user, A and B, how to find out their mutual skills
> > >> 2, given n skills, how to find full list of users having all n skills
>
> > >> Best regards,
> > >> Max
>
> > >> On Mar 15, 5:59 pm, John Patterson <jdpatter...@gmail.com> wrote:
>
> > >>> I was meaning just put the UserSkills of the two people into the  
> > >>> set.
> > >>> Each person only has a small number of skills yeah?
>
> > >>> Perhaps I mis understood your last requirement "similarity between
> > >>> user A and User B"
>
> > >>> On 15 Mar 2010, at 14:23, Max wrote:
>
> > >>>> Hi John,
>
> > >>>> Thanks for your reply. I need some time to study and test your  
> > >>>> codes.
>
> > >>>> For the last point, Sets.intersection() means we need to load all  
> > >>>> keys
> > >>>> into memory and perform an in memory Sets.intersection(). Is that
> > >>>> possible to do this by a query directly. In other words, is that
> > >>>> possible to use more than one equality filter on a list property  
> > >>>> of a
> > >>>> relation entity index for their parents?
>
> > >>>> Best regards,
> > >>>> Max
>
> > >>>> On Mar 15, 2:45 pm, John Patterson <jdpatter...@gmail.com> wrote:
> > >>>>> Hi Max,
>
> > >>>>> Regarding your original question, a more efficient solution  
> > >>>>> would be
> > >>>>> to embed the UesrSkill in the User instance which would allow  
> > >>>>> you to
> > >>>>> find all results in a single query.  Th problem is that embedded
> > >>>>> instances can only be queried on a single value.  There would be  
> > >>>>> no
> > >>>>> way to query skill and ability on the same UserSkill - just "java
> > >>>>> and c
> > >>>>> ++ with any skill over 3 and any skill over 5"
>
> > >>>>> To solve this you could create a combined property in UserSkill  
> > >>>>> for
> > >>>>> querying "skillAbility" which would hold values such as "java:
> > >>>>> 5", "c
> > >>>>> ++:
> > >>>>> 4".  This will only work with skill from 0-9 because it depends on
> > >>>>> lexical ordering (or e.g. 000 - 999)
>
> > >>>>> Both Twig and Objectify but not JDO support embedded collections  
> > >>>>> of
> > >>>>> instances.
>
> > >>>>> In Twig it would be defined like this
>
> > >>>>> class User
> > >>>>> {
> > >>>>>         @Embed List<UserSkill> skills;
>
> > >>>>> }
>
> > >>>>> class UserSkill
> > >>>>> {
> > >>>>>         String skillAbility;
> > >>>>>         Skill skill;    // direct reference to Skill instance
> > >>>>>         int ability;
>
> > >>>>> }
>
> > >>>>> Disclaimer: I have not tried any of this code - it is just off the
> > >>>>> top
> > >>>>> of my head
>
> > >>>>> You would then do a single range query to find "java-5", "java-6,
> > >>>>> "java-7"...
>
> > >>>>> // find java developers with ability over 5 in a single query
> > >>>>> datastore.find().type(User.class)
> > >>>>>         .addFilter("skillAbility", GREATER_THAN_EQUAL, "java:
> > >>>>> 5")  // range
> > >>>>> start
> > >>>>>         .addFilter("skillAbility", LESS_THAN, "java-" +
> > >>>>> Character.MAX_VALUE)  // range end
> > >>>>>         .returnResultsNow();
>
> > >>>>> But that doesn't fully answer your question which includes an  
> > >>>>> AND on
> > >>>>> multiple property values which is not supported by the  
> > >>>>> datastore.  To
> > >>>>> do this you will need to perform two queries and merge the  
> > >>>>> results.
>
> > >>>>> Twig has support for merging only OR queries right now so you  
> > >>>>> can do:
>
> > >>>>> // find users with c++ ability > 2 OR java ability > 5
>
> > >>>>> RootFindCommand  or = datastore.find().type(User.class);  //  
> > >>>>> default
> > >>>>> (only) operator is OR
>
> > >>>>> or.addChildCommand()
> > >>>>>         .addFilter("skillAbility", GREATER_THAN_EQUAL, "java:
> > >>>>> 5")  // range
> > >>>>> start
> > >>>>>         .addFilter("skillAbility", LESS_THAN, "java-" +
> > >>>>> Character.MAX_VALUE);  // range end
>
> > >>>>> or.addChildCommand()
> > >>>>>         .addFilter("skillAbility", GREATER_THAN_EQUAL, "java:
> > >>>>> 5")  // range
> > >>>>> start
> > >>>>>         .addFilter("skillAbility", LESS_THAN, "java-" +
> > >>>>> Character.MAX_VALUE);  // end
>
> > >>>>> // merges results from both queries into a single iterator
> > >>>>> Iterator<User> results = or.returnResultsNow();
>
> > >>>>> Supporting AND merges is coming!  Add a feature request if you  
> > >>>>> like.
> > >>>>> But for now you will have to do two separate queries as in the  
> > >>>>> first
> > >>>>> example and join the results in your own code.  You should make  
> > >>>>> sure
> > >>>>> both queries are sorted by key then you can "stream" the results
> > >>>>> without loading them all into memory at once.
>
> > >>>>> // find java developers with ability over 5
> > >>>>> datastore.find().type(User.class)
> > >>>>>         .addSort("skillAbility")      // first sort required to be
> > >>>>> inequality filter
> > >>>>>         .addSort(Entity.KEY_RESERVED_PROPERTY)  // ensure results
> > >>>>> in same order
> > >>>>>         .addFilter("skillAbility", GREATER_THAN_EQUAL, "java:
> > >>>>> 5")  // range
> > >>>>> start
> > >>>>>         .addFilter("skillAbility", LESS_THAN, "java-" +
> > >>>>> Character.MAX_VALUE)  // range end
> > >>>>>         .returnResultsNow();
>
> > >>>>> // find c++ developers with ability over 2
> > >>>>> datastore.find().type(User.class)
> > >>>>>         .addSort("skillAbility")
> > >>>>>         .addSort(Entity.KEY_RESERVED_PROPERTY)
> > >>>>>         .addFilter("skillAbility", GREATER_THAN_EQUAL, "c++:
> > >>>>> 2")  //
> > >>>>> range start
> > >>>>>         .addFilter("skillAbility", LESS_THAN, "c++:-" +
> > >>>>> Character.MAX_VALUE)  // range end
> > >>>>>         .returnResultsNow();
>
> > >>>>> // now iterate through both results and only include those in both
> > >>>>> iterators
>
> > >>>>> Again, I have not run this code so I might have made a mistake.  
> > >>>>> Let
> > >>>>> me know how you get on!  I'll be adding support for these merged  
> > >>>>> AND
> > >>>>> queries on multiple property values soon - unless someone else  
> > >>>>> wants
> > >>>>> to contribute it first ;)
>
> > >>>>> To find the similarity between two users is now simple now that  
> > >>>>> they
> > >>>>> are just a property of the User?  just do a Sets.intersection() of
> > >>>>> the
> > >>>>> skills.
>
> > >>>>> John
>
> > >>>>> On 15 Mar 2010, at 12:07, Max wrote:
>
> > >>>>>> Thanks John,
>
> > >>>>>> Bret Slatkins' talk is impressive. Let's say we have m skills  
> > >>>>>> with n
> > >>>>>> levels. (i.e., m x n SkillLevel entities). Each SkillLevel entity
> > >>>>>> consists of at least one SkillLevelIndex.
>
> > >>>>>> We define similarity between user A and User B as number of
>
> ...
>
> read more »

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine for Java" group.
To post to this group, send email to google-appengine-j...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine-java+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine-java?hl=en.

Reply via email to