After fiddling around a bit, I think I have something of a solution.
Basically, I need to keep a count of how many entities carry each tag.

When doing actual filtering on the query level, I only apply one or
two filters. If the user supplies more than two tags, I filter on the
tags with lowest counts (i.e. the most restrictive). I then apply the
remaining filters in Python, hoping that the query filters removed
enough entities that the filtering is relatively efficient.
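
Here is a rough sketch of the idea in plain Python, in case it helps
anyone. It is not my actual code: the names (split_tags, run_query,
post_filter, tag_counts) are made up for illustration, posts are plain
dicts, and run_query is only an in-memory stand-in for the datastore
query, where each query tag would become one equality filter on the
tags list.

```python
def split_tags(requested, tag_counts, max_query_filters=2):
    """Split requested tags into (query_tags, residual_tags).

    tag_counts maps tag -> number of entities carrying it (assumed to be
    maintained separately). The rarest tags go to the query because they
    are the most restrictive; the rest are checked in Python.
    """
    # Sort by count, then by name so ties are deterministic.
    # A tag with no recorded count is treated as count 0.
    ordered = sorted(set(requested), key=lambda t: (tag_counts.get(t, 0), t))
    return ordered[:max_query_filters], ordered[max_query_filters:]


def run_query(posts, query_tags):
    """In-memory stand-in for the datastore query (hypothetical helper)."""
    wanted = set(query_tags)
    return [p for p in posts if wanted.issubset(p["tags"])]


def post_filter(candidates, residual_tags):
    """Apply the remaining tag filters in Python."""
    needed = set(residual_tags)
    return [p for p in candidates if needed.issubset(p["tags"])]


def find_posts(posts, requested, tag_counts):
    query_tags, residual_tags = split_tags(requested, tag_counts)
    return post_filter(run_query(posts, query_tags), residual_tags)
```

With counts like {'T': 500, 'A': 12, 'G': 40}, the query would filter on
'A' and 'G', and 'T' would be checked in Python afterwards. One caveat:
because some results are dropped after the query, a page of 20 fetched
entities can yield fewer than 20 matches, so paging has to keep fetching
until the page is full.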

Not the most elegant solution, but it works for most use-cases. Anyone
have a different approach?

-- Andrew

On Apr 8, 4:25 pm, Andrew Fong <fongand...@gmail.com> wrote:
> Looking for some advice on how to handle the following situation:
>
> My model for a blog post has a StringListProperty full of tags -- the
> order of magnitude is about 20 tags per entity. I want to allow users
> to be able to filter on various combinations of tags (e.g. return only
> entities with tags 'T', 'A', and 'G').
>
> So far so good. I think I can add as many filters on this one list as
> I like without needing to generate a composite index.
>
> But now I want to implement paging on this (e.g. show the first 20
> entities with tags 'T', 'A', and 'G'; now show the second). In order
> to do this, I need to sort on a separate field (e.g. the __key__
> property). This does require a composite index. In fact, the dev
> server now generates a different composite index depending on the
> number of tag filters I apply.
>
> That is, my index.yaml has entries like this:
>
> - kind: Post
>   properties:
>   - name: tags
>   - name: tags
>   - name: tags
>   - name: __key__
>
> The problem is that this index explodes very quickly. My understanding
> is that there is a cap of 5000 index entries per entity. If the tags
> property has 20 elements in it, then the above index needs
> 20 ** 3 = 8000 entries for a single entity, causing a world of hurt.
>
> I could restrict the number of tags the user filters on or lower the
> max number of tags per post, but these are not ideal solutions for me.
> Does anyone have any ideas?
>
> On a somewhat related note, does anyone know the maximum of equality
> filters (i.e. merge joins) that one can add to a query? I know they
> scale linearly, but I'm wondering if there's some sort of hard limit?
>
> Thanks.
>
> -- Andrew