Exactly. You should usually design your schema to fit your queries, and if you need to retrieve all ancestors, then you should index all ancestors so you can query for them easily.

If that doesn't work for you then either Solr is not the right tool for the job, or you need to rethink your schema.

The description of doing lookups within a tree structure doesn't sound at all like what you would use a text retrieval engine for, so you might want to rethink why you want to use Solr for this. But if that "transitive closure" is something you can calculate at indexing time then the correct solution is the one Upayavira provided.
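To make that concrete, here is a minimal sketch of computing the ancestor list once, at indexing time, so that isA?(M,B) becomes a single lookup instead of a recursive ascent. Everything in it is an assumption for illustration: the `PARENTS` map, the class names, and the `ancestors` field name are made up, and the resulting dict only stands in for whatever document you would actually send to Solr with a multivalued field.

```python
from typing import Dict, List, Set

# Hypothetical taxonomy: each class or instance maps to its direct parents.
PARENTS: Dict[str, List[str]] = {
    "M": ["K"],
    "K": ["D", "E"],
    "D": ["B"],
    "E": ["B"],
    "B": ["Root"],
    "Root": [],
}


def ancestors(node: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every ancestor of `node`, i.e. the transitive closure of its parents."""
    seen: Set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path (or a cycle)
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def build_document(node: str, parents: Dict[str, List[str]]) -> dict:
    """Build an index document whose `ancestors` value is a list (multivalued)."""
    return {"id": node, "ancestors": sorted(ancestors(node, parents))}


def is_a(doc: dict, cls: str) -> bool:
    """isA?(doc, cls) answered from the precomputed ancestor list alone."""
    return cls in doc["ancestors"]


doc = build_document("M", PARENTS)
print(doc)
print(is_a(doc, "B"))
```

With documents indexed like this, the check could be expressed as one request, for example something like `q=id:M&fq=ancestors:B` (again assuming those field names). The price is that changing the taxonomy means reindexing every affected descendant.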

If you want people to be able to help you, you need to actually describe your problem (i.e. what is my data, and what are my queries) instead of diving into technical details like "reducing HTTP roundtrips". My guess is that if you need to "reduce HTTP roundtrips" you're probably doing it wrong.

HTH,
Jens

On 03/28/2013 08:15 AM, Upayavira wrote:
Why don't you index all ancestor classes with the document, as a
multivalued field, then you could get it in one hit. Am I missing
something?

Upayavira

On Thu, Mar 28, 2013, at 01:59 AM, Jack Park wrote:
Hi Otis,
That's essentially the answer I was looking for: each shard (are we
talking master + replicas?) has the plug-in custom query handler.  I
need to build it to find out.

What I mean is that there is a taxonomy, say one with a single root
for sake of illustration, which grows all the classes, subclasses, and
instances. If I have an object that is somewhere in that taxonomy,
then it has a zigzag chain of parents up that tree (I've seen that
called a "transitive closure"). If class B is way up that tree from M,
no telling how many queries it will take to find it.  Hmmm...
recursive ascent, I suppose.

Many thanks
Jack

On Wed, Mar 27, 2013 at 6:52 PM, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
Hi Jack,

I don't fully understand the exact taxonomy structure and your needs,
but in terms of reducing the number of HTTP round trips, you can do it
by writing a custom SearchComponent that, upon getting the initial
request, does everything "locally", meaning that it talks to the
local/specified shard before returning to the caller.  In SolrCloud
setup with N shards, each of these N shards could be queried in such a
way in parallel, running query/queries on their local shards.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Wed, Mar 27, 2013 at 3:11 PM, Jack Park <jackp...@topicquests.org> wrote:
Hi Otis,

I fully expect to grow to SolrCloud -- many shards. For now, it's
solo. But, my thinking relates to cloud. I look for ways to reduce the
number of HTTP round trips through SolrJ. Maybe you have some ideas?

Thanks
Jack

On Wed, Mar 27, 2013 at 10:04 AM, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
Hi Jack,

Is this really about HTTP and Solr vs. SolrCloud or more whether
Solr(Cloud) is the right tool for the job and if so how to structure
the schema and queries to make such lookups efficient?

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Wed, Mar 27, 2013 at 12:53 PM, Jack Park <jackp...@topicquests.org> wrote:
This is a question about "isA?"

We want to know if M isA B   isA?(M,B)

For some M, one might be able to look into M to see its type or the
class(es) of which it is a subClass. We're talking taxonomic queries
now.
But, for some M, one might need to ripple up the "transitive closure",
looking at all the super classes, etc, recursively.

It seems unreasonable to do that over HTTP; it seems more reasonable
to grab a core and write a custom isA query handler. But, how do you
do that in a SolrCloud?

Really curious...

Many thanks in advance for ideas.
Jack


