How to retrieve parent documents without a nested structure (block-join)

Shamik Bandopadhyay Thu, 22 Sep 2016 10:17:06 -0700

Hi,

  I have a set of indexed documents with a pseudo parent-child
relationship. Each child document references its parent document
through an ID. Because the documents do not reach the crawler in order,
I'm not able to index them in a nested structure to support
block-join. Here's an example of the dataset currently in the index.


<doc>
  <field name="id">1</field>
  <field name="title">Parent title</field>
  <field name="doc_id">123</field>
</doc>
<doc>
  <field name="id">2</field>
  <field name="title">Child title1</field>
  <field name="parent_doc_id">123</field>
</doc>
<doc>
  <field name="id">3</field>
  <field name="title">Child title2</field>
  <field name="parent_doc_id">123</field>
</doc>
<doc>
  <field name="id">4</field>
  <field name="title">Misc title2</field>
</doc>

My requirement is that a search on "title2" should return the following:
the parent document (id=1), because its child (id=3) matches, and the
unrelated document (id=4), which matches directly.

<doc>
  <field name="id">1</field>
  <field name="title">Parent title</field>
  <field name="doc_id">123</field>
</doc>
<doc>
  <field name="id">4</field>
  <field name="title">Misc title2</field>
</doc>
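
To make the expected behaviour concrete, here is a minimal Python sketch of the roll-up I'm after, run against an in-memory copy of the four sample documents. It is not Solr code and says nothing about how Solr would execute this; the function name search_with_parent_rollup and the whitespace tokenization are assumptions made purely for illustration, as is the choice to return an orphaned child as-is when its parent is missing.

```python
# In-memory copy of the sample documents from the index (illustration only).
DOCS = [
    {"id": "1", "title": "Parent title", "doc_id": "123"},
    {"id": "2", "title": "Child title1", "parent_doc_id": "123"},
    {"id": "3", "title": "Child title2", "parent_doc_id": "123"},
    {"id": "4", "title": "Misc title2"},
]


def search_with_parent_rollup(docs, term):
    """Match `term` against title tokens; replace child hits with their parent.

    Hypothetical helper: a child is any document carrying `parent_doc_id`,
    and its parent is the document whose `doc_id` has the same value.
    """
    # Look-up table from doc_id to the parent document that owns it.
    parents = {d["doc_id"]: d for d in docs if "doc_id" in d}

    results = []
    seen_ids = set()
    for doc in docs:
        # Naive whitespace tokenization, standing in for a real analyzer.
        tokens = doc["title"].lower().split()
        if term.lower() not in tokens:
            continue

        parent_ref = doc.get("parent_doc_id")
        # Child hit: return the parent instead. Assumption: if the parent
        # has not been indexed yet, fall back to the child itself.
        hit = parents.get(parent_ref, doc) if parent_ref else doc

        # Several children may point at one parent; return it only once.
        if hit["id"] not in seen_ids:
            seen_ids.add(hit["id"])
            results.append(hit)
    return results


if __name__ == "__main__":
    print([d["id"] for d in search_with_parent_rollup(DOCS, "title2")])
    # prints ['1', '4']
```

Document id=3 matches "title2" but is replaced by its parent (id=1), while id=4 has no parent reference and is returned unchanged, which is exactly the result set shown above.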

This is along the lines of the Block Join Parent Query Parser, where I
could have fired a query like:
q={!parent which="content_type:parentDocument"}title:title2

I'm not sure whether the Graph Query Parser is a relevant solution in this
regard. The problem I see there is that I'm running on 5.5 with 2 shards
and n replicas, while the graph query parser seems to be designed for a
single node/single shard.

This is a tad urgent for me as I'm trying to come up with an approach to
handle this. Any pointers will be highly appreciated.

Thanks,
Shamik
