Sorry, my fault.

I overlooked this part of your message: "How do I get the file name
included in each snippet fragment - this again needs exploring on my end".
No, the solution I proposed doesn't address that. :(

Edward

On Mon, Feb 17, 2020, 14:03 Srijan <shree...@gmail.com> wrote:

> You know what, I think I missed a major description in my earlier email. I
> want to be able to return additional data from stored fields alongside the
> snippets during highlighting. In this case, the filename where this snippet
> came from. Not sure your approach would address that.
>
> On Mon, Feb 17, 2020, 10:44 Edward Ribeiro <edward.ribe...@gmail.com>
> wrote:
>
> > Hi,
> >
> > You may try to create two kinds of docs forming a parent-child
> > relationship without nesting, like:
> >
> > <doc>
> > <id>894</id>
> > <type>parent</type>
> >
> > ...
> > </doc>
> >
> > <doc>
> > <id>3213</id>
> > <type>child</type>
> > <parent_id>894</parent_id>
> > <metadata field 1>xxx
> > <file_content_en_US> portion of file 1
> > <file_content_en_US> remaining portion of file 1
> > ...
> > </doc>
> >
> > Then you can add metadata for each child doc. The search can be done on
> > child docs but if you need to group you can use the join query parser (it
> > has some limitations though) or grouping by parent_id.
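A minimal sketch of the two query shapes mentioned above, assuming the
field names (type, parent_id, file_content_en_US) from the example docs and
a hypothetical search term "budget". It only builds the request parameters
and has not been run against a real index:

```python
from urllib.parse import urlencode

TERM = "budget"  # hypothetical search term, for illustration only

# Option A: search the child docs and group the hits by their parent.
# Assumes parent_id is a single-valued field, which grouping requires.
group_params = {
    "q": f"file_content_en_US:{TERM}",
    "fq": "type:child",
    "group": "true",
    "group.field": "parent_id",
}

# Option B: use the join query parser to return the parent docs that
# have at least one child matching the inner query.
join_params = {
    "q": "{!join from=parent_id to=id}" + f"file_content_en_US:{TERM}",
    "fq": "type:parent",
}

# URL-encoded query strings, ready to append to a /select request.
group_query = urlencode(group_params)
join_query = urlencode(join_params)
print(group_query)
print(join_query)
```

Option A returns the matching child docs bucketed per parent, while Option B
returns only the parents, so any per-file metadata would have to be fetched
separately.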
> >
> > Cheers,
> > Edward
> >
> >
> > On Mon, Feb 17, 2020, 12:25 Srijan <shree...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have a data model where the operational "Object" can have one or more
> > > files attached. Indexing these objects in Solr means indexing all
> > > metadata info and the contents of the files. For file contents, what I
> > > have right now is a single multi-valued field (for each locale).
> > >
> > > Example:
> > > <doc>
> > > <metadata field 1>xxx
> > > <metadata field 2>yyy
> > > <file_content_en_US> portion of file 1
> > > <file_content_en_US> remaining portion of file 1
> > > <file_content_en_US> portion of file 2
> > > <file_content_en_US> contents from file 2 again...
> > > ...
> > > </doc>
> > >
> > > Search is easy and everything's been working fine. We recently
> > > introduced highlighting functionality on these file content fields.
> > > Again, a straightforward use case. The next requirement is where things
> > > get a little tricky. We want to be able to return the name of the file
> > > (or, generalizing this, some other metadata info related to the file
> > > content field). If our data model had a 1:1 relation between our
> > > operational object and the file it contains, the file name would have
> > > been just another field on the main doc, but unfortunately that's not
> > > the case: each file content field could belong to any file.
> > >
> > > There are a couple of potential solutions I have been thinking of:
> > > 1. Use nested docs to preserve the logical grouping of file content and
> > > the file info where this content is coming from. This could potentially
> > > work but I haven't done any testing yet (I know highlighting doesn't
> > > work on nested docs, for example).
> > >
> > > 2. Encode the file name in the file content fields themselves. The file
> > > name will be removed during indexing but will be stored. How do I get
> > > the file name included in each snippet fragment - this again needs
> > > exploring on my end.
> > >
> > > Another approach I have been thinking of is extending the StoredField
> > > to also store additional metadata information. So basically, when a
> > > stored field is retrieved, or a fragment is returned, I also have
> > > additional information associated with the stored field. Can someone
> > > tell me whether this is a terrible idea that I should not be pursuing?
> > >
> > > Is there something else I can try?
> > >
> > > Thanks a lot,
> > > Srijan
> > >
> >
>
