Re: getting cached terms inside UpdateRequestProcessor...

Roxana Danger Thu, 22 Oct 2015 08:20:06 -0700

Hi Erik,

Thanks for the links, but the analyzers are being called correctly. The problem
is that I need access to the whole set of terms through a searcher,
but the request searcher cannot retrieve any terms because the imported
documents have not been committed yet.


My idea is to avoid two calls: first the importer and then the
updater. As there is an update processor chain that can be used after the
DIH, I thought it would be possible to get a real-time updater. My specific
problem is: given a set of texts, I need to build a Solr index and add to
this index a graph containing certain dependencies between the indexed
terms. So my idea was to use an import (associating the appropriate
analyzer with the textual field) and an updateRequestProcessorChain that
first constructs the graph and then adds a field linking each document to a
graph node.
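
To make the two steps concrete, here is a minimal sketch in plain Java with no
Solr dependency. Everything in it is an assumption for illustration: the class
TermGraphSketch, the analyze() stand-in (lowercasing and splitting instead of
the real field analyzer), the choice of "term B directly follows term A" as the
dependency, and the field names "body" and "graph_nodes". In a real processor
the same logic would presumably sit in processAdd and use the schema's
analyzer; that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the custom update processor; not a Solr class. */
public class TermGraphSketch {

    // Graph: term -> terms that directly follow it in some document (assumed dependency).
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    // Stand-in for the field's index analyzer: lowercase, split on non-letter/digit runs.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Plays the role of processAdd: works on the incoming document only,
    // so it does not need a searcher or a prior commit.
    Map<String, Object> processAdd(Map<String, Object> doc, String textField, String nodeField) {
        Object raw = doc.get(textField);
        List<String> terms = analyze(raw == null ? "" : raw.toString());

        // Step 1: extend the graph with this document's terms.
        for (int i = 0; i < terms.size(); i++) {
            Set<String> next = edges.computeIfAbsent(terms.get(i), k -> new LinkedHashSet<>());
            if (i + 1 < terms.size()) {
                next.add(terms.get(i + 1));
            }
        }

        // Step 2: link the document to its graph nodes (here: its distinct terms, in order).
        doc.put(nodeField, new ArrayList<>(new LinkedHashSet<>(terms)));
        return doc;
    }

    // Terms recorded as direct successors of the given term.
    Set<String> successors(String term) {
        return edges.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        TermGraphSketch sketch = new TermGraphSketch();
        Map<String, Object> doc = new HashMap<>();
        doc.put("body", "Solr indexes text; Solr rocks");
        sketch.processAdd(doc, "body", "graph_nodes");
        System.out.println(doc.get("graph_nodes"));
        System.out.println(sketch.successors("solr"));
    }
}
```

The point of the sketch is only that the terms come from analyzing the incoming
document itself, not from the index, which is why no commit is needed.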

However, this does not seem to be a good approach (see Alexander's reply),
so I am now trying to call the importer and the updater sequentially. Any other
proposals for avoiding the double call are welcome!

I am also having trouble calling the update processor so that it applies all
the changes to the imported documents.

Thank you very much,
Roxana




On 22 October 2015 at 15:10, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> Roxana -
>
> What is the purpose of doing this?  (that’ll help guide the best approach)
>
> It can be quite handy to get the terms from analysis into a field as
> stored values and to separate terms into separate fields and such.  Here’s
> a presentation where I detailed an update script trick that accomplishes
> this:
> http://www.slideshare.net/erikhatcher/solr-indexing-and-analysis-tricks
>
> Within Solr, the example/files area has this very trick implemented to
> pull out URLs and e-mail addresses from full text into separate specific
> fields.  See
> http://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x/solr/example/files/conf/update-script.js
> (“var analyzer = “… and below)
>
> Does that trick accomplish what you need?   If not, please detail what
> you’re after and we’ll try to help.
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
>
>
>
> > On Oct 22, 2015, at 6:20 AM, Roxana Danger <
> roxana.dan...@reedonline.co.uk> wrote:
> >
> > Hello,
> >
> > I would like to create an updateRequestProcessorChain that is
> > executed after a DB DIH. I am extending the UpdateRequestProcessorFactory and
> > UpdateRequestProcessor classes. The processAdd method of my
> > UpdateRequestProcessor should be able to update the documents with the
> > indexed terms associated with a field. Note that these terms should have
> > been extracted with an analyzer before my updateRequestProcessorChain
> > processor begins to execute.
> >
> > The problem I am getting is that at the point where processAdd is
> > executed the field containing the terms has not been filled. To retrieve
> > the terms I am using the SolrIndexSearcher provided during the request
> > (req.getSearcher()). However, it seems that this searcher uses only the
> > data physically stored and does not consider any of the imported data.
> >
> > Any idea on how I can access a searcher with all indexed/cached data
> > when the processAdd method is executed?
> >
> > Thank you very much in advance.
>
>


-- 
Roxana Danger | Data Scientist
Dragon Court, 27-29 Macklin Street, London, WC2B 5LX
Tel: 020 7067 4568
reed.co.uk <http://www.reed.co.uk/>
