Thanks again.
For the moment I think it won't be a problem. I have ~500 documents.
Regards,
Francisco
On Fri, Sep 11, 2015 at 6:08 PM, simon wrote:
+1 on Sujit's recommendation: we have a similar use case (detecting drug
names / disease entities / MeSH terms) and have been using the
SolrTextTagger with great success.
We run a separate Solr instance as a tagging service and add the detected
tags as metadata fields to a document before it is i
Thanks!
On Fri, Sep 11, 2015 at 14:39, Sujit Pal wrote:
Hi Francisco,
>> I have many drug product leaflets, each corresponding to one product.
>> On the other hand, we have a medical dictionary with about 10^5 terms.
>> I want to detect all occurrences of those terms in any leaflet
>> document.
Take a look at SolrTextTagger for this use case.
https://github.
Assuming the medical dictionary is constant, I would do a copyField of
text into a separate field and have that separate field use:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/miscellaneous/KeepWordFilterFactory.html
with words coming from the dictionary (normalized).
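
To make the effect of the keep-words idea concrete, here is a minimal plain-Python sketch. It is not Solr configuration and does not call Solr; it only imitates what a keep-word filter does to a token stream. The sample terms, the `normalize` rule (trim and lowercase), and all function names are assumptions made for illustration.

```python
import re

def normalize(term: str) -> str:
    """Trim and lowercase a term (a stand-in for the real analysis chain)."""
    return term.strip().lower()

def build_keep_words(dictionary_terms):
    """Return the normalized single-token entries, as a keep-words list would hold them."""
    keep = set()
    for term in dictionary_terms:
        norm = normalize(term)
        # A keep-word filter sees one token at a time, so multi-word
        # entries cannot match and are skipped in this sketch.
        if len(norm.split()) == 1:
            keep.add(norm)
    return keep

def keep_word_filter(text, keep_words):
    """Split text into lowercase word tokens and keep only those in keep_words."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token in keep_words]

dictionary = ["Aspirin", "ibuprofen ", "Heart Failure"]  # hypothetical sample terms
keep_words = build_keep_words(dictionary)
leaflet = "Do not combine aspirin with Ibuprofen unless advised."
detected = keep_word_filter(leaflet, keep_words)
```

Note that the multi-word entry is dropped here: a token-level filter of this kind only helps with single-word dictionary terms unless the analysis chain also produces multi-word tokens.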
Many thanks, pals.
I will try some of these approaches (and come back with new questions)
;)
Best regards,
Francisco
On Fri, Sep 11, 2015 at 5:41 AM, Upayavira wrote:
It sounds to me like you are wanting to *filter* your document to only
include terms within that medical dictionary. Or to have a keyword field
based upon those of your 100k terms that appear in that doc.
Synonyms are your saviour, if that's the case. Create a synonyms list
for your terms, they ca
_Assuming_ this isn't a high-throughput case _and_ the leaflet text isn't too big...
Index the thesaurus and fire all the terms of the query in a big OR
clause against the index as a _query_. Perhaps turn highlighting on
and highlight the entire leaflet text.
Note, this is just "off the top of my head
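
One possible reading of the "big OR clause" idea, sketched in Python: build the query strings from the dictionary terms and send them against a field holding the leaflet text. The field name `text` and both function names are assumptions. The batching is there because Solr limits the number of boolean clauses per query (`maxBooleanClauses`, 1024 in stock configurations), so 10^5 terms would not fit into a single query without raising that limit.

```python
def quote_term(term: str) -> str:
    """Wrap a term in double quotes so multi-word entries match as phrases."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def build_or_queries(terms, field="text", max_clauses=1024):
    """Yield one 'field:(... OR ...)' query string per batch of terms."""
    batch = []
    for term in terms:
        batch.append(quote_term(term))
        if len(batch) == max_clauses:
            yield f"{field}:({' OR '.join(batch)})"
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield f"{field}:({' OR '.join(batch)})"

# Tiny batch size only to show the splitting behaviour.
queries = list(build_or_queries(["aspirin", "heart failure", "ibuprofen"], max_clauses=2))
```

Each string would go into the `q` parameter of a normal select request, with highlighting switched on if the matched positions are needed.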
Yes.
I have many drug product leaflets, each corresponding to one product. On the
other hand, we have a medical dictionary with about 10^5 terms.
I want to detect all occurrences of those terms in any leaflet
document.
Could you give me a clue about the best way to do this?
Perhaps, th
Doing a query for each term should work well. Solr is fast for queries. Write a
script.
I assume you only need to do this once. Running all the queries will probably
take less time than figuring out a different approach.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.o
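
A rough sketch of the "one query per term" script in Python, using only the standard library. The core name `leaflets`, the field name `text`, and the function names are hypothetical; `rows` is set to 1000 on the assumption that the collection stays around the ~500 documents mentioned in this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_SELECT = "http://localhost:8983/solr/leaflets/select"  # hypothetical core name

def select_url(term, field="text", base=SOLR_SELECT):
    """Build a select URL that runs a phrase query for one dictionary term."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "fl": "id", "rows": 1000, "wt": "json"}
    return f"{base}?{urlencode(params)}"

def leaflets_containing(term):
    """Return the ids of leaflets matching the term (needs a running Solr)."""
    with urlopen(select_url(term)) as response:
        body = json.load(response)
    return [doc["id"] for doc in body["response"]["docs"]]

example_url = select_url("heart failure")
```

Looping `leaflets_containing` over the dictionary gives a term-to-leaflets mapping; with about 10^5 terms that is 10^5 small requests, which fits the one-off job described above.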
If you are interested in just the number of occurrences of an indexed term, the
TermsComponent will give that answer.
Markus
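
A small Python sketch of how a TermsComponent request might be built and its counts read. The `/terms` handler path, the core name `leaflets`, and the field name `text` are assumptions about the local setup, and the sample response values are made up. The counts the component reports are document frequencies (how many documents contain the term), and the flat term/count list assumed here can differ with the Solr version and the `json.nl` setting.

```python
from urllib.parse import urlencode

TERMS_HANDLER = "http://localhost:8983/solr/leaflets/terms"  # hypothetical core name

def terms_url(prefix, field="text", base=TERMS_HANDLER, limit=10):
    """Build a TermsComponent URL listing indexed terms that start with prefix."""
    params = {"terms.fl": field, "terms.prefix": prefix, "terms.limit": limit, "wt": "json"}
    return f"{base}?{urlencode(params)}"

def parse_flat_counts(flat):
    """Turn a flat [term, count, term, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))

sample = ["aspirin", 12, "aspirina", 3]  # made-up values, not real output
counts = parse_flat_counts(sample)
url = terms_url("aspir")
```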
-Original message-
> From: Francisco Andrés Fernández
> Sent: Thursday 10th September 2015 15:58
> To: solr-user@lucene.apache.org
> Subject: Detect term occurrenc
Can you tell us a bit more about the business case, not the current
technical one? It is entirely possible that Solr can solve the
higher-level problem out of the box without you doing manual term
comparisons, in which case your problem scope is not quite right.
Regards,
Alex.
Solr Anal