Thanks again.
For the moment I think it won't be a problem. I have ~500 documents.
Regards,
Francisco
On Fri, Sep 11, 2015 at 6:08 p.m., simon wrote:
> +1 on Sujit's recommendation: we have a similar use case (detecting drug
> names / disease entities /MeSH
Hi Francisco,
>> I have many drug product leaflets, each corresponding to 1 product. On the
>> other hand, we have a medical dictionary with about 10^5 terms.
>> I want to detect all the occurrences of those terms in any leaflet
>> document.
Take a look at SolrTextTagger for this use case.
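To make the idea of dictionary tagging concrete, here is a minimal plain-Python sketch of what a tagger does conceptually: scan the text once and report the longest dictionary term starting at each position, with character offsets. This is not SolrTextTagger's implementation or API; the names `build_index` and `tag` and the sample terms are hypothetical, and a real tagger would use proper analysis and a far more compact data structure.

```python
import re
from typing import Dict, List, Tuple

TOKEN_RE = re.compile(r"\w+")


def build_index(terms: List[str]) -> Dict[Tuple[str, ...], str]:
    """Map each term's lowercased token sequence to the original term."""
    return {tuple(t.lower() for t in TOKEN_RE.findall(term)): term for term in terms}


def tag(text: str, index: Dict[Tuple[str, ...], str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for the longest dictionary match at each position."""
    tokens = [(m.group().lower(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]
    max_len = max((len(key) for key in index), default=0)
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first so multi-word terms win over sub-terms.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tok for tok, _, _ in tokens[i:i + n])
            if key in index:
                hits.append((tokens[i][1], tokens[i + n - 1][2], index[key]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return hits


# Hypothetical dictionary and leaflet text, for illustration only.
terms = ["aspirin", "myocardial infarction", "infarction"]
leaflet = "Aspirin reduces the risk of myocardial infarction."
for start, end, term in tag(leaflet, build_index(terms)):
    print(start, end, term)
```

With ~500 leaflets and ~10^5 terms, even a naive pass like this is feasible offline; the appeal of doing it inside Solr is that tagging then reuses the same analysis chain as indexing.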
It sounds to me like you are wanting to *filter* your document to only
include terms within that medical dictionary. Or to have a keyword field
based upon those of your 100k terms that appear in that doc.
Synonyms are your saviour, if that's the case. Create a synonyms list
for your terms, they
+1 on Sujit's recommendation: we have a similar use case (detecting drug
names / disease entities /MeSH terms ) and have been using the
SolrTextTagger with great success.
We run a separate Solr instance as a tagging service and add the detected
tags as metadata fields to a document before it is
Many thanks, pals.
I will try some of those approaches (and come back with new questions)
;)
Best regards,
Francisco
On Fri, Sep 11, 2015 at 5:41 a.m., Upayavira wrote:
> It sounds to me like you are wanting to *filter* your document to only
> include terms within
Thanks!
On Fri, Sep 11, 2015 14:39, Sujit Pal wrote:
> Hi Francisco,
>
> >> I have many drug product leaflets, each corresponding to 1 product. On the
> >> other hand, we have a medical dictionary with about 10^5 terms.
> >> I want to detect all the occurrences of those
Assuming the medical dictionary is constant, I would do a copyField of
text into a separate field and have that separate field use:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/miscellaneous/KeepWordFilterFactory.html
with words coming from the dictionary (normalized).
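As a rough illustration of the effect of this setup, the following plain-Python sketch mimics what the copied field would end up containing: only the tokens that appear in the (normalized) dictionary. It is not Solr configuration and does not call Solr; `load_keep_words` and `keep_words` are hypothetical names, and the sketch matches single tokens only, so multi-word dictionary terms would need extra handling.

```python
import re
from typing import Iterable, List, Set


def load_keep_words(terms: Iterable[str]) -> Set[str]:
    """Normalize dictionary terms (trim, lowercase) and drop empty entries."""
    return {t.strip().lower() for t in terms if t.strip()}


def keep_words(text: str, keep: Set[str]) -> List[str]:
    """Tokenize and lowercase the text, keeping only tokens found in the dictionary."""
    tokens = re.findall(r"\w+", text.lower())
    return [tok for tok in tokens if tok in keep]


# Hypothetical dictionary entries and leaflet text, for illustration only.
keep = load_keep_words(["Ibuprofen", "fever ", ""])
print(keep_words("Ibuprofen relieves pain and fever; take ibuprofen with food.", keep))
```

The same normalization has to be applied to the dictionary and to the text, which is why the dictionary words are described as "normalized" above.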
>> From: Francisco Andrés Fernández <fra...@gmail.com>
>> Sent: Thursday 10th September 2015 15:58
>> To: solr-user@lucene.apache.org
>> Subject: Detect term occurrences
>>
>> Hi all, I'm new to Solr.
>> I want to detect all occurrences of terms existing in a thesa
> Subject: Detect term occurrences
>
> Hi all, I'm new to Solr.
> I want to detect all occurrences of terms from a thesaurus in 1 or
> more documents.
> What's the best strategy for doing that?
> Running a query for each term doesn't seem to be the best way.
> Many thanks,
>
> Francisco
>
Can you tell us a bit more about the business case, not just the current
technical one? It is entirely possible that Solr can solve the
higher-level problem out of the box without you doing manual term
comparisons, in which case your problem scope is not quite right.
Regards,
Alex.
Hi all, I'm new to Solr.
I want to detect all occurrences of terms from a thesaurus in 1 or
more documents.
What's the best strategy for doing that?
Running a query for each term doesn't seem to be the best way.
Many thanks,
Francisco
_Assuming_ this isn't a high-throughput use case _and_ the leaflet text isn't too big...
Index the thesaurus and fire all the terms of the query in a big OR
clause against the index as a _query_. Perhaps turn highlighting on
and highlight the entire leaflet text.
Note, this is just "off the top of my
Yes.
I have many drug product leaflets, each corresponding to 1 product. On the
other hand, we have a medical dictionary with about 10^5 terms.
I want to detect all the occurrences of those terms in any leaflet
document.
Could you give me a clue about the best way to do it?
Perhaps,