Problem integrating UIMA in Solr
Hello all,

I am having trouble integrating UIMA into Solr. I followed the steps in the README file shipped with UIMA:

Step 1. I set the <lib/> tags in solrconfig.xml to point to the jar files:

    <lib dir="../../contrib/uima/lib" />
    <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />

Step 2. I modified my schema.xml, adding the fields I wanted to hold the metadata, with appropriate values for the type, indexed, stored and multiValued options:

    <field name="language" type="string" indexed="true" stored="true" required="false"/>
    <field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
    <field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />

Step 3. I modified my solrconfig.xml, adding the following snippet:

    <updateRequestProcessorChain name="uima" default="true">
      <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
        <lst name="uimaConfig">
          <lst name="runtimeParameters">
            <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
          </lst>
          <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
          <bool name="ignoreErrors">true</bool>
          <lst name="analyzeFields">
            <bool name="merge">false</bool>
            <arr name="fields">
              <str>text</str>
            </arr>
          </lst>
          <lst name="fieldMappings">
            <lst name="type">
              <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
              <lst name="mapping">
                <str name="feature">text</str>
                <str name="field">concept</str>
              </lst>
            </lst>
            <lst name="type">
              <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
              <lst name="mapping">
                <str name="feature">language</str>
                <str name="field">language</str>
              </lst>
            </lst>
            <lst name="type">
              <str name="name">org.apache.uima.SentenceAnnotation</str>
              <lst name="mapping">
                <str name="feature">coveredText</str>
                <str name="field">sentence</str>
              </lst>
            </lst>
          </lst>
        </lst>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>

Step 4. Finally, I created a new UpdateRequestHandler with the following:

    <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
      <lst name="defaults">
        <str name="update.processor">uima</str>
      </lst>
    </requestHandler>

I then indexed a Word file called test.docx using the following command:

    curl "http://localhost:8983/solr/update/extract?fmap.content=content&literal.id=doc47&commit=true" -F "file=@test.docx"

When I searched for the same document with http://localhost:8983/solr/select?q=id:doc47 I got the following result, i.e. the additional UIMA fields are missing from the response:

    <result name="response" numFound="1" start="0">
      <doc>
        <str name="author">divakar</str>
        <arr name="content_type">
          <str>application/vnd.openxmlformats-officedocument.wordprocessingml.document</str>
        </arr>
        <str name="id">doc47</str>
        <date name="last_modified">2012-04-18T14:19:00Z</date>
      </doc>
    </result>

Can anyone help me solve this problem?

With regards and thanks,
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Facing-problem-to-integrate-UIMA-in-SOLR-tp3985089.html
Sent from the Solr - User mailing list archive at Nabble.com.
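One way to check a response like the one above for the UIMA fields is to parse it and compare the field names it contains against the ones added to schema.xml in Step 2. The following is a minimal Python sketch and not part of the original post; the helper name `missing_uima_fields` and the `SAMPLE` string (a reconstruction of the response quoted above) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Field names taken from the schema.xml additions in Step 2.
EXPECTED_UIMA_FIELDS = ("language", "concept", "sentence")


def missing_uima_fields(response_xml, expected=EXPECTED_UIMA_FIELDS):
    """Return the expected field names that no <doc> in the response contains."""
    root = ET.fromstring(response_xml)
    present = set()
    for doc in root.iter("doc"):
        for child in doc:
            name = child.get("name")
            if name:
                present.add(name)
    return [field for field in expected if field not in present]


# Hypothetical sample: the <result> element from the post, with markup restored.
SAMPLE = """<result name="response" numFound="1" start="0">
  <doc>
    <str name="author">divakar</str>
    <arr name="content_type">
      <str>application/vnd.openxmlformats-officedocument.wordprocessingml.document</str>
    </arr>
    <str name="id">doc47</str>
    <date name="last_modified">2012-04-18T14:19:00Z</date>
  </doc>
</result>"""

if __name__ == "__main__":
    # For the response in the post, all three UIMA fields are reported missing.
    print(missing_uima_fields(SAMPLE))
```

Running the same check after each configuration change gives a quick yes/no answer on whether the UIMA processor actually populated any of the mapped fields.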
Re: Solr with UIMA
Hi Rahul,

Thank you for the reply. I tried modifying the updateRequestProcessorChain as follows:

    <updateRequestProcessorChain name="uima" default="true">

But I am still not able to see the UIMA fields in the result. I executed the following curl command to index a file named test.docx:

    curl "http://localhost:8983/solr/update/extract?fmap.content=content&literal.id=doc47&commit=true" -F "file=@test.docx"

When I searched for the same document with http://localhost:8983/solr/select?q=id:doc47 I got the following result:

    <result name="response" numFound="1" start="0">
      <doc>
        <str name="author">divakar</str>
        <arr name="content_type">
          <str>application/vnd.openxmlformats-officedocument.wordprocessingml.document</str>
        </arr>
        <str name="id">doc47</str>
        <date name="last_modified">2012-04-18T14:19:00Z</date>
      </doc>
    </result>

Could you please help me find where I am going wrong?

With thanks and regards,
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3925670.html
Re: Solr with UIMA
Hi Chris,

Have you been able to integrate UIMA into Solr successfully? I also tried to integrate UIMA into Solr by following the instructions in the README, i.e. the following four steps:

Step 1. I set the <lib/> tags in solrconfig.xml to point to the jar files:

    <lib dir="../../contrib/uima/lib" />
    <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />

Step 2. I modified my schema.xml, adding the fields I wanted to hold the metadata, with appropriate values for the type, indexed, stored and multiValued options:

    <field name="language" type="string" indexed="true" stored="true" required="false"/>
    <field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
    <field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />

Step 3. I modified my solrconfig.xml, adding the following snippet:

    <updateRequestProcessorChain name="uima">
      <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
        <lst name="uimaConfig">
          <lst name="runtimeParameters">
            <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
            <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
          </lst>
          <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
          <bool name="ignoreErrors">true</bool>
          <lst name="analyzeFields">
            <bool name="merge">false</bool>
            <arr name="fields">
              <str>text</str>
            </arr>
          </lst>
          <lst name="fieldMappings">
            <lst name="type">
              <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
              <lst name="mapping">
                <str name="feature">text</str>
                <str name="field">concept</str>
              </lst>
            </lst>
            <lst name="type">
              <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
              <lst name="mapping">
                <str name="feature">language</str>
                <str name="field">language</str>
              </lst>
            </lst>
            <lst name="type">
              <str name="name">org.apache.uima.SentenceAnnotation</str>
              <lst name="mapping">
                <str name="feature">coveredText</str>
                <str name="field">sentence</str>
              </lst>
            </lst>
          </lst>
        </lst>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>

Step 4. Finally, I created a new UpdateRequestHandler with the following:

    <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
      <lst name="defaults">
        <str name="update.processor">uima</str>
      </lst>
    </requestHandler>

I then indexed a Word file using the following command:

    curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@UIMA_sample_test.docx"

When I searched for the file, I was not able to see the additional UIMA fields. Could you please help if you have been able to solve the problem?

With regards and thanks,
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3923443.html
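The query parameters of an extract request have to be separated by `&`, and on a shell the whole URL needs quoting so the `&` characters are not interpreted by the shell. A small Python sketch of building such a URL from the parameters named in the message follows; the helper name `extract_url` is hypothetical, and the base URL is the one used in the post.

```python
from urllib.parse import urlencode


def extract_url(base, params):
    """Join a Solr base URL, the /update/extract path and encoded parameters."""
    return base.rstrip("/") + "/update/extract?" + urlencode(params)


# Parameters as they appear in the curl command from the post.
PARAMS = [
    ("literal.id", "doc1"),
    ("uprefix", "attr_"),
    ("fmap.content", "attr_content"),
    ("commit", "true"),
]

if __name__ == "__main__":
    print(extract_url("http://localhost:8983/solr", PARAMS))
```

Building the URL from a list of pairs keeps each parameter visible on its own line, which makes a missing separator or a mistyped parameter name easier to spot than in a single long command line.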
What Interface to use for programming compatible filters in SOLR?
Dear all,

I want to include my own filter for analyzing tokens while indexing documents in Solr. Is there an explicit interface for writing compatible filters in Solr? Please let me know the steps to follow to use my own filters in the schema.xml file. That is, if I create a Java class for a filter to suit my requirements, how can I integrate it into the Solr schema.xml file for indexing documents?

Thanks in advance.

With regards,
Divakar Yadav

--
View this message in context: http://lucene.472066.n3.nabble.com/What-Interface-to-use-for-programming-compatible-filters-in-SOLR-tp3772171p3772171.html
Does Solr support lemmatization?
Dear all,

I want to know whether Solr supports lemmatization. If it does, which built-in lemmatizer class should be included in the Solr schema file to analyze tokens using lemmatization rather than stemming?

Thanks in advance.

With thanks and regards,
Divakar Yadav

--
View this message in context: http://lucene.472066.n3.nabble.com/Do-SOLR-supports-Lemmatization-tp3763139p3763139.html
Re: How to index documents in SOLR running in Windows XP environment
Dear Gora and all,

Thank you very much for replying. My question is how to index documents (.xml, .pdf and .doc files) in Solr. I tried using curl, but it is not working in my Windows XP environment. Does anyone have a ready-made program or DIH configuration that I can use to index these types of files?

Regards,
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-documents-in-SOLR-running-in-Window-XP-envronment-tp3632488p3634507.html
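Where curl is not available, the same multipart upload can be sent from a short script. The sketch below uses only the Python standard library and assumes that Python is installed on the machine and that Solr exposes the `/update/extract` handler used elsewhere in this archive; the helper names `build_multipart` and `post_file` are hypothetical, and the `literal.id` and `commit` parameters are the ones from the curl commands in the other messages.

```python
import os
import uuid
import urllib.request
from urllib.parse import urlencode


def build_multipart(field_name, filename, data, boundary=None):
    """Return (body, content_type) for a one-file multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_file(solr_url, path, doc_id):
    """POST one file to <solr_url>/update/extract and return the HTTP status."""
    with open(path, "rb") as fh:
        data = fh.read()
    body, content_type = build_multipart("file", os.path.basename(path), data)
    query = urlencode({"literal.id": doc_id, "commit": "true"})
    request = urllib.request.Request(
        f"{solr_url.rstrip('/')}/update/extract?{query}",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as `post_file("http://localhost:8983/solr", "test.pdf", "doc1")` would then stand in for the curl command; the file name and id here are placeholders.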
Unable to get started with Solr
Hi all,

Sorry for any inconvenience, but I need help with the following. I want to work with Solr, so I downloaded it and started following the instructions in the tutorial at http://lucene.apache.org/solr/tutorial.html to run some examples first. But when I tried to check whether Solr was running by opening http://localhost:8983/solr/admin/ in the web browser, I got the message below. I would be thankful if someone could suggest a solution.

Message:

    Unable to connect

    Firefox can't establish a connection to the server at localhost:8983.

    The site could be temporarily unavailable or too busy. Try again in a few moments.
    If you are unable to load any pages, check your computer's network connection.
    If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.

With regards,
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497276.html
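An "Unable to connect" message from the browser usually means that nothing is listening on the port, for example because the example server from the tutorial has not been started or has already exited. A quick way to confirm this independently of the browser is to try opening a TCP connection to the port. The following is a minimal Python sketch; the helper name `is_listening` is hypothetical, and the host and port are the ones from the URL in the post.

```python
import socket


def is_listening(host="localhost", port=8983, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if is_listening():
        print("Something is listening on localhost:8983")
    else:
        print("Nothing is listening on localhost:8983; is Solr started?")
```

If the check reports that nothing is listening, the problem lies with starting Solr rather than with the browser or the admin URL.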