all searches return 0 hits - what have I done wrong?
Hi,

I am starting my solr instance with the command

  java -Dsolr.solr.home=./test1/solr/ -jar start.jar

where I have a solr.xml file:

  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <solr sharedLib="lib" persistent="true">
    <cores adminPath="/admin/cores">
      <core default="false" instanceDir="tester" name="tester"/>
    </cores>
  </solr>

In the folder tester I have configurations, adapted from the rss examples.

DataImporter.xml:

  <dataConfig>
    <dataSource name="myfilereader" type="FileDataSource"/>
    <document>
      <entity name="jc" rootEntity="false" dataSource="null"
              processor="FileListEntityProcessor" fileName="^.*\.xml$"
              recursive="true" baseDir="/projects/solrtest/transformedimport">
        <entity name="x" rootEntity="true" dataSource="myfilereader"
                processor="XPathEntityProcessor" url="${jc.fileAbsolutePath}"
                stream="false" forEach="/ARTIKEL"
                transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,LogTransformer"
                logTemplate="processing ${jc.fileAbsolutePath}" logLevel="info">
          <field column="title" xpath="/DOKTITEL/OVERSKRIFT1" />
          <field column="text" xpath="/AKROP/TXT" />
        </entity>
      </entity>
    </document>
  </dataConfig>

solrconfig.xml: same as the rss example, only removed the elevate components.

schema.xml:

  <fields>
    <field name="title" type="text" indexed="true" stored="true" />
    <field name="txt" type="text" indexed="true" stored="true" />
    <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
    <copyField source="title" dest="all_text" />
    <copyField source="txt" dest="all_text" />
  </fields>

I removed the uniqueKey constraint.

When I go to http://localhost:8983/solr/tester/admin/ I get the admin page.

When I run http://localhost:8983/solr/tester/dataimport?command=full-import it says:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">16</int>
    </lst>
    <lst name="initArgs">
      <lst name="defaults">
        <str name="config">dataimporter.xml</str>
      </lst>
    </lst>
    <str name="command">full-import</str>
    <str name="status">idle</str>
    <str name="importResponse"/>
    <lst name="statusMessages"/>
    <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
  </response>

When I look at the log of that it says a bunch of stuff like:

  INFO: processing c:\projects\solrtest\transformed\1.xml
  org.apache.solr.common.util.XMLErrorLogger report
  WARNING: XmL parser reported xml declaration in null, line 1, column 38: Inconsistent text encoding; declared as utf-8 in xml declaration, application had passed Cp1252

Here is one of the processed documents:

  <?xml version="1.0" encoding="utf-8" ?>
  <ARTIKEL ID="MM2010ADMINISTRATIONSYDELSER">
    <DOKTITEL>
      <OVERSKRIFT1>Administrationsydelser (MomsManual)</OVERSKRIFT1>
    </DOKTITEL>
    <AKROP>
      <TXT>Administrationsydelser er momspligtige. Dette gælder også når de faktureres koncerninternt, f.eks. fra et moderselskab (holdingselskab) til et datterselskab.</TXT>
      <TXT>Der er fradragsret for moms vedrørende køb af administrationsydelser i samme omfang, som virksomheden kan fratrække momsen af øvrige fællesomkostninger.</TXT>
      <TXT>Hvis administrationsydelser faktureres på tværs af landegrænserne, f.eks. indenfor internationale koncerner, kan der gælde forskellige principper for momsberegningen i de enkelte EU-lande. Hvis en administrationsydelse faktureres fra Danmark til et datterselskab i et andet land, herunder også i andre EU-lande, er det myndighedernes holdning, at der skal faktureres med dansk moms.</TXT>
      <TXT>Hvis en administrationsydelse faktureres mellem et selskab og dets filial/-er, skal faktura altid udstedes uden moms. Handel med ydelser mellem et selskab og dets filial/-er anses ikke for at udgøre momspligtige transaktioner.</TXT>
      <TXTO>Regler</TXTO>
      <TXT><LR IDREF="LBKG2005966.§15" CREATOR="autolink" TARGETTYPE="RELML">§ 15</LR></TXT>
    </AKROP>
  </ARTIKEL>

If I search for the text Administrationsydelser

  http://localhost:8983/solr/tester/select/?q=Administrationsydelser&version=2.2&start=0&rows=10&indent=on

I get:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
      <lst name="params">
        <str name="indent">on</str>
        <str name="start">0</str>
        <str name="q">Administrationsydelser</str>
        <str name="version">2.2</str>
        <str name="rows">10</str>
      </lst>
    </lst>
    <result name="response" numFound="0" start="0"/>
  </response>

There is a segments.gen and a segments_4 file in my index but nothing else. Tried looking with Luke but it seems not to be compatible with the newest versions of Lucene.

Version of solr is 3.1.0.

Thanks,
Bryan Rasmussen
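The select URL quoted in this message seems to have lost its `&` separators somewhere in the archive. As a minimal sketch, the same request could be assembled like this, assuming the parameters are exactly the ones echoed back in the `params` list of the response. The helper name `build_select_url` is only illustrative and is not part of Solr.

```python
from urllib.parse import urlencode

# Base URL of the core's select handler, as given in the message.
SELECT_URL = "http://localhost:8983/solr/tester/select/"


def build_select_url(query, start=0, rows=10):
    """Return the select URL with properly separated query parameters."""
    params = [
        ("q", query),
        ("version", "2.2"),
        ("start", start),
        ("rows", rows),
        ("indent", "on"),
    ]
    # urlencode joins the pairs with "&" and percent-encodes the values.
    return SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("Administrationsydelser"))
```

Building the query string from pairs rather than by hand also takes care of escaping, which matters once a query contains spaces or non-ASCII characters such as the Danish letters in these documents.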
Re: all searches return 0 hits - what have I done wrong?
Also if I check solr/tester/dataimport it responds:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
    </lst>
    <lst name="initArgs">
      <lst name="defaults">
        <str name="config">dataimporter.xml</str>
      </lst>
    </lst>
    <str name="status">idle</str>
    <str name="importResponse"/>
    <lst name="statusMessages">
      <str name="Total Requests made to DataSource">0</str>
      <str name="Total Rows Fetched">1634</str>
      <str name="Total Documents Skipped">0</str>
      <str name="Full Dump Started">2011-04-18 11:55:47</str>
      <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
      <str name="Committed">2011-04-18 11:55:48</str>
      <str name="Optimized">2011-04-18 11:55:48</str>
      <str name="Total Documents Processed">0</str>
      <str name="Time taken">0:0:0.922</str>
    </lst>
    <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
  </response>
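The status response carries the interesting numbers: 1634 rows fetched, but 0 documents processed. A small sketch of how those counters could be pulled out and compared is shown here. It assumes the response markup as restored from the archive, trimmed to a few counters, and the function name `status_counters` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the status response with angle brackets and quotes restored,
# trimmed to some of the statusMessages counters quoted in the message.
STATUS_XML = """
<response>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">1634</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Total Documents Processed">0</str>
  </lst>
</response>
"""


def status_counters(xml_text):
    """Return the purely numeric statusMessages entries as a dict of ints."""
    root = ET.fromstring(xml_text)
    counters = {}
    for node in root.findall("./lst[@name='statusMessages']/str"):
        text = (node.text or "").strip()
        if text.isdigit():
            counters[node.get("name")] = int(text)
    return counters


if __name__ == "__main__":
    counters = status_counters(STATUS_XML)
    fetched = counters["Total Rows Fetched"]
    processed = counters["Total Documents Processed"]
    if fetched > 0 and processed == 0:
        print("rows were read, but no documents were built from them")
```

Rows being fetched while no documents are produced points at the step between reading the files and building documents, which is where the xpath and field mappings live.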
Re: all searches return 0 hits - what have I done wrong?
Did you try with the complete xpath?

  <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />
  <field column="text" xpath="/ARTIKEL/AKROP/TXT" />

Ludovic.

-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/all-searches-return-0-hits-what-have-I-done-wrong-tp2833706p2833798.html
Sent from the Solr - User mailing list archive at Nabble.com.
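To see why the complete path matters, here is a rough sketch that evaluates simple absolute paths against a shortened copy of the sample document. It is not how the DataImportHandler evaluates xpaths internally. The helper `match_absolute` is hypothetical and only handles plain `/a/b/c` paths, and the TXT contents are abbreviated.

```python
import xml.etree.ElementTree as ET

# Assumption: a shortened copy of the sample document with markup restored;
# the TXT contents are abbreviated.
SAMPLE = """
<ARTIKEL ID="MM2010ADMINISTRATIONSYDELSER">
  <DOKTITEL>
    <OVERSKRIFT1>Administrationsydelser (MomsManual)</OVERSKRIFT1>
  </DOKTITEL>
  <AKROP>
    <TXT>Administrationsydelser er momspligtige.</TXT>
    <TXT>Der er fradragsret for moms.</TXT>
  </AKROP>
</ARTIKEL>
"""


def match_absolute(root, path):
    """Return the text of elements matched by a simple absolute path.

    Only plain /a/b/c paths are handled; this is not a full XPath engine.
    """
    steps = path.strip("/").split("/")
    if steps[0] != root.tag:
        return []  # the first step has to name the document element
    if len(steps) == 1:
        return [root.text or ""]
    return [el.text or "" for el in root.findall("/".join(steps[1:]))]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for path in ("/DOKTITEL/OVERSKRIFT1", "/ARTIKEL/DOKTITEL/OVERSKRIFT1"):
        print(path, "->", match_absolute(root, path))
```

The path that starts at DOKTITEL matches nothing because the document element is ARTIKEL, while the complete path returns the title.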
Re: all searches return 0 hits - what have I done wrong?
Hah, actually I tried with complete xpaths earlier, but they weren't working. That was because I had made a mistake in my forEach, and I then decided that the forEach and the other xpaths were probably being concatenated.

However, it is not absolutely correct yet. If I run

  http://localhost:8983/solr/tester/dataimport?command=full-import&debug=true

I get:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">422</int>
    </lst>
    <lst name="initArgs">
      <lst name="defaults">
        <str name="config">dataimporter.xml</str>
      </lst>
    </lst>
    <str name="command">full-import</str>
    <str name="mode">debug</str>
    <arr name="documents">
      <lst><arr name="title"><str>Forord (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Abonnementsudgifter (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Ab skf (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Acontobeløb (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Adgang til arrangementer (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Administration, fast ejendom (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Administrationsfællesskab (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Administrationsydelser (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Adsl (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Advokatomkostninger (MomsManual)</str></arr></lst>
      <lst><arr name="title"><str>Afbestillingsgebyrer (MomsManual)</str></arr></lst>
    </arr>
    <lst name="verbose-output"/>
    <str name="status">idle</str>
    <str name="importResponse">Configuration Re-loaded sucessfully</str>
    <lst name="statusMessages">
      <str name="Total Requests made to DataSource">0</str>
      <str name="Total Rows Fetched">22</str>
      <str name="Total Documents Skipped">0</str>
      <str name="Full Dump Started">2011-04-18 12:26:52</str>
      <str name="">Indexing completed. Added/Updated: 11 documents. Deleted 0 documents.</str>
      <str name="Total Documents Processed">11</str>
      <str name="Time taken">0:0:0.406</str>
    </lst>
    <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
  </response>

So the title fields

  <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />

are being added, but not the text fields

  <field column="text" xpath="/ARTIKEL/AKROP/TXT" />

The most salient difference between these two is that there will be more than one TXT. I just tried with the parent element, however, and it didn't do anything.

But when I do a search for MomsManual, which you can see is in all the title fields, I get:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
      <lst name="params">
        <str name="indent">on</str>
        <str name="start">0</str>
        <str name="q">MomsManual</str>
        <str name="version">2.2</str>
        <str name="rows">10</str>
      </lst>
    </lst>
    <result name="response" numFound="0" start="0"/>
  </response>

:(

Thanks,
Bryan Rasmussen
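One thing that may be worth checking at this point is whether every column in the data config has a schema field of the same name: the data config declares column="text", while the schema declares a field named txt. The sketch below only compares the names. It assumes the markup as restored from the archive, and that a column without an explicit `name` attribute is matched to a schema field by its column name. Whether this explains the missing text here is not confirmed anywhere in the thread.

```python
import xml.etree.ElementTree as ET

# Assumption: the field mappings from the data config, markup restored.
DATA_CONFIG_FIELDS = """
<entity name="x">
  <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />
  <field column="text" xpath="/ARTIKEL/AKROP/TXT" />
</entity>
"""

# Assumption: the field declarations from schema.xml, markup restored.
SCHEMA_FIELDS = """
<fields>
  <field name="title" type="text" indexed="true" stored="true" />
  <field name="txt" type="text" indexed="true" stored="true" />
  <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
</fields>
"""


def unmatched_columns(data_config_xml, schema_xml):
    """Return data-config column names that have no same-named schema field."""
    columns = [f.get("column") for f in ET.fromstring(data_config_xml).iter("field")]
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    return [c for c in columns if c not in schema_names]


if __name__ == "__main__":
    print(unmatched_columns(DATA_CONFIG_FIELDS, SCHEMA_FIELDS))
```

The check ignores dynamic fields and any explicit `name` attribute on the data-config fields, so it is only a first pass.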
Re: all searches return 0 hits - what have I done wrong?
If a document contains multiple 'txt' fields, the field should be marked as 'multiValued':

  <field name="txt" type="text" indexed="true" stored="true" multiValued="true"/>

But if I understand correctly, you also tried this?

  <field column="text" xpath="/ARTIKEL/AKROP" />

And for your search (MomsManual), could you give us your analyzer from the schema.xml, please?

Ludovic.
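The multiValued point can be illustrated with a short sketch that counts how many values a single ARTIKEL yields per field. It assumes a shortened copy of the sample document with markup restored; the helper names are invented for this example and are not Solr functions.

```python
import xml.etree.ElementTree as ET

# Assumption: shortened sample document; the TXT contents are abbreviated.
SAMPLE = """
<ARTIKEL ID="MM2010ADMINISTRATIONSYDELSER">
  <DOKTITEL>
    <OVERSKRIFT1>Administrationsydelser (MomsManual)</OVERSKRIFT1>
  </DOKTITEL>
  <AKROP>
    <TXT>Administrationsydelser er momspligtige.</TXT>
    <TXT>Der er fradragsret for moms.</TXT>
  </AKROP>
</ARTIKEL>
"""


def field_values(root, child_path):
    """Collect the text of every element under root matching child_path."""
    return [(el.text or "").strip() for el in root.findall(child_path)]


def needs_multivalued(values):
    """A field that receives more than one value per document is multi-valued."""
    return len(values) > 1


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for name, path in (("title", "DOKTITEL/OVERSKRIFT1"), ("txt", "AKROP/TXT")):
        values = field_values(root, path)
        print(name, len(values), "value(s), multiValued needed:", needs_multivalued(values))
```

With this sample the title yields one value and the TXT path yields two, which is the situation the multiValued attribute is meant for.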
Re: all searches return 0 hits - what have I done wrong?
      />
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldtype>
  </types>
  <fields>
    <field name="title" type="text" indexed="true" stored="true" />
    <field name="txt" type="text" indexed="true" stored="true" />
    <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
    <copyField source="title" dest="all_text" />
    <copyField source="txt" dest="all_text" />
  </fields>
  <defaultSearchField>all_text</defaultSearchField>
  <solrQueryParser defaultOperator="AND"/>
  </schema>

The protwords.txt and stopwords.txt are also from the rss example.

Thanks,
Bryan Rasmussen
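As a very rough illustration of what this query analyzer does to a term like MomsManual, the sketch below imitates only three steps: word tokenizing, splitting on a lower-to-upper case change (as splitOnCaseChange="1" asks for), and lowercasing. Synonyms, stop words, and Porter stemming are left out, and none of this is the actual Lucene code. The function names are made up for the example.

```python
import re


def split_on_case_change(word):
    """Split a word wherever a lowercase letter is followed by an uppercase one."""
    parts, current = [], word[:1]
    for prev, ch in zip(word, word[1:]):
        if prev.islower() and ch.isupper():
            parts.append(current)
            current = ch
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def rough_tokens(text):
    """Tokenize on word characters, split on case changes, then lowercase."""
    tokens = []
    for word in re.findall(r"\w+", text):
        tokens.extend(part.lower() for part in split_on_case_change(word))
    return tokens


def matches_all(query, field_text):
    """True if every query token occurs in the field text (defaultOperator AND)."""
    field_tokens = set(rough_tokens(field_text))
    return all(token in field_tokens for token in rough_tokens(query))


if __name__ == "__main__":
    print(rough_tokens("MomsManual"))
    print(matches_all("MomsManual", "Administrationsydelser (MomsManual)"))
```

Under these simplified rules the query and the title text tokenize compatibly, so the query terms would be found if the title had been indexed the same way. The sketch says nothing about what is actually in the index.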