Re: all searches return 0 hits - what have I done wrong?
Also, if I check solr/tester/dataimport it responds:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dataimporter.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">1634</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2011-04-18 11:55:47</str>
    <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
    <str name="Committed">2011-04-18 11:55:48</str>
    <str name="Optimized">2011-04-18 11:55:48</str>
    <str name="Total Documents Processed">0</str>
    <str name="Time taken ">0:0:0.922</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

On Mon, Apr 18, 2011 at 11:46 AM, bryan rasmussen rasmussen.br...@gmail.com wrote:

Hi,

I am starting my solr instance with the command

java -Dsolr.solr.home=./test1/solr/ -jar start.jar

where I have a solr.xml file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<solr sharedLib="lib" persistent="true">
  <cores adminPath="/admin/cores">
    <core default="false" instanceDir="tester" name="tester"/>
  </cores>
</solr>

In the folder tester I have configurations, adapted from the rss examples.

DataImporter.xml:

<dataConfig>
  <dataSource name="myfilereader" type="FileDataSource"/>
  <document>
    <entity name="jc" rootEntity="false" dataSource="null"
            processor="FileListEntityProcessor" fileName="^.*\.xml$"
            recursive="true" baseDir="/projects/solrtest/transformedimport">
      <entity name="x" rootEntity="true" dataSource="myfilereader"
              processor="XPathEntityProcessor" url="${jc.fileAbsolutePath}"
              stream="false" forEach="/ARTIKEL"
              transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,LogTransformer"
              logTemplate="processing ${jc.fileAbsolutePath}" logLevel="info">
        <field column="title" xpath="/DOKTITEL/OVERSKRIFT1" />
        <field column="text" xpath="/AKROP/TXT" />
      </entity>
    </entity>
  </document>
</dataConfig>

solrconfig.xml: same as the rss example, only with the elevate components removed.

schema.xml:

<fields>
  <field name="title" type="text" indexed="true" stored="true" />
  <field name="txt" type="text" indexed="true" stored="true" />
  <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
  <copyField source="title" dest="all_text" />
  <copyField source="txt" dest="all_text" />
</fields>

I removed the uniqueKey constraint.

When I go to http://localhost:8983/solr/tester/admin/ I get the admin page.

When I run http://localhost:8983/solr/tester/dataimport?command=full-import it says:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dataimporter.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages"/>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

When I look at the log of that, it says a bunch of stuff like:

INFO: processing c:\projects\solrtest\transformed\1.xml
org.apache.solr.common.util.XMLErrorLogger report
WARNING: XmL parser reported xml declaration in null, line 1, column 38: Inconsistent text encoding; declared as utf-8 in xml declaration, application had passed Cp1252

Here is one of the processed documents:

<?xml version="1.0" encoding="utf-8" ?>
<ARTIKEL ID="MM2010ADMINISTRATIONSYDELSER">
  <DOKTITEL>
    <OVERSKRIFT1>Administrationsydelser (MomsManual)</OVERSKRIFT1>
  </DOKTITEL>
  <AKROP>
    <TXT>Administrationsydelser er momspligtige. Dette gælder også når de faktureres koncerninternt, f.eks. fra et moderselskab (holdingselskab) til et datterselskab.</TXT>
    <TXT>Der er fradragsret for moms vedrørende køb af administrationsydelser i samme omfang, som virksomheden kan fratrække momsen af øvrige fællesomkostninger.</TXT>
    <TXT>Hvis administrationsydelser faktureres på tværs af landegrænserne, f.eks. indenfor internationale koncerner, kan der gælde forskellige principper for momsberegningen i de enkelte EU-lande. Hvis en administrationsydelse faktureres fra Danmark til et datterselskab i et andet land, herunder også i andre EU-lande, er det myndighedernes holdning, at der skal faktureres med dansk moms.</TXT>
    <TXT>Hvis en administrationsydelse faktureres mellem et selskab og dets filial/-er, skal faktura altid udstedes uden moms. Handel med ydelser mellem et selskab og dets filial/-er anses ikke for at udgøre momspligtige transaktioner.</TXT>
    <TXTO>Regler</TXTO>
    <TXT><LR IDREF="LBKG2005966.§15" CREATOR="autolink" TARGETTYPE="RELML">§ 15</LR></TXT>
  </AKROP>
</ARTIKEL>

If I search for the text Administrationsydelser
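A side note on the "Inconsistent text encoding" warning in the log above: it suggests the files are being read with the platform default charset (Cp1252) although they declare utf-8. That is probably not what causes the zero hits, but it could mangle characters such as æ, ø and §. A minimal sketch, assuming the files on disk really are UTF-8, is to set the encoding attribute on the FileDataSource in DataImporter.xml. Only that attribute is new here; the rest is taken from the config above.

```xml
<!-- Assumption: the files under baseDir are UTF-8, as their XML declaration claims. -->
<!-- encoding tells FileDataSource which charset to use instead of the platform default. -->
<dataSource name="myfilereader" type="FileDataSource" encoding="UTF-8"/>
```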
Re: all searches return 0 hits - what have I done wrong?
Did you try with the complete xpath?

<field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />
<field column="text" xpath="/ARTIKEL/AKROP/TXT" />

Ludovic.

-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/all-searches-return-0-hits-what-have-I-done-wrong-tp2833706p2833798.html
Sent from the Solr - User mailing list archive at Nabble.com.
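To spell the suggestion out: as far as I know, XPathEntityProcessor does not resolve a field's xpath relative to forEach, so each xpath is written out from the document root. Below is a sketch of the inner entity from the config posted earlier in the thread, with only the two xpath values changed. The transformer and log attributes are left out for brevity, not because they need to go.

```xml
<entity name="x" rootEntity="true" dataSource="myfilereader"
        processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" stream="false"
        forEach="/ARTIKEL">
  <!-- forEach only marks where one record starts; field xpaths begin at the root again -->
  <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1"/>
  <field column="text"  xpath="/ARTIKEL/AKROP/TXT"/>
</entity>
```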
Re: all searches return 0 hits - what have I done wrong?
Hah, actually I tried with complete xpaths earlier, but they weren't working. That was because I had made a mistake in my forEach, and then I decided that the forEach and the other xpaths were probably being concatenated.

However, it is not completely correct yet. If I run http://localhost:8983/solr/tester/dataimport?command=full-import&debug=true I get:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">422</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dataimporter.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="mode">debug</str>
  <arr name="documents">
    <lst><arr name="title"><str>Forord (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Abonnementsudgifter (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Ab skf (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Acontobeløb (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Adgang til arrangementer (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Administration, fast ejendom (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Administrationsfællesskab (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Administrationsydelser (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Adsl (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Advokatomkostninger (MomsManual)</str></arr></lst>
    <lst><arr name="title"><str>Afbestillingsgebyrer (MomsManual)</str></arr></lst>
  </arr>
  <lst name="verbose-output"/>
  <str name="status">idle</str>
  <str name="importResponse">Configuration Re-loaded sucessfully</str>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">22</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2011-04-18 12:26:52</str>
    <str name="">Indexing completed. Added/Updated: 11 documents. Deleted 0 documents.</str>
    <str name="Total Documents Processed">11</str>
    <str name="Time taken ">0:0:0.406</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

So the title fields

<field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />

are being added, but not the text fields

<field column="text" xpath="/ARTIKEL/AKROP/TXT" />

The most salient difference between these two is that there will be more than one TXT. I just tried with the parent element, however, and it didn't do anything.

But when I do a search for MomsManual, which you can see is in all the title fields, I get:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">MomsManual</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

:(

Thanks,
Bryan Rasmussen
Re: all searches return 0 hits - what have I done wrong?
If a document contains multiple 'txt' fields, the field should be marked as 'multiValued':

<field name="txt" type="text" indexed="true" stored="true" multiValued="true"/>

But if I understand correctly, you also tried this?

<field column="text" xpath="/ARTIKEL/AKROP" />

And for your search (MomsManual), could you give us your analyzer from the schema.xml please?

Ludovic.

-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/all-searches-return-0-hits-what-have-I-done-wrong-tp2833706p2833876.html
Re: all searches return 0 hits - what have I done wrong?
Well, basically I copied out the RSS example, as I figured that would be the closest to what I wanted to do:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="tester" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="integer" class="solr.IntField" omitNorms="true"/>
    <fieldType name="long" class="solr.LongField" omitNorms="true"/>
    <fieldType name="float" class="solr.FloatField" omitNorms="true"/>
    <fieldType name="double" class="solr.DoubleField" omitNorms="true"/>
    <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="random" class="solr.RandomSortField" indexed="true" />

    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    <!-- Less flexible matching, but less false matches. Probably not ideal for product names, but may be good for SKUs. Can insert dashes in the wrong place and still match. -->
    <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory" />
        <!-- The TrimFilter removes any leading or trailing whitespace -->
        <filter class="solr.TrimFilterFactory" />
        <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" />
      </analyzer>
    </fieldType>

    <fieldtype name="ignored" stored="false" indexed="false" class="solr.StrField" />

    <fieldtype name="html" stored="true" indexed="true" class="solr.TextField">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer
Re: all searches return 0 hits - what have I done wrong?
Hmm, OK, I see the schema was wrong: I was calling the TEXT field txt. Also, I am now getting results on my title search after another restart and reindex, having set the TXT fields to be multiValued.

Thanks,
Bryan Rasmussen
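For anyone landing on this thread later, the pieces that ended up mattering can be put together roughly as below. This is a sketch assembled from the messages in the thread, not Bryan's actual final files. It assumes the schema field is renamed to text so that it matches the DataImportHandler column name, and that the field is multiValued because each ARTIKEL carries several TXT elements. Renaming the column to txt instead should work just as well.

```xml
<!-- schema.xml (assumed final form): field name matches the DIH column "text" -->
<fields>
  <field name="title" type="text" indexed="true" stored="true"/>
  <field name="text" type="text" indexed="true" stored="true" multiValued="true"/>
  <field name="all_text" type="text" indexed="true" stored="true" multiValued="true"/>
  <copyField source="title" dest="all_text"/>
  <copyField source="text" dest="all_text"/>
</fields>

<!-- DataImporter.xml: field xpaths written from the document root -->
<field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1"/>
<field column="text" xpath="/ARTIKEL/AKROP/TXT"/>
```

After changing schema.xml, restart Solr and run the full-import again, as Bryan describes above.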