I think you're still missing the critical bit. Highlighting is
completely separate from searching. In other words, you can search on
one field and highlight another. Which field is searched is governed by
the "qf" parameter when using edismax and by the "df" parameter
configured in your request handler in solrconfig.xml. These defaults
are overridden when you do a "fielded search" like

q=content:nietava

So this: q=content:nietava&hl=true&hl.fl=content
is searching the "content" field. The word you're looking for isn't in
the content field so naturally no docs are returned. And no
highlighting either.

This: q=nietava&hl=true&hl.fl=content

is searching somewhere else, thus getting the hit. We already know
that "nietava" is not in the content field because the first search
failed. You need to find out what field is being matched (probably
something like "text") and then try highlighting on _that_ field. Try
adding "debug=query" to the URL and look at the "parsedquery" entry in
the debug section of the response, and you'll see which field(s) are
actually being searched against.
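
If it helps, here is a minimal Python sketch of that check. The host,
port and core name (localhost:8983, techproducts) are assumptions based
on the stock example, the helper names are made up for illustration, and
the sample response is hand-written, not real Solr output. With wt=json
the parsed query shows up under debug / parsedquery.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host, port and core name to your install.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def debug_query_url(q, hl_field):
    """Build a select URL that asks Solr to report how it parsed q."""
    params = {
        "q": q,
        "hl": "true",
        "hl.fl": hl_field,
        "debug": "query",
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


def parsed_query(response):
    """Return the parsed query from a decoded JSON response, or None."""
    return response.get("debug", {}).get("parsedquery")


# Hand-written sample, only to show where the value lives in the response.
sample = {"debug": {"parsedquery": "text:nietava"}}

if __name__ == "__main__":
    print(debug_query_url("nietava", "content"))
    print(parsed_query(sample))
```

Whatever field name appears in front of the term in the parsed query is
the field to try in hl.fl.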

NOTE: The field you highlight on _must_ have stored="true" in schema.xml.
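
One way to check this without opening the schema file is to ask the
Schema API for the field definition. This is only a sketch: the host,
port and core name are assumptions, the helper names are hypothetical,
and I'm assuming the JSON response carries the definition under a
"field" key with a boolean "stored" entry when showDefaults=true is
passed.

```python
from urllib.parse import quote

# Assumed host, port and core name; change these for your install.
SOLR_CORE = "http://localhost:8983/solr/techproducts"


def field_definition_url(field):
    """Build a Schema API URL that describes a single field."""
    name = quote(field, safe="")
    return SOLR_CORE + "/schema/fields/" + name + "?showDefaults=true&wt=json"


def is_stored(response):
    """True or False if the response reports it, None if it does not say."""
    return response.get("field", {}).get("stored")


if __name__ == "__main__":
    print(field_definition_url("content"))
```

If is_stored() comes back False for the field named in hl.fl, that
alone explains an empty highlighting section.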

As to why "nietava" isn't being found in the content field, probably
you have some kind of analysis chain configured for that field that
isn't searching as you expect. See the admin/analysis page for some
insight into why that would be. The most frequent reason is that the
field is a "string" type which is not broken up into words. Another
possibility is that your analysis chain is leaving in the quotes or
something similar. As James says, looking at admin/analysis is a good
way to figure this out.
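
The same analysis can be requested over HTTP, which is handy when you
want to compare several fields. A rough sketch, assuming the field
analysis handler is available at /analysis/field on your core and using
the same assumed host, port and core name as the stock example; the
function name is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed host, port and core name; change these for your install.
SOLR_CORE = "http://localhost:8983/solr/techproducts"


def field_analysis_url(field, indexed_text, query_text):
    """Build a URL that runs index-time and query-time analysis for a field."""
    params = {
        "analysis.fieldname": field,         # field whose analyzers are used
        "analysis.fieldvalue": indexed_text,  # text as it would be indexed
        "analysis.query": query_text,         # text as it would be searched
        "analysis.showmatch": "true",         # mark index tokens that match
        "wt": "json",
    }
    return SOLR_CORE + "/analysis/field?" + urlencode(params)


if __name__ == "__main__":
    print(field_analysis_url("content", "some text nietava", "nietava"))
```

If the index-side tokens never line up with the query-side tokens for
"content", the search on that field cannot match.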

I still strongly recommend you go from the stock techproducts example
and get familiar with how Solr (and highlighting) work before jumping
in and changing things. There are a number of ways things can be
mis-configured and trying to change several things at once is a fine
way to go mad. The admin UI>>schema browser is another way you can see
what kind of terms are _actually_ in your index in a particular field.

Best,
Erick




On Wed, Dec 16, 2015 at 12:26 PM, Teague James <teag...@insystechinc.com> wrote:
> Sorry to hear that didn't work! Let me ask a couple of questions...
>
> Have you tried the analyzer inside of the Admin Interface? It has helped me 
> sort out a number of highlighting issues in the past. To access it, go to 
> your Admin interface, select your core, then select Analysis from the list of 
> options on the left. In the analyzer, enter the term you are indexing in the 
> top left (in other words the term in the document you are indexing that you 
> expect to get a hit on) and right input fields. Select the field that it is 
> destined for (in your case that would be 'content'), then hit analyze. Helps 
> if you have a big screen!
>
> This will show you the impact of the various filter factories that you have 
> engaged and their effect on whether or not a 'hit' is being generated. Hits 
> are identified by a very faint highlight. (PSST... Developers... It would be 
> really cool if the highlight color were more visible or customizable... 
> Thanks y'all) If it looks like you're getting hits, but not getting 
> highlighting, then open up a new tab with the Admin's query interface. Same 
> place on the left as the analyzer. Replace the "*:*" with your search term 
> (assuming you already indexed your document) and if necessary you can put 
> something in the FQ like "id:123456" to target a specific record.
>
> Did you get a hit? If no, then it's not highlighting that's the issue. If 
> yes, then try dumping this in your address bar (using your URL/IP, search 
> term, and core name of course. The fq= is an example) :
> http://[URL/IP]/solr/[CORE-NAME]/select?fq=id:123456&q="[SEARCH-TERM]"
>
> That will dump Solr's output to your browser where you can see exactly what 
> is getting hit.
>
> Hope that helps! Let me know how it goes. Good luck.
>
> -Teague
>
> -----Original Message-----
> From: Evert R. [mailto:evert.ra...@gmail.com]
> Sent: Wednesday, December 16, 2015 1:46 PM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Solr Basic Configuration - Highlight - Begginer
>
> Hi Teague!
>
> I configured the solrconfig.xml and schema.xml exactly the way you did, only 
> substituting 'content' (the field used by the techproducts sample) for 
> 'documentText'. I reindexed with:
>
>  curl 'http://localhost:8983/solr/techproducts/update/extract?literal.id=pdf1&commit=true' \
>    -F "Emmanuel=@/home/solr/dados/teste/Emmanuel.pdf"
>
> with the same result.... no highlight in the response, as below:
>
> "highlighting": { "pdf1": {} }
>
> =(
>
> Really... do not know what to do...
>
> Thanks for your time. If you have any more suggestions about what I could be 
> missing, please let me know.
>
>
> Best regards,
>
> *Evert*
>
> 2015-12-16 15:30 GMT-02:00 Teague James <teag...@insystechinc.com>:
>
>> Hi Evert,
>>
>> I recently needed help with phrase highlighting and was pointed to the
>> FastVectorHighlighter which worked out great. I just made a change to
>> the configuration to add generateWordParts="0" and
>> generateNumberParts="0" so that searches for things like "1a" would
>> get highlighted correctly. You may or may not need that feature. You
>> can always remove them or change the value to "1" to switch them on 
>> explicitly. Anyway, hope this helps!
>>
>> solrconfig.xml (partial snip)
>> <requestHandler name="/select" class="solr.SearchHandler">
>>                 <lst name="defaults">
>>                         <str name="wt">xml</str>
>>                         <str name="echoParams">explicit</str>
>>                         <int name="rows">10</int>
>>                         <str name="df">documentText</str>
>>                         <str name="hl">on</str>
>>                         <str name="hl.fl">text</str>
>>                         <str name="hl.useFastVectorHighlighter">true</str>
>>                         <str name="hl.snippets">100</str>
>>                         <str name="hl.tag.pre"><![CDATA[<b>]]></str>
>>                         <str name="hl.tag.post"><![CDATA[</b>]]></str>
>>                 </lst>
>> </requestHandler>
>>
>> schema.xml (partial snip)
>>    <field name="id" type="string" indexed="true" stored="true"
>> required="true" multiValued="false" />
>>    <field name="documentText" type="text_general" indexed="true"
>> multiValued="true" termVectors="true" termOffsets="true"
>> termPositions="true" />
>>
>> <fieldType name="text_general" class="solr.TextField"
>> positionIncrementGap="100">
>>         <analyzer type="index">
>>                 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>                 <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" />
>>                 <filter class="solr.WordDelimiterFilterFactory"
>> catenateAll="1" preserveOriginal="1" generateNumberParts="0"
>> generateWordParts="0" />
>>                 <filter class="solr.SynonymFilterFactory"
>> synonyms="index_synonyms.txt" ignoreCase="true" expand="true"/>
>>                 <filter class="solr.LowerCaseFilterFactory"/>
>>                 <filter class="solr.PorterStemFilterFactory"/>
>>                 <filter class="solr.ApostropheFilterFactory"/>
>>         </analyzer>
>>         <analyzer type="query">
>>                 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>                 <filter class="solr.WordDelimiterFilterFactory"
>> catenateAll="1" preserveOriginal="1" generateWordParts="0" />
>>                 <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" />
>>                 <filter class="solr.LowerCaseFilterFactory"/>
>>                 <filter class="solr.ApostropheFilterFactory"/>
>>         </analyzer>
>> </fieldType>
>>
>> -Teague
>>
>> From: Evert R. [mailto:evert.ra...@gmail.com]
>> Sent: Tuesday, December 15, 2015 6:25 AM
>> To: solr-user@lucene.apache.org
>> Subject: Solr Basic Configuration - Highlight - Begginer
>>
>> Hi there!
>>
>> It´s my first installation, not sure if here is the right channel...
>>
>> Here are my steps:
>>
>> 1. Set up a basic install of solr 5.4.0
>>
>> 2. Create a new core through command line (bin/solr create -c test)
>>
>> 3. Post 2 files: 1 .docx and 2 .pdf (bin/post -c test /docs/test/)
>>
>> 4. Query over the browser and it brings the correct search, but it
>> does not show the part of the text I am querying, the highlight.
>>
>>   I have already flagged the 'hl' option. But still it does not work...
>>
>> Example: I am looking for the word 'peace' in my pdf file (a book). I
>> have 4 matches for this word. It shows me the book name (pdf file) but
>> does not show which part of the text contains the word peace.
>>
>>
>> I am probably missing some configuration in schema.xml, which is
>> missing from my folder.... /solr/server/solr/test/conf/
>>
>> Or even the solrconfig.xml...
>>
>> I have read a bunch of things about highlighting, checked these files,
>> and copied the standard schema.xml to my core/conf folder, but still it
>> does not bring the highlight.
>>
>>
>> Attached a copy of my solrconfig.xml file.
>>
>>
>> I am very sorry for this probably dumb and too basic question...
>> This is the first time I have seen Solr live.
>>
>>
>> Any help will be appreciated.
>>
>>
>>
>> Best regards,
>>
>>
>> Evert Ramos
>>
>> mailto:evert.ra...@gmail.com
>>
>>
>>
>
