bq: You mention that I was searching for concord and that it's not in
any documents.  But the results below clearly show 3 hits

Right, as you figured out, I _really_ meant "concord isn't in any of the
stored fields you were including in the hl.fl parameter". That could have
been clearer.

bq: Is there a problem with storing _text_  so I can get a highlight
fragment when a hit is found there?

No, you can store the data in the _text_ field just fine; you'll have
to re-index after the change, though. It's often more useful to a user
to see the highlights in specific fields, so I wouldn't throw the rest
of the highlighting away.
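
If it helps, here is a rough sketch of that schema change. It assumes your
schema still carries the stock _text_ definition (type text_general,
indexed, multiValued); check your own managed-schema or schema.xml for the
real type name and attributes before copying anything:

```xml
<!-- Assumed stock definition, with only stored flipped to "true" so the
     highlighter has the original text to build snippets from.
     The type name and multiValued setting are assumptions; keep whatever
     your schema already has. -->
<field name="_text_" type="text_general" indexed="true" stored="true" multiValued="true"/>
```

Documents indexed before the change have no stored value for the field,
which is why the re-index is needed. After that, hl.fl=_text_ (or a list
such as hl.fl=title,author,_text_) should start returning fragments.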

You should probably take a look at the FastVectorHighlighter, though. If
you don't use FVH, highlighting re-analyzes the raw text to produce the
snippets, which may be expensive for large text fields.
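
As a sketch of what FVH needs (assuming the field is called _text_ with
type text_general; those names are guesses about your schema): the field
has to be stored and indexed with term vectors, positions and offsets, and
the request has to ask for FVH explicitly.

```xml
<!-- FastVectorHighlighter reads term vectors instead of re-analyzing the
     stored text, so all three term* attributes below must be true.
     Field name and type are assumptions about your schema. -->
<field name="_text_" type="text_general" indexed="true" stored="true"
       multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<!-- Then add this to the request (Solr 5.x parameter name):
       hl=true&hl.fl=_text_&hl.useFastVectorHighlighter=true -->
```

Term vectors make the index bigger, so it's a trade of disk for
highlighting speed, and this change needs a re-index too.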

Best,
Erick


On Wed, Aug 12, 2015 at 8:46 AM, Scott Derrick <sc...@tnstaafl.net> wrote:
> Erick,
>
> that explains it. I figured I didn't understand how solr handled highlight
> fragments.
>
> Most of my documents are just text, or as Solr labels that content,
> _text_, which is not stored by default.
>
> You mention that I was searching for concord and that it's not in any
> documents.  But the results below clearly show 3 hits
>
>>>    "response":{"numFound":3,"start":0,"docs":[
>
> the problem is the hits are in _text_
>
> Is there a problem with storing _text_  so I can get a highlight fragment
> when a hit is found there?
>
> Scott
>
> -------- Original Message --------
> Subject: Re: Highlighting, all matches show empty {}
> From: Erick Erickson <erickerick...@gmail.com>
> To: solr-user@lucene.apache.org
> Date: 08/12/2015 09:27 AM
>
>> Well, the example you just showed shouldn't show any highlighting. Your
>> query is
>>   q=concord
>> so it's trying to highlight "concord", which isn't in any of your
>> documents. hl.q can be used to highlight something other than your q
>> parameter.
>>
>> I did notice in some of your other examples that you seemed to be
>> searching for terms that were in the fields, so I suspect this isn't
>> really your root problem though.
>>
>> Do note that fields _must_ be stored to have highlighting work. Is it
>> possible that your matches are on fields that aren't stored?
>>
>> Let's build it up slowly though: try searching on one term in one field
>> that you _know_ is stored and see if you get anything back. While the
>> query with hl.fl=* and fl=field1, field2, should be fine, let's start as
>> simply as possible and work up maybe?
>>
>> Best,
>> Erick
>>
>> On Wed, Aug 12, 2015 at 7:59 AM, Scott Derrick <sc...@tnstaafl.net> wrote:
>>>
>>> I think the highlighter is actually running, but I'm not getting the
>>> results??
>>>
>>> with this request
>>>
>>>
>>> http://localhost:8983/solr/mbepp/select?q=concord&fl=accession%2C+title%2C+author%2C+date&wt=json&indent=true&hl=true&hl.fl=*
>>>
>>>
>>> I get this response
>>>
>>> {
>>>    "responseHeader":{
>>>      "status":0,
>>>      "QTime":3,
>>>      "params":{
>>>        "q":"concord",
>>>        "hl":"true",
>>>        "indent":"true",
>>>        "fl":"accession, title, author, date",
>>>        "hl.fl":"*",
>>>        "wt":"json"}},
>>>    "response":{"numFound":3,"start":0,"docs":[
>>>        {
>>>          "date":"1890-02-26",
>>>          "author":"Mary Baker Eddy",
>>>          "accession":"L13943",
>>>          "title":["Mary Baker Eddy to Joseph E. Adams,"]},
>>>        {
>>>          "date":"1896-01-13",
>>>          "author":"Mary Baker Eddy",
>>>          "accession":"L03453",
>>>          "title":["Mary Baker Eddy to Ira O. Knapp,"]},
>>>        {
>>>          "date":"1902-06-15",
>>>          "author":"Mary Baker Eddy",
>>>          "accession":"A10145",
>>>          "title":["Message of the Pastor Emeritus to The First Church of Christ, Scientist, Boston, Mass., June 15, 1902"]}]
>>>    },
>>>    "highlighting":{
>>>      "/home/scott/workspace/mbel-work/tei2html/build/web/L13943/L13943.html":{},
>>>      "/home/scott/workspace/mbel-work/tei2html/build/web/L03453/L03453.html":{},
>>>      "/home/scott/workspace/mbel-work/tei2html/build/web/A10145/A10145.html":{}}}
>>>
>>> In the admin Plugins/Stats page I set "Watch Changes" before running the
>>> request. When I ran it, Highlighting showed 2 changes, the GapFragmenter
>>> and HtmlFormatter
>>>
>>> here are the reported changes
>>>
>>> org.apache.solr.highlight.GapFragmenter
>>>      class: org.apache.solr.highlight.GapFragmenter
>>>      version: 5.2.1
>>>      description: GapFragmenter
>>>      stats: requests: Was: 117, Now: 156, Delta: 39
>>>
>>> org.apache.solr.highlight.HtmlFormatter
>>>      class: org.apache.solr.highlight.HtmlFormatter
>>>      version:5.2.1
>>>      description:HtmlFormatter
>>>      stats: requests: Was: 117, Now: 156, Delta: 39
>>>
>>> Looks to me like there were 39 fragments or something processed, yet you
>>> can see above the highlights are empty {}???
>>>
>>> All the other libraries in the highlighter showed no changes, though.
>>>
>>> They are these...
>>>
>>>      org.apache.solr.highlight.BreakIteratorBoundaryScanner
>>>      org.apache.solr.highlight.HtmlEncoder
>>>      org.apache.solr.highlight.RegexFragmenter
>>>      org.apache.solr.highlight.ScoreOrderFragmentsBuilder
>>>      org.apache.solr.highlight.SimpleBoundaryScanner
>>>      org.apache.solr.highlight.SimpleFragListBuilder
>>>      org.apache.solr.highlight.SingleFragListBuilder
>>>      org.apache.solr.highlight.WeightedFragListBuilder
>>>
>>>
>>> Scott
>>>
>>> -------- Original Message --------
>>> Subject: Highlighting, all matches show empty {}
>>> From: Scott Derrick <sc...@tnstaafl.net>
>>> To: solr-user@lucene.apache.org
>>> Date: 08/12/2015 08:20 AM
>>>
>>>> Tried submitting a field for hl.fl, still empty {}
>>>>
>>>> here are the query terms
>>>>
>>>> "responseHeader": {
>>>>       "status": 0,
>>>>       "QTime": 8,
>>>>       "params": {
>>>>         "q": "mary or calvin",
>>>>         "hl": "true",
>>>>         "hl.simple.post": "</em>",
>>>>         "indent": "true",
>>>>         "fl": "accession, title, author, date",
>>>>         "hl.fl": "*",
>>>>         "wt": "json",
>>>>         "hl.simple.pre": "<em>",
>>>>         "_": "1439388969240"
>>>>       }
>>>>
>>>> here is one of the responses, there were 135
>>>>
>>>> {
>>>>           "date": "1886-07-06",
>>>>           "author": "Mary Baker Eddy",
>>>>           "accession": "L02634",
>>>>           "title": [
>>>>             "Mary Baker Eddy to Josephine C. Woodbury, July 6, 1886"
>>>>           ]
>>>> },
>>>>
>>>> here is the highlight section listing the first 10 matches, still empty
>>>> {}
>>>>
>>>> "highlighting": {
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L02634/L02634.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10720/A10720.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L07894/L07894.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L09828/L09828.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10636D/A10636D.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L13943/L13943.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10879/A10879.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html": {}
>>>> }
>>>>
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: Highlighting
>>>> From: Scott Derrick <sc...@tnstaafl.net>
>>>> To: solr-user@lucene.apache.org
>>>> Date: 08/12/2015 06:39 AM
>>>>
>>>>> I was pretty sure I tried that, though I thought if you don't specify it
>>>>> just uses the search terms?
>>>>>
>>>>> If I just search for "calvin" and don't specify a field, what do I
>>>>> assign hl.fl?
>>>>>
>>>>> Scott
>>>>>
>>>>> On 8/11/2015 7:27 PM, Erik Hatcher wrote:
>>>>>>
>>>>>>
>>>>>> Scott - doesn’t look like you’ve specified hl.fl, which tells Solr
>>>>>> which field(s) to highlight.
>>>>>>
>>>>>> p.s. Erick Erickson surely likes your e-mail domain :)
>>>>>>
>>>>>>
>>>>>> —
>>>>>> Erik Hatcher, Senior Solutions Architect
>>>>>> http://www.lucidworks.com <http://www.lucidworks.com/>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Aug 11, 2015, at 9:02 PM, Scott Derrick <sc...@tnstaafl.net>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I guess I really don't get Highlighting in Solr.
>>>>>>>
>>>>>>> We are transitioning from Google Custom Search which generally sucks,
>>>>>>> but does return nicely formatted highlighted fragments.
>>>>>>>
>>>>>>> I turn highlighting on with hl=true in the query and I get a highlighting
>>>>>>> section returned at the bottom of the page, with each entry identified by
>>>>>>> the document file name followed by an empty {}.  It doesn't matter what I
>>>>>>> search for, plain text, a field, I get a list of documents each followed
>>>>>>> by an empty brace?
>>>>>>>
>>>>>>> "highlighting": {
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10089/A10089.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10646/A10646.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./V03482/V03482.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./645A.66.043/645A.66.043.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./352.48.001/352.48.001.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./144.23.001/144.23.001.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L18512/L18512.html": {}
>>>>>>> }
>>>>>>>
>>>>>>> I haven't made any changes to the default settings
>>>>>>>
>>>>>>>     <highlighting>
>>>>>>>        <!-- Configure the standard fragmenter -->
>>>>>>>        <!-- This could most likely be commented out in the "default" case -->
>>>>>>>        <fragmenter name="gap"
>>>>>>>                    default="true"
>>>>>>>                    class="solr.highlight.GapFragmenter">
>>>>>>>          <lst name="defaults">
>>>>>>>            <int name="hl.fragsize">100</int>
>>>>>>>          </lst>
>>>>>>>        </fragmenter>
>>>>>>>
>>>>>>>        <!-- A regular-expression-based fragmenter
>>>>>>>             (for sentence extraction)
>>>>>>>          -->
>>>>>>>        <fragmenter name="regex"
>>>>>>>                    class="solr.highlight.RegexFragmenter">
>>>>>>>          <lst name="defaults">
>>>>>>>            <!-- slightly smaller fragsizes work better because of slop -->
>>>>>>>            <int name="hl.fragsize">70</int>
>>>>>>>            <!-- allow 50% slop on fragment sizes -->
>>>>>>>            <float name="hl.regex.slop">0.5</float>
>>>>>>>            <!-- a basic sentence pattern -->
>>>>>>>            <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
>>>>>>>          </lst>
>>>>>>>        </fragmenter>
>>>>>>>
>>>>>>>        <!-- Configure the standard formatter -->
>>>>>>>        <formatter name="html"
>>>>>>>                   default="true"
>>>>>>>                   class="solr.highlight.HtmlFormatter">
>>>>>>>          <lst name="defaults">
>>>>>>>            <str name="hl.simple.pre"><![CDATA[<em>]]></str>
>>>>>>>            <str name="hl.simple.post"><![CDATA[</em>]]></str>
>>>>>>>          </lst>
>>>>>>>        </formatter>
>>>>>>>
>>>>>>>        <!-- Configure the standard encoder -->
>>>>>>>        <encoder name="html"
>>>>>>>                 class="solr.highlight.HtmlEncoder" />
>>>>>>>
>>>>>>>        <!-- Configure the standard fragListBuilder -->
>>>>>>>        <fragListBuilder name="simple"
>>>>>>>                         class="solr.highlight.SimpleFragListBuilder"/>
>>>>>>>
>>>>>>>        <!-- Configure the single fragListBuilder -->
>>>>>>>        <fragListBuilder name="single"
>>>>>>>                         class="solr.highlight.SingleFragListBuilder"/>
>>>>>>>
>>>>>>>        <!-- Configure the weighted fragListBuilder -->
>>>>>>>        <fragListBuilder name="weighted"
>>>>>>>                         default="true"
>>>>>>>                         class="solr.highlight.WeightedFragListBuilder"/>
>>>>>>>
>>>>>>>        <!-- default tag FragmentsBuilder -->
>>>>>>>        <fragmentsBuilder name="default"
>>>>>>>                          default="true"
>>>>>>>                          class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>>>          <!--
>>>>>>>          <lst name="defaults">
>>>>>>>            <str name="hl.multiValuedSeparatorChar">/</str>
>>>>>>>          </lst>
>>>>>>>          -->
>>>>>>>        </fragmentsBuilder>
>>>>>>>
>>>>>>>        <!-- multi-colored tag FragmentsBuilder -->
>>>>>>>        <fragmentsBuilder name="colored"
>>>>>>>                          class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>>>          <lst name="defaults">
>>>>>>>            <str name="hl.tag.pre"><![CDATA[
>>>>>>>                 <b style="background:yellow">,<b style="background:lawgreen">,
>>>>>>>                 <b style="background:aquamarine">,<b style="background:magenta">,
>>>>>>>                 <b style="background:palegreen">,<b style="background:coral">,
>>>>>>>                 <b style="background:wheat">,<b style="background:khaki">,
>>>>>>>                 <b style="background:lime">,<b style="background:deepskyblue">]]></str>
>>>>>>>            <str name="hl.tag.post"><![CDATA[</b>]]></str>
>>>>>>>          </lst>
>>>>>>>        </fragmentsBuilder>
>>>>>>>
>>>>>>>        <boundaryScanner name="default"
>>>>>>>                         default="true"
>>>>>>>                         class="solr.highlight.SimpleBoundaryScanner">
>>>>>>>          <lst name="defaults">
>>>>>>>            <str name="hl.bs.maxScan">10</str>
>>>>>>>            <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
>>>>>>>          </lst>
>>>>>>>        </boundaryScanner>
>>>>>>>
>>>>>>>        <boundaryScanner name="breakIterator"
>>>>>>>                         class="solr.highlight.BreakIteratorBoundaryScanner">
>>>>>>>          <lst name="defaults">
>>>>>>>            <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
>>>>>>>            <str name="hl.bs.type">WORD</str>
>>>>>>>            <!-- language and country are used when constructing Locale object.  -->
>>>>>>>            <!-- And the Locale object will be used when getting instance of BreakIterator -->
>>>>>>>            <str name="hl.bs.language">en</str>
>>>>>>>            <str name="hl.bs.country">US</str>
>>>>>>>          </lst>
>>>>>>>        </boundaryScanner>
>>>>>>>      </highlighting>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> One man's "magic" is another man's engineering. "Supernatural" is a null
>>> word.
>>> Robert A. Heinlein
>>>
>>
>
> --
> He who knows others is wise;
> He who knows himself is enlightened.
> Lao-tzu
>
