bq: You mention that I was searching for concord and that it's not in any documents. But the results below clearly show 3 hits

Right, as you figured out, what I _really_ meant was "concord isn't in any of the stored fields you were including in the hl.fl parameter". That could have been clearer.

bq: Is there a problem with storing _text_ so I can get a highlight fragment when a hit is found there?

No, you can store the data in the _text_ field just fine; you'll have to re-index after the change, though. It's often more useful to a user to see the highlights in specific fields, so I wouldn't throw the rest of the highlighting away.

You should probably look at the FastVectorHighlighter, though. If you don't use FVH, highlighting re-analyzes the raw text to produce the snippets, which may be expensive for large text fields.

Best,
Erick

On Wed, Aug 12, 2015 at 8:46 AM, Scott Derrick <sc...@tnstaafl.net> wrote:
> Erick,
>
> that explains it. I figured I didn't understand how solr handled highlight
> fragments.
>
> Most of my documents are just text, or as solr specifies that content,
> _text_, which is not stored by default.
>
> You mention that I was searching for concord and that it's not in any
> documents. But the results below clearly show 3 hits
>
>>> "response":{"numFound":3,"start":0,"docs":[
>
> the problem is the hits are in _text_
>
> Is there a problem with storing _text_ so I can get a highlight fragment
> when a hit is found there?
>
> Scott
>
> -------- Original Message --------
> Subject: Re: Highlighting, all matches show empty {}
> From: Erick Erickson <erickerick...@gmail.com>
> To: solr-user@lucene.apache.org
> Date: 08/12/2015 09:27 AM
>
>> Well, the example you just showed shouldn't show any highlighting. Your
>> query is
>> q=concord
>> so it's trying to highlight "concord" which isn't in any of your
>> documents. hl.q can be used to highlight something other than your q
>> parameter.
>>
>> I did notice in some of your other examples that you seemed to be
>> searching for terms that were in the fields so I suspect this isn't
>> really your root problem though.
>>
>> Do note that fields _must_ be stored to have highlighting work. Is it
>> possible that your matches are on fields that aren't stored?
>>
>> Let's build it up slowly though, try searching on one term in one
>> field that you _know_ is stored and see if you get anything back.
>> While the query with hl.fl=* and fl=field1, field2, should be fine,
>> let's start as simply as possible and work up maybe?
>>
>> Best,
>> Erick
>>
>> On Wed, Aug 12, 2015 at 7:59 AM, Scott Derrick <sc...@tnstaafl.net> wrote:
>>>
>>> I think the highlighter is actually running, but I'm not getting the
>>> results??
>>>
>>> with this request
>>>
>>> http://localhost:8983/solr/mbepp/select?q=concord&fl=accession%2C+title%2C+author%2C+date&wt=json&indent=true&hl=true&hl.fl=*
>>>
>>> I get this response
>>>
>>> {
>>>   "responseHeader":{
>>>     "status":0,
>>>     "QTime":3,
>>>     "params":{
>>>       "q":"concord",
>>>       "hl":"true",
>>>       "indent":"true",
>>>       "fl":"accession, title, author, date",
>>>       "hl.fl":"*",
>>>       "wt":"json"}},
>>>   "response":{"numFound":3,"start":0,"docs":[
>>>       {
>>>         "date":"1890-02-26",
>>>         "author":"Mary Baker Eddy",
>>>         "accession":"L13943",
>>>         "title":["Mary Baker Eddy to Joseph E. Adams,"]},
>>>       {
>>>         "date":"1896-01-13",
>>>         "author":"Mary Baker Eddy",
>>>         "accession":"L03453",
>>>         "title":["Mary Baker Eddy to Ira O. Knapp,"]},
>>>       {
>>>         "date":"1902-06-15",
>>>         "author":"Mary Baker Eddy",
>>>         "accession":"A10145",
>>>         "title":["Message of the Pastor Emeritus to The First Church of Christ, Scientist, Boston, Mass., June 15, 1902"]}]
>>>   },
>>>   "highlighting":{
>>>     "/home/scott/workspace/mbel-work/tei2html/build/web/L13943/L13943.html":{},
>>>     "/home/scott/workspace/mbel-work/tei2html/build/web/L03453/L03453.html":{},
>>>     "/home/scott/workspace/mbel-work/tei2html/build/web/A10145/A10145.html":{}}}
>>>
>>> When I ran the request, in the admin plugins/Stats I set "Watch Changes"
>>> before processing the request. Highlighting showed 2 changes, the
>>> gapFragmenter and HTMLFormatter
>>>
>>> here are the reported changes
>>>
>>> org.apache.solr.highlight.GapFragmenter
>>> class: org.apache.solr.highlight.GapFragmenter
>>> version: 5.2.1
>>> description: GapFragmenter
>>> stats: requests: Was: 117, Now: 156, Delta: 39
>>>
>>> org.apache.solr.highlight.HtmlFormatter
>>> class: org.apache.solr.highlight.HtmlFormatter
>>> version: 5.2.1
>>> description: HtmlFormatter
>>> stats: requests: Was: 117, Now: 156, Delta: 39
>>>
>>> Looks to me like there were 39 fragments or something processed, yet you
>>> can see above the highlights are empty {}???
>>>
>>> though all the other libraries in the highlighter showed no changes.
>>>
>>> which are these...
>>>
>>> org.apache.solr.highlight.BreakIteratorBoundaryScanner
>>> org.apache.solr.highlight.HtmlEncoder
>>> org.apache.solr.highlight.RegexFragmenter
>>> org.apache.solr.highlight.ScoreOrderFragmentsBuilder
>>> org.apache.solr.highlight.SimpleBoundaryScanner
>>> org.apache.solr.highlight.SimpleFragListBuilder
>>> org.apache.solr.highlight.SingleFragListBuilder
>>> org.apache.solr.highlight.WeightedFragListBuilder
>>>
>>> Scott
>>>
>>> -------- Original Message --------
>>> Subject: Highlighting, all matches show empty {}
>>> From: Scott Derrick <sc...@tnstaafl.net>
>>> To: solr-user@lucene.apache.org
>>> Date: 08/12/2015 08:20 AM
>>>
>>>> Tried submitting a field for hl.fl, still empty {}
>>>>
>>>> here are the query terms
>>>>
>>>> "responseHeader": {
>>>>   "status": 0,
>>>>   "QTime": 8,
>>>>   "params": {
>>>>     "q": "mary or calvin",
>>>>     "hl": "true",
>>>>     "hl.simple.post": "</em>",
>>>>     "indent": "true",
>>>>     "fl": "accession, title, author, date",
>>>>     "hl.fl": "*",
>>>>     "wt": "json",
>>>>     "hl.simple.pre": "<em>",
>>>>     "_": "1439388969240"
>>>>   }
>>>>
>>>> here is one of the responses, there were 135
>>>>
>>>> {
>>>>   "date": "1886-07-06",
>>>>   "author": "Mary Baker Eddy",
>>>>   "accession": "L02634",
>>>>   "title": [
>>>>     "Mary Baker Eddy to Josephine C. Woodbury, July 6, 1886"
>>>>   ]
>>>> },
>>>>
>>>> here is the highlight section listing the first 10 matches, still empty
>>>> {}
>>>>
>>>> "highlighting": {
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L02634/L02634.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10720/A10720.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L07894/L07894.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L09828/L09828.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10636D/A10636D.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L13943/L13943.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10879/A10879.html": {},
>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html": {}
>>>> }
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: Highlighting
>>>> From: Scott Derrick <sc...@tnstaafl.net>
>>>> To: solr-user@lucene.apache.org
>>>> Date: 08/12/2015 06:39 AM
>>>>
>>>>> I was pretty sure I tried that, though I thought if you don't specify
>>>>> it, it just uses the search terms?
>>>>>
>>>>> If I just search for "calvin" and don't specify a field, what do I
>>>>> assign hl.fl?
>>>>>
>>>>> Scott
>>>>>
>>>>> On 8/11/2015 7:27 PM, Erik Hatcher wrote:
>>>>>>
>>>>>> Scott - doesn’t look like you’ve specified hl.fl, which names the
>>>>>> field(s) to highlight.
>>>>>>
>>>>>> p.s. Erick Erickson surely likes your e-mail domain :)
>>>>>>
>>>>>> --
>>>>>> Erik Hatcher, Senior Solutions Architect
>>>>>> http://www.lucidworks.com
>>>>>>
>>>>>>> On Aug 11, 2015, at 9:02 PM, Scott Derrick <sc...@tnstaafl.net>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I guess I really don't get Highlighting in Solr.
>>>>>>>
>>>>>>> We are transitioning from Google Custom Search which generally sucks,
>>>>>>> but does return nicely formatted highlighted fragments.
>>>>>>>
>>>>>>> I turn highlighting on with hl=true in the query and I get a
>>>>>>> highlighting section returned at the bottom of the page, each entry
>>>>>>> identified by the document file name with an empty {}. It doesn't
>>>>>>> matter what I search for, plain text, a field, I get a list of
>>>>>>> documents followed by an empty brace?
>>>>>>>
>>>>>>> "highlighting": {
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10089/A10089.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10646/A10646.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./V03482/V03482.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./645A.66.043/645A.66.043.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./352.48.001/352.48.001.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./144.23.001/144.23.001.html": {},
>>>>>>>   "/home/scott/workspace/mbel-work/tei2html/build/web/./L18512/L18512.html": {}
>>>>>>> }
>>>>>>>
>>>>>>> I haven't made any changes to the default settings
>>>>>>>
>>>>>>> <highlighting>
>>>>>>>   <!-- Configure the standard fragmenter -->
>>>>>>>   <!-- This could most likely be commented out in the "default" case -->
>>>>>>>   <fragmenter name="gap"
>>>>>>>               default="true"
>>>>>>>               class="solr.highlight.GapFragmenter">
>>>>>>>     <lst name="defaults">
>>>>>>>       <int name="hl.fragsize">100</int>
>>>>>>>     </lst>
>>>>>>>   </fragmenter>
>>>>>>>
>>>>>>>   <!-- A regular-expression-based fragmenter
>>>>>>>        (for sentence extraction)
>>>>>>>     -->
>>>>>>>   <fragmenter name="regex"
>>>>>>>               class="solr.highlight.RegexFragmenter">
>>>>>>>     <lst name="defaults">
>>>>>>>       <!-- slightly smaller fragsizes work better because of slop -->
>>>>>>>       <int name="hl.fragsize">70</int>
>>>>>>>       <!-- allow 50% slop on fragment sizes -->
>>>>>>>       <float name="hl.regex.slop">0.5</float>
>>>>>>>       <!-- a basic sentence pattern -->
>>>>>>>       <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
>>>>>>>     </lst>
>>>>>>>   </fragmenter>
>>>>>>>
>>>>>>>   <!-- Configure the standard formatter -->
>>>>>>>   <formatter name="html"
>>>>>>>              default="true"
>>>>>>>              class="solr.highlight.HtmlFormatter">
>>>>>>>     <lst name="defaults">
>>>>>>>       <str name="hl.simple.pre"><![CDATA[<em>]]></str>
>>>>>>>       <str name="hl.simple.post"><![CDATA[</em>]]></str>
>>>>>>>     </lst>
>>>>>>>   </formatter>
>>>>>>>
>>>>>>>   <!-- Configure the standard encoder -->
>>>>>>>   <encoder name="html"
>>>>>>>            class="solr.highlight.HtmlEncoder" />
>>>>>>>
>>>>>>>   <!-- Configure the standard fragListBuilder -->
>>>>>>>   <fragListBuilder name="simple"
>>>>>>>                    class="solr.highlight.SimpleFragListBuilder"/>
>>>>>>>
>>>>>>>   <!-- Configure the single fragListBuilder -->
>>>>>>>   <fragListBuilder name="single"
>>>>>>>                    class="solr.highlight.SingleFragListBuilder"/>
>>>>>>>
>>>>>>>   <!-- Configure the weighted fragListBuilder -->
>>>>>>>   <fragListBuilder name="weighted"
>>>>>>>                    default="true"
>>>>>>>                    class="solr.highlight.WeightedFragListBuilder"/>
>>>>>>>
>>>>>>>   <!-- default tag FragmentsBuilder -->
>>>>>>>   <fragmentsBuilder name="default"
>>>>>>>                     default="true"
>>>>>>>                     class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>>>     <!--
>>>>>>>     <lst name="defaults">
>>>>>>>       <str name="hl.multiValuedSeparatorChar">/</str>
>>>>>>>     </lst>
>>>>>>>     -->
>>>>>>>   </fragmentsBuilder>
>>>>>>>
>>>>>>>   <!-- multi-colored tag FragmentsBuilder -->
>>>>>>>   <fragmentsBuilder name="colored"
>>>>>>>                     class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>>>     <lst name="defaults">
>>>>>>>       <str name="hl.tag.pre"><![CDATA[
>>>>>>>            <b style="background:yellow">,<b style="background:lawgreen">,
>>>>>>>            <b style="background:aquamarine">,<b style="background:magenta">,
>>>>>>>            <b style="background:palegreen">,<b style="background:coral">,
>>>>>>>            <b style="background:wheat">,<b style="background:khaki">,
>>>>>>>            <b style="background:lime">,<b style="background:deepskyblue">]]></str>
>>>>>>>       <str name="hl.tag.post"><![CDATA[</b>]]></str>
>>>>>>>     </lst>
>>>>>>>   </fragmentsBuilder>
>>>>>>>
>>>>>>>   <boundaryScanner name="default"
>>>>>>>                    default="true"
>>>>>>>                    class="solr.highlight.SimpleBoundaryScanner">
>>>>>>>     <lst name="defaults">
>>>>>>>       <str name="hl.bs.maxScan">10</str>
>>>>>>>       <str name="hl.bs.chars">.,!? 	 </str>
>>>>>>>     </lst>
>>>>>>>   </boundaryScanner>
>>>>>>>
>>>>>>>   <boundaryScanner name="breakIterator"
>>>>>>>                    class="solr.highlight.BreakIteratorBoundaryScanner">
>>>>>>>     <lst name="defaults">
>>>>>>>       <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
>>>>>>>       <str name="hl.bs.type">WORD</str>
>>>>>>>       <!-- language and country are used when constructing Locale object. -->
>>>>>>>       <!-- And the Locale object will be used when getting instance of BreakIterator -->
>>>>>>>       <str name="hl.bs.language">en</str>
>>>>>>>       <str name="hl.bs.country">US</str>
>>>>>>>     </lst>
>>>>>>>   </boundaryScanner>
>>>>>>> </highlighting>
>>>>>>
>>>>>
>>>>
>>>
>>> --
>>> One man's "magic" is another man's engineering. "Supernatural" is a null
>>> word.
>>> Robert A. Heinlein
>>>
>>
>
> --
> He who knows others is wise;
> He who knows himself is enlightened.
> Lao-tzu
>
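
For reference, the "start with one field you _know_ is stored" advice from the thread can be sketched in a few lines of Python. This is only an illustration, not something posted by anyone in the thread: the helper names `build_highlight_url` and `docs_without_highlights` are made up, the base URL and field names are copied from Scott's request, and whether `title` or `author` are actually stored and analyzed in a given schema is an assumption you would need to check.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, highlight_fields, return_fields):
    """Build a Solr select URL that asks for highlights on named fields.

    Hypothetical helper: it only assembles standard request parameters
    (q, fl, wt, hl, hl.fl); it does not contact Solr.
    """
    params = {
        "q": query,
        "fl": ",".join(return_fields),
        "wt": "json",
        "hl": "true",
        # Name specific fields instead of "*"; each one must be stored
        # for the highlighter to return a snippet.
        "hl.fl": ",".join(highlight_fields),
    }
    return base_url + "?" + urlencode(params)


def docs_without_highlights(response):
    """Return the document ids whose highlighting entry is empty ({})."""
    highlighting = response.get("highlighting", {})
    return [doc_id for doc_id, frags in highlighting.items() if not frags]


if __name__ == "__main__":
    # Field names are assumptions taken from the fl list in the thread.
    url = build_highlight_url(
        "http://localhost:8983/solr/mbepp/select",
        "concord",
        highlight_fields=["title", "author"],
        return_fields=["accession", "title", "author", "date"],
    )
    print(url)
```

If every id still comes back from `docs_without_highlights`, that would suggest the matches are landing in a field that is not stored (such as `_text_` in this thread) rather than in the fields named in hl.fl.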