Hi Erick, I think you are right!
When I use the form 'features:accents' (in my case 'content:nietava'), it shows no matching documents... but if I drop the field name and use only 'q=searchword' (q=nietava), it brings back the PDF file's content, as below (XML output):

#partial snip:
<arr name="content">
<str> Microsoft Word - André Luiz - Sexo e Destino _Chico e Waldo_.doc Francisco Cândido Xavier e Waldo Vieira Sexo e Destino 12o livro da Coleção “A Vida no Mundo Espiritual” Ditado pelo Espírito André Luiz FEDERAÇÃO ESPÍRITA BRASILEIRA DEPARTAMENTO EDITORIAL Rua Souza Valente, 17 20941-040 - Rio - RJ - Brasil http://www.febnet.org.br/ Francisco Cândido Xavier - Sexo e Destino - pelo Espírito André Luiz 2 Coleção “A Vida no Mundo Espiritual” 01 - Nosso Lar 02 - Os Mensageiros 03 - Missionários da Luz 04 - Obreiros da Vida Eterna 05 - No Mundo Maior 06 - Libertação 07 - Entre a Terra e o Céu 08 - Nos Domínios da Mediunidade 09 - Ação e Reação 10 - Evolução em Dois Mundos 11 - Mecanismos da Mediunidade 12 - Sexo e Destino 13 - E a Vida Continua... Francisco Cândid

So, using:

1. q=content:nietava&hl=true&hl.fl=content -> results:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">3</int>
    <lst name="params">
      <str name="q">content:nietava</str>
      <str name="hl">true</str>
      <str name="hl.fl">content</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="highlighting"/>
</response>

2. q=nietava&hl=true&hl.fl=content -> results:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">93</int>
    <lst name="params">
      <str name="q">nietava</str>
      <str name="hl">true</str>
      <str name="hl.fl">content</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">pdf1</str>
      <date name="last_modified">2011-07-28T20:39:26Z</date>
      <arr name="title">
        <str> Microsoft Word - André Luiz - Sexo e Destino _Chico e Waldo_.doc </str>
      </arr>
      <arr name="content_type">
        <str>application/pdf</str>
      </arr>
      <str name="author">Wander</str>
      <str name="author_s">Wander</str>
      <arr name="content">
        <str> Microsoft Word - André Luiz - Sexo e Destino _Chico e Waldo_.doc Francisco Cândido Xavier e Waldo Vieira Sexo e Destino 12o livro da Coleção “A Vida no Mundo Espiritual” Ditado pelo Espírito André Luiz FEDERAÇÃO ESPÍRITA BRASILEIRA DEPARTAMENTO EDITORIAL Rua Souza Valente, 17 20941-040 - Rio - RJ - Brasil http://www.febnet.org.br/ Francisco Cândido Xavier - Sexo e Destino - pelo Espírito André Luiz 2 Coleção “A Vida no Mundo Espiritual” 01 - Nosso Lar 02 - Os Mensageiros 03 - Missionários da Luz 04 - Obreiros da Vida Eterna 05 - No Mundo Maior 06 - Libertação 07 - Entre a Terra e o Céu 08 - Nos Domínios da Mediunidade 09 - Ação e Reação 10 - Evolução em Dois Mundos 11 - Mecanismos da Mediunidade 12 - Sexo e Destino 13 - E a Vida Continua... Francisco Cândido Xavier - ...........(long text... including the word 'nietava') </str>
      </arr>
      <long name="_version_">1520731379641352192</long>
    </doc>
  </result>
  <lst name="highlighting">
    <lst name="pdf1"/>
  </lst>
</response>

....
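A possible reading of the two results, going only by the schema.xml lines quoted further down in this thread: the content field is declared indexed="false" and stored="true", with a copyField into text. If that is the schema actually in use, content:nietava has no index to search, while a bare q=nietava runs against the default field (assumed here to be text, which is not confirmed in the thread). A minimal Python sketch that reads those flags from the quoted definitions; the wrapping <schema> element and the helper names are assumptions added only to make the fragment parseable:

```python
import xml.etree.ElementTree as ET

# Field and copyField definitions as quoted from schema.xml in this thread.
# The <schema> root element is added here only so the fragment parses.
SCHEMA_FRAGMENT = """
<schema>
  <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/>
  <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
  <copyField source="content" dest="text"/>
  <copyField source="content_type" dest="text"/>
</schema>
"""


def field_flags(schema_xml):
    """Return {field name: (indexed, stored)} for every <field> element."""
    root = ET.fromstring(schema_xml)
    return {
        field.get("name"): (
            field.get("indexed") == "true",
            field.get("stored") == "true",
        )
        for field in root.iter("field")
    }


def copy_targets(schema_xml, source):
    """Return the dest of every <copyField> whose source is `source`."""
    root = ET.fromstring(schema_xml)
    return [
        copy.get("dest")
        for copy in root.iter("copyField")
        if copy.get("source") == source
    ]


flags = field_flags(SCHEMA_FRAGMENT)
indexed, stored = flags["content"]
print("content indexed:", indexed, "stored:", stored)
print("content is copied to:", copy_targets(SCHEMA_FRAGMENT, "content"))
```

This only explains the numFound difference between the two queries; it does not by itself say why the highlighting section stays empty.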
=(

Thanks!

*Evert*

2015-12-16 15:17 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:

> Ok, you're getting confused by all the options, an easy thing to do.
> You're trying to do too many things at once without making sure
> the basics work....
>
> 1> Forget all about the f.content.hl.... stuff. That's there in case
> you want to specify different parameters for different fields in the same
> highlight request. That's an advanced option for later....
>
> 2> start with the basic techproducts example. Then this should show
> you highlights:
> q=features:accents&hl=true&hl.fl=features
>
> That's about as basic as you get. It's searching for "accents" in the
> features field and returning highlights on the features field.
>
> Once that's working, _then_ refine.
>
> Best,
> Erick
>
> On Wed, Dec 16, 2015 at 8:21 AM, Evert R. <evert.ra...@gmail.com> wrote:
> > Hi Andrea,
> >
> > ok, let's do it:
> >
> > 1. it does have the 'nietava' term, so it brings the only book (pdf file) that
> > has this word, and all its content as in my previous message to Erick, so the
> > content field is there.
> >
> > 2. using content:nietava it does not show any result.... as below:
> >
> > {
> >   "responseHeader": {
> >     "status": 400,
> >     "QTime": 12,
> >     "params": {
> >       "q": "contents:nietava",
> >       "indent": "true",
> >       "fl": "id",
> >       "wt": "json",
> >       "_": "1450282631352"
> >     }
> >   },
> >   "error": {
> >     "msg": "undefined field contents",
> >     "code": 400
> >   }
> > }
> >
> > 3. Here is what I found when grepping 'content' from the techproducts conf
> > folder:
> >
> > schema.xml: <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/>
> > schema.xml: <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
> > schema.xml: <copyField source="content" dest="text"/>
> > schema.xml: <copyField source="content_type" dest="text"/>
> > solrconfig.xml: <str name="facet.field">content_type</str>
> > solrconfig.xml: <str name="hl.fl">content features title name</str>
> > solrconfig.xml: <str name="f.content.hl.snippets">3</str>
> > solrconfig.xml: <str name="f.content.hl.fragsize">200</str>
> > solrconfig.xml: <str name="f.content.hl.alternateField">content</str>
> > solrconfig.xml: <str name="f.content.hl.maxAlternateFieldLength">750</str>
> > solrconfig.xml: <str name="stream.contentType">application/json</str>
> > solrconfig.xml: <str name="stream.contentType">application/csv</str>
> > solrconfig.xml: <str name="content-type">text/plain; charset=UTF-8</str>
> >
> > and the grep on 'content_type':
> >
> > schema.xml: <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/>
> > schema.xml: <copyField source="content_type" dest="text"/>
> > solrconfig.xml: <str name="facet.field">content_type</str>
> >
> > =)
> >
> > Thanks for checking it out.
> >
> > *Evert*
> >
> > 2015-12-16 12:59 GMT-02:00 Andrea Gazzarini <a.gazzar...@gmail.com>:
> >
> >> hl=f.content.hl.content (I guess) is definitely wrong. Some questions:
> >>
> >> - First, sorry, the obvious question: are you sure the documents contain
> >> the "nietava" term?
> >> - Could you try to use q=content:nietava?
> >> - Could you paste the definition (field & fieldtype) of the content
> >> field?
> >>
> >> > Should I have this configuration in the XML file?
> >>
> >> You could, but it's up to you and it strongly depends on your context.
> >> The simple thing is that if you have those parameters within the configuration
> >> you can avoid passing them (as part of the requests), but probably in this
> >> phase, where you are testing, it's better to have them there (in the
> >> request).
> >>
> >> Andrea
> >>
> >> 2015-12-16 15:28 GMT+01:00 Evert R. <evert.ra...@gmail.com>:
> >>
> >> > Hi Andrea,
> >> >
> >> > Thanks for the reply!
> >> >
> >> > I tried with the hl.fl parameter as well, using as below:
> >> >
> >> > http://localhost:8983/solr/techproducts/select?q=nietava&fl=id%2C+content&wt=json&indent=true&hl=true&hl.fl=f.content.hl.content%3D4&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
> >> >
> >> > with the parameter under the hl field in the solr ui:
> >> >
> >> > 1. f.content.hl.snnipets=2
> >> > 2. f.content.hl.content=4
> >> > 3. content
> >> >
> >> > with no success...
> >> >
> >> > Should I have this configuration in the XML file?
> >> >
> >> > Regards,
> >> >
> >> > *Evert*
> >> >
> >> > 2015-12-16 11:23 GMT-02:00 Andrea Gazzarini <a.gazzar...@gmail.com>:
> >> >
> >> > > Hi Evert,
> >> > > what is the configuration of the default request handler? Did you set the
> >> > > hl.fl parameter?
> >> > >
> >> > > Please check here [1] the parameters that the highlighting component
> >> > > expects. Required parameters should be in the query string or declared
> >> > > within the request handler which answers your query.
> >> > >
> >> > > Andrea
> >> > >
> >> > > [1] https://wiki.apache.org/solr/HighlightingParameters
> >> > >
> >> > > 2015-12-16 12:51 GMT+01:00 Evert R. <evert.ra...@gmail.com>:
> >> > >
> >> > > > Hi everyone!
> >> > > >
> >> > > > I think I should not have posted my server name... never had that many
> >> > > > access attempts...
> >> > > >
> >> > > > 2015-12-16 9:03 GMT-02:00 Evert R. <evert.ra...@gmail.com>:
> >> > > >
> >> > > > > Hello Erick,
> >> > > > >
> >> > > > > Thanks again for your time.
> >> > > > >
> >> > > > > Here is as far as I have gone:
> >> > > > >
> >> > > > > 1. I started a fresh install and did the following:
> >> > > > >
> >> > > > > [evert@nix]$ bin/solr start -e techproducts
> >> > > > > [evert@nix]$ curl 'http://localhost:8983/solr/techproducts/update/extract?literal.id=pdf1&commit=true' -F "Emmanuel=@/home/solr/dados/teste/Emmanuel.pdf"
> >> > > > >
> >> > > > > 2. I am using only the Solr Admin UI to check the query response, here is
> >> > > > > an example:
> >> > > > >
> >> > > > > Query: http://localhost:8983/solr/techproducts/select?q=nietava&fl=id%2C+author%2C+content&wt=json&indent=true&hl=true&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
> >> > > > >
> >> > > > > Result: {
> >> > > > >   "responseHeader": {
> >> > > > >     "status": 0,
> >> > > > >     "QTime": 14,
> >> > > > >     "params": {
> >> > > > >       "q": "nietava",
> >> > > > >       "hl": "true",
> >> > > > >       "hl.simple.post": "</em>",
> >> > > > >       "indent": "true",
> >> > > > >       "fl": "id, author, content",
> >> > > > >       "wt": "json",
> >> > > > >       "hl.simple.pre": "<em>",
> >> > > > >       "_": "1450262674102"
> >> > > > >     }
> >> > > > >   },
> >> > > > >   "response": {
> >> > > > >     "numFound": 1,
> >> > > > >     "start": 0,
> >> > > > >     "docs": [
> >> > > > >       {
> >> > > > >         "id": "pdf1",
> >> > > > >         "author": "Wander",
> >> > > > >         "content": [
> >> > > > >           "André Luiz - Sexo e Destino _Chico e Waldo_.doc \n \n \n Francisco Cândido Xavier \ne \n \n Waldo Vieira \n \n \n \n \n Sexo e Destino \n \n \n \n 12o livro da Coleção \n“A Vida no Mundo Espiritual” \n \n \n \n \n \n Ditado pelo Espírito \nAndré Luiz \n \n \n \n \n \n \n \n \n FEDERAÇÃO ESPÍRITA BRASILEIRA \nDEPARTAMENTO EDITORIAL \n \n Rua Souza Valente, 17 \n20941-040 - Rio - RJ - Brasil \n \n \nhttp://www.febnet.org.br/ \n \n \n \n Francisco Cândido Xavier - Sexo e Destino - pelo Espírito André Luiz \n \n \n2 \n \n \n \n \n \n Coleção \n“A Vida no Mundo Espiritual” \n"
> >> > > > >         ]
> >> > > > >       }
> >> > > > >     ]
> >> > > > >   },
> >> > > > >   "highlighting": {
> >> > > > >     "pdf1": {}
> >> > > > >   }
> >> > > > > }
> >> > > > >
> >> > > > > **In the content it brings the whole PDF content (the book); notice that
> >> > > > > the highlighting section is empty.
> >> > > > >
> >> > > > > I tried creating a new core with bin/solr create -c test, using the
> >> > > > > standard schema.xml and solrconfig.xml found in
> >> > > > > /solr/server/solr/configsets/basic_configs/conf
> >> > > > >
> >> > > > > But even so... it is not working as expected (I think).
> >> > > > >
> >> > > > > Would you know how to set up this techproducts example to bring the
> >> > > > > snippets of text?
> >> > > > >
> >> > > > > The server only allows specific IP addresses on this port; if you would
> >> > > > > like, I could open it for you to check.
> >> > > > >
> >> > > > > Thanks again and best regards!
> >> > > > >
> >> > > > > *Evert*
> >> > > > >
> >> > > > > 2015-12-15 18:14 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
> >> > > > >
> >> > > > >> No, that's not what I meant. The highlight component adds a special
> >> > > > >> section to the return packet that will contain "snippets" of text with
> >> > > > >> highlights. You control how big those snippets are via various
> >> > > > >> parameters in the highlight component and they'll have the tags you
> >> > > > >> specify for highlighting.
> >> > > > >>
> >> > > > >> Your app needs to pull the information from the highlight portion of
> >> > > > >> the response packet rather than the document list. Just execute your
> >> > > > >> queries via cURL or a browser to see the structure of a response to
> >> > > > >> see what I mean.
> >> > > > >>
> >> > > > >> And note that you do _not_ need to return the fields you're
> >> > > > >> highlighting in the "fl" list so you do _not_ need to return the
> >> > > > >> entire document contents.
> >> > > > >>
> >> > > > >> What are you using to display the results anyway?
> >> > > > >>
> >> > > > >> Best,
> >> > > > >> Erick
> >> > > > >>
> >> > > > >> On Tue, Dec 15, 2015 at 10:02 AM, Evert R. <evert.ra...@gmail.com> wrote:
> >> > > > >> > Hi Erick,
> >> > > > >> >
> >> > > > >> > Thank you very much for the reply!!
> >> > > > >> >
> >> > > > >> > I do get back the full text, author, and a whole lot of stuff which
> >> > > > >> > doesn't really matter for my project.
> >> > > > >> >
> >> > > > >> > So, what you are saying is that Solr gets me back the full content and
> >> > > > >> > my application will handle the rest? Which means that for all my books
> >> > > > >> > (pdf files), searching for a specific word will bring me the whole
> >> > > > >> > content of each book that matches the query, and my application (PHP in
> >> > > > >> > this case) will take care of showing only part of the text (as in
> >> > > > >> > highlighting, as I understood it) and highlighting the keyword I was
> >> > > > >> > looking for?
> >> > > > >> >
> >> > > > >> > If so, Erick, you gave me a big help clearing this up... I thought I
> >> > > > >> > would do that with Solr in an easy way. =)
> >> > > > >> >
> >> > > > >> > Thanks for the attachments tip!
> >> > > > >> >
> >> > > > >> > Best regards,
> >> > > > >> >
> >> > > > >> > Evert
> >> > > > >> >
> >> > > > >> > 2015-12-15 14:56 GMT-02:00 Erick Erickson <erickerick...@gmail.com>:
> >> > > > >> >
> >> > > > >> >> How are you trying to display the results? Highlighting is a bit of an
> >> > > > >> >> odd beast. Assuming it's correctly configured, the response packet
> >> > > > >> >> will have a separate highlight section, it's the application's
> >> > > > >> >> responsibility to present that pleasingly.
> >> > > > >> >>
> >> > > > >> >> What _do_ you get back in the response?
> >> > > > >> >>
> >> > > > >> >> BTW, the mail server pretty aggressively strips attachments,
> >> > > > >> >> yours didn't come through.
> >> > > > >> >>
> >> > > > >> >> Best,
> >> > > > >> >> Erick
> >> > > > >> >>
> >> > > > >> >> On Tue, Dec 15, 2015 at 3:25 AM, Evert R. <evert.ra...@gmail.com> wrote:
> >> > > > >> >> > Hi there!
> >> > > > >> >> >
> >> > > > >> >> > It's my first installation, not sure if this is the right channel...
> >> > > > >> >> >
> >> > > > >> >> > Here are my steps:
> >> > > > >> >> >
> >> > > > >> >> > 1. Set up a basic install of solr 5.4.0
> >> > > > >> >> >
> >> > > > >> >> > 2. Create a new core through command line (bin/solr create -c test)
> >> > > > >> >> >
> >> > > > >> >> > 3. Post 2 files: 1 .docx and 2 .pdf (bin/post -c test /docs/test/)
> >> > > > >> >> >
> >> > > > >> >> > 4. Query over the browser and it brings the correct search, but it
> >> > > > >> >> > does not show the part of the text I am querying, the highlight.
> >> > > > >> >> >
> >> > > > >> >> > I have already flagged the 'hl' option. But still it does not work...
> >> > > > >> >> >
> >> > > > >> >> > Example: I am looking for the word 'peace' in my pdf file (book). I
> >> > > > >> >> > have 4 matches for this word; it shows me the book name (pdf file) but
> >> > > > >> >> > does not bring which part of the text has the word peace in it.
> >> > > > >> >> >
> >> > > > >> >> > I am probably missing some configuration in schema.xml, which is
> >> > > > >> >> > missing from my folder.... /solr/server/solr/test/conf/
> >> > > > >> >> >
> >> > > > >> >> > Or even the solrconfig.xml...
> >> > > > >> >> >
> >> > > > >> >> > I have read a bunch of things about highlighting, checked these files,
> >> > > > >> >> > copied the standard schema.xml to my core/conf folder, but still it
> >> > > > >> >> > does not bring the highlight.
> >> > > > >> >> >
> >> > > > >> >> > Attached a copy of my solrconfig.xml file.
> >> > > > >> >> >
> >> > > > >> >> > I am very sorry for this, probably, dumb and too basic question...
> >> > > > >> >> > First time I see solr live.
> >> > > > >> >> >
> >> > > > >> >> > Any help will be appreciated.
> >> > > > >> >> >
> >> > > > >> >> > Best regards,
> >> > > > >> >> >
> >> > > > >> >> > Evert Ramos
> >> > > > >> >> >
> >> > > > >> >> > evert.ra...@gmail.com
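
Erick's point in this thread, that the application should read snippets from the separate highlighting section of the response rather than from the document list, can be sketched roughly as follows. This is a minimal Python illustration, not the PHP application discussed above; the function name is hypothetical and the snippet text in the sample response is made up, since the thread only ever shows an empty "pdf1": {} entry. It assumes the JSON layout seen in the responses quoted here, with highlighting keyed by document id and then by field name.

```python
import json

# Illustrative response shaped like the JSON quoted in this thread.
# The snippet under "highlighting" is invented for the example; it is
# not real Solr output.
SAMPLE_RESPONSE = json.loads("""
{
  "response": {"numFound": 1, "start": 0, "docs": [{"id": "pdf1"}]},
  "highlighting": {
    "pdf1": {"content": ["... text with <em>nietava</em> in it ..."]}
  }
}
""")


def snippets_by_doc(solr_response, field):
    """Map each returned document id to its highlight snippets for `field`.

    Documents with no highlighting entry (or an empty one, as in the
    thread) map to an empty list.
    """
    highlighting = solr_response.get("highlighting", {})
    result = {}
    for doc in solr_response["response"]["docs"]:
        doc_id = doc["id"]
        result[doc_id] = highlighting.get(doc_id, {}).get(field, [])
    return result


for doc_id, snippets in snippets_by_doc(SAMPLE_RESPONSE, "content").items():
    print(doc_id, snippets)
```

Because the snippets come from the highlighting section, the request can ask for fl=id only and skip returning the full book text in the document list.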