Displaying a particular field on result
Hi, I'm sorry if this is a frequently asked question. In Solr's default schema.xml file, an author field is defined as follows:
<field name="author" type="text_general" stored="true" indexed="true"/>
But this field does not seem to be parsed (by Nutch) or indexed (by Solr). My queries always return a null result for the author field, even though some documents (PDFs) have author content. How can I display it? What did I miss preparing during fetch and parsing? Are there any documents or links about this issue? Thanks in advance. -- wassalam, [bayu]
Re: Displaying a particular field on result
Hi Bayu, I think this is a Nutch question, no? Ahmet
Re: Displaying a particular field on result
Hi Ahmet, I am just referring to Solr's schema.xml, which contains this field definition (the author field, for example), and to the Solr query results I got through the Solr Admin page, which did not return the author field. CMIIW. Thanks. -- wassalam, [bayu]
Re: Displaying a particular field on result
Are you looking for the 'fl' parameter by any chance: https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefl(FieldList)Parameter ? It's in the Admin UI as well. If not, then you really do need to rephrase your question, maybe by giving a very specific example. Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
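As a rough illustration of the 'fl' parameter mentioned above, the following Python sketch builds a /select URL that asks Solr to return only the id and author fields. The host, port, and core name (`collection1`) are assumptions for illustration, as is the helper name `build_select_url`; adjust them to your installation.

```python
from urllib.parse import urlencode

# Assumption for illustration: a local Solr core named "collection1".
SOLR_CORE_URL = "http://localhost:8983/solr/collection1"


def build_select_url(query, fields, rows=10):
    """Build a /select URL that returns only the listed fields."""
    params = {
        "q": query,
        "fl": ",".join(fields),  # field list: comma-separated field names
        "rows": rows,
        "wt": "json",
    }
    return SOLR_CORE_URL + "/select?" + urlencode(params)


# Match all documents, but return only the id and author fields.
url = build_select_url("*:*", ["id", "author"])
print(url)
```

Note that 'fl' can only return fields that are stored and that actually received a value at index time, so an empty result here does not by itself say where the value was lost.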
Re: Displaying a particular field on result
Hi, I don't know Nutch, but I will answer as if you were using Solr Cell: http://wiki.apache.org/solr/ExtractingRequestHandler When a PDF file is sent to the extracting request handler, several pieces of metadata are extracted from it and assigned to fields. I usually enable the dynamic field * to capture all metadata and see the associated field names and values. Afterwards I select the useful ones (and define them in schema.xml, as you did for author) and forward the remaining ones to an ignored dynamic field. The wiki page has all the information needed to manipulate the metadata generated by extraction. Hope this helps.
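To make the catch-all approach above more concrete, here is a small Python sketch that models how an incoming field name would be matched against a schema: explicit fields first, then dynamic field patterns. The schema fragment is hypothetical, and the matching rule (longest pattern wins, `*` only at the start or end of a pattern) is a simplified model rather than Solr's actual implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml fragment: one explicit field, an "ignored_*"
# pattern, and a stored catch-all "*" used only while exploring metadata.
SCHEMA_FRAGMENT = """
<fields>
  <field name="author" type="text_general" stored="true" indexed="true"/>
  <dynamicField name="ignored_*" type="ignored" multiValued="true"/>
  <dynamicField name="*" type="text_general" stored="true" indexed="true"
                multiValued="true"/>
</fields>
"""


def pattern_matches(pattern, field_name):
    """Match a pattern that has a single '*' at its start or its end."""
    if pattern.startswith("*"):
        return field_name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return field_name.startswith(pattern[:-1])
    return pattern == field_name


def resolve_field(schema_xml, field_name):
    """Return the name of the definition that would accept field_name."""
    root = ET.fromstring(schema_xml)
    # Explicitly declared fields take priority over dynamic ones.
    for field in root.findall("field"):
        if field.get("name") == field_name:
            return field.get("name")
    # Simplified model: try longer (more specific) patterns first.
    patterns = sorted(
        (d.get("name") for d in root.findall("dynamicField")),
        key=len,
        reverse=True,
    )
    for pattern in patterns:
        if pattern_matches(pattern, field_name):
            return pattern
    return None  # no definition accepts this field name


for name in ["author", "ignored_producer", "meta_author"]:
    print(name, "->", resolve_field(SCHEMA_FRAGMENT, name))
```

In this model, a field arriving under an unexpected name (the made-up `meta_author` here) falls through to the stored `*` definition, which is what makes it visible in query results while you work out which names are really being sent.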
Re: Displaying a particular field on result
Hi Alexandre, I have already played with the fl parameter in the Admin UI, but the result is not what I expected. From what I understand, the Solr database structure is defined in Solr's schema.xml. In that file we define, for example, an author field to store author content in the Solr database. Even when I put author in the fl parameter in the Admin UI, the query never shows its contents, even though I have (PDF/doc) documents with author content. How can I display that field? Or, taking a step back, how can I check that the field is actually stored in Solr? -- wassalam, [bayu]
Re: Displaying a particular field on result
OK, the question, if I understand it now, was: I am importing data from Nutch into Solr. One of the fields is author and I have defined it in Solr's schema.xml. Unfortunately, it is always empty when I check the records in Solr's Admin UI. How can I confirm that the field was actually indexed into Solr? In that case, Bayu's answer should solve it. Nutch is either not extracting the field or trying to push it to Solr under the wrong name. Try switching the * field definition to catch it (stored=true). Alternatively, disable the * definition altogether and see which fields fail (there might be a lot of them, though). Regards, Alex.
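One way to check both possibilities (field never extracted, or sent under a different name) can be sketched in Python as follows. The first helper builds a query that counts documents with any value in a given indexed field, using the range query `field:[* TO *]`; the second tallies which field names actually appear in returned documents. The core URL, the helper names, and the sample response (including the field name `meta_author`) are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumption for illustration: a local Solr core named "collection1".
SOLR_CORE_URL = "http://localhost:8983/solr/collection1"


def has_field_url(field):
    """URL whose numFound counts documents with any value in an indexed field."""
    params = {"q": f"{field}:[* TO *]", "rows": 0, "wt": "json"}
    return SOLR_CORE_URL + "/select?" + urlencode(params)


def count_field_names(response):
    """Count how many returned documents carry each stored field name."""
    counts = Counter()
    for doc in response["response"]["docs"]:
        counts.update(doc.keys())
    return counts


# Hand-written sample of a parsed JSON response, not real Solr output.
sample_response = {
    "response": {
        "numFound": 2,
        "docs": [
            {"id": "a.pdf", "title": "A", "meta_author": "Someone"},
            {"id": "b.pdf", "title": "B"},
        ],
    }
}

print(has_field_url("author"))
print(count_field_names(sample_response))
```

If the count for author stays at zero while some other author-like name shows up in the tally, the value is reaching Solr under a different field name; if nothing author-like appears at all, the extraction side is the more likely place to look.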
Re: Displaying a particular field on result
I meant, of course, Ahmet's answer. Sorry, both. Regards, Alex.
Re: Displaying a particular field on result
Thank you Alexandre! I will check my configurations again. -- wassalam, [bayu]