Re: Is Solr a good candidate to index 100s of nodes in one XML file?
I got the RSS DIH example to work with my own RSS feed and it works great - thanks for the help.
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
Thanks for the input. I think another benefit of using Solr is that it gives me a REST API for searching the indexed records.

Regards,
Joe
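To make the REST-style search concrete, here is a minimal sketch of the kind of request URL involved: search one field and ask Solr to return only certain fields of the matching records. The host, the core name `rss`, and the field names are assumptions for illustration only; `q`, `fl`, `rows` and `wt` are standard Solr select parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, core, field, term, return_fields, rows=10):
    """Build a Solr select URL that searches one field and returns chosen fields.

    The term is not escaped for Lucene query syntax here; a real client
    should escape special characters such as ':' or '(' in user input.
    """
    params = {
        "q": f"{field}:{term}",          # search this field for this term
        "fl": ",".join(return_fields),   # fields to return for each hit
        "rows": rows,                    # maximum number of hits
        "wt": "json",                    # response format
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical host, core, and field names.
url = build_select_url(
    "http://localhost:8983", "rss", "title", "solr", ["title", "link"]
)
print(url)
```

Fetching that URL with any HTTP client should return the matching documents as JSON, assuming a core with those fields exists.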
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
Thanks. I am looking at the RSS DIH example right now.
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
On 1/21/2015 12:53 PM, Carl Roberts wrote:
> Is Solr overkill in this case? Should I just use Lucene instead?

Effectively, Solr *is* Lucene. You edit configuration files instead of writing Lucene code, because Solr is a fully customizable search server, not a programming API. That also means that it's not as flexible as Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be able to write an application that is highly tailored to your situation and that will have better performance than Solr ... but you'll be writing the entire program yourself. Solr lets you install an existing program and just change the configuration.

Thanks,
Shawn
Re: Is Solr a good candidate to index 100s of nodes in one XML file?
Solr is just fine for this. It even ships with an example of how to read an RSS file under the DIH (DataImportHandler) directory. DIH is also most likely what you will use for the first implementation.

You don't need to worry about StAX or anything, unless your file format is very weird or has overlapping namespaces (the DIH XML parser does not care about namespaces).

Regards,
Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/
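Once a DIH configuration such as the shipped RSS example is in place, the import is normally started over HTTP. The sketch below only builds the request URLs. It assumes the handler is registered at `/dataimport` (the path depends on solrconfig.xml) and uses a hypothetical host and core name `rss`; `command`, `clean` and `commit` are DIH request parameters.

```python
from urllib.parse import urlencode


def build_dih_url(base_url, core, command="full-import", **options):
    """Build a DataImportHandler request URL for the given command."""
    params = {"command": command}
    params.update(options)  # e.g. clean="true", commit="true"
    return f"{base_url}/solr/{core}/dataimport?{urlencode(params)}"


# Hypothetical host and core name.
import_url = build_dih_url(
    "http://localhost:8983", "rss", clean="true", commit="true"
)
status_url = build_dih_url("http://localhost:8983", "rss", command="status")

print(import_url)
print(status_url)
```

Requesting the first URL would start a full import; the second reports progress while the import runs.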
Is Solr a good candidate to index 100s of nodes in one XML file?
Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes, with several elements in each node that I have to index, so I was planning to parse the XML with StAX, extract the data from each node, and add it to Solr. There will always be only one file to start with, and then a second file as the RSS feed supplies updates. I want to return certain fields of each node when I search certain fields of the same node.

Is Solr overkill in this case? Should I just use Lucene instead?

Regards,
Joe
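For comparison with the DIH route suggested in the replies, the parse-and-add plan described in this message can be sketched roughly as follows. This is an illustration only: it uses Python's `xml.etree.ElementTree` rather than StAX, and the sample feed, the field names, the choice of `link` as the unique `id`, and the host and core name are all assumptions. It relies on Solr's JSON update handler accepting a JSON array of documents.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET

# Tiny stand-in for the real feed (hypothetical content).
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
      <description>Text of the first entry</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2</link>
      <description>Text of the second entry</description>
    </item>
  </channel>
</rss>"""

# Elements of each node to index (assumed field names).
FIELDS = ("title", "link", "description")


def rss_items_to_docs(rss_text):
    """Turn each <item> node into one flat dict, i.e. one Solr document."""
    root = ET.fromstring(rss_text)
    docs = []
    for item in root.iter("item"):
        doc = {"id": item.findtext("link")}  # assumption: link is unique
        for name in FIELDS:
            value = item.findtext(name)
            if value is not None:
                doc[name] = value.strip()
        docs.append(doc)
    return docs


def build_update_body(docs):
    """Serialize documents as the JSON array the update handler accepts."""
    return json.dumps(docs).encode("utf-8")


def post_to_solr(body, url="http://localhost:8983/solr/rss/update?commit=true"):
    """Send the documents to Solr (hypothetical URL; not called below)."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status


docs = rss_items_to_docs(SAMPLE_RSS)
body = build_update_body(docs)
print(len(docs), "documents prepared")
```

Because each document carries an `id`, re-posting a later version of the feed would overwrite entries with the same `id` instead of duplicating them, provided `id` is the unique key in the schema.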