Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-23 Thread Carl Roberts
I got the RSS DIH example to work with my own RSS feed and it works 
great - thanks for the help.


On 1/22/15, 11:20 AM, Carl Roberts wrote:

Thanks. I am looking at the RSS DIH example right now.


On 1/21/15, 3:15 PM, Alexandre Rafalovitch wrote:

Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts wrote:

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements in
each node that I have to index, so I was planning to parse the XML with Stax
and extract the data from each node and add it to Solr.  There will always
be only one file to start with and then a second file as the RSS feed
supplies updates.  I want to return certain fields of each node when I
search certain fields of the same node.  Is Solr overkill in this case?
Should I just use Lucene instead?

Regards,

Joe






Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts
Thanks for the input.  I think one benefit of using Solr is also that I 
can provide a REST API to search the indexed records.
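
As a rough illustration of that point, here is a minimal Python sketch of building a search URL against Solr's HTTP select handler. The host, port, core name ("rss") and field names are assumptions made up for the example, not details from this thread; q, fl, rows and wt are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, field, term, return_fields, rows=10):
    """Build a Solr /select URL that searches one field and returns others.

    base_url is the core URL, e.g. http://localhost:8983/solr/rss
    (a hypothetical core name used only for this sketch).
    """
    params = {
        "q": f"{field}:{term}",          # fielded query: search only this field
        "fl": ",".join(return_fields),   # field list: which fields to return
        "rows": rows,                    # maximum number of hits to return
        "wt": "json",                    # ask for a JSON response
    }
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr/rss", "title", "solr", ["title", "link"]
)
print(url)
```

The sketch does not escape Solr query-syntax characters in the search term, so real user input would need that handled before it goes into q.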


Regards,

Joe

On 1/21/15, 3:17 PM, Shawn Heisey wrote:

On 1/21/2015 12:53 PM, Carl Roberts wrote:

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several
elements in each node that I have to index, so I was planning to parse
the XML with Stax and extract the data from each node and add it to
Solr.  There will always be only one file to start with and then a
second file as the RSS feed supplies updates.  I want to return
certain fields of each node when I search certain fields of the same
node.  Is Solr overkill in this case?  Should I just use Lucene instead?

Effectively, Solr *is* Lucene.  You edit configuration files instead of
writing Lucene code, because Solr is a fully customizable search server,
not a programming API.  That also means that it's not as flexible as
Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be
able to write an application that is highly tailored to your situation
that will have better performance than Solr ... but you'll be writing
the entire program yourself.  Solr lets you install an existing program
and just change the configuration.

Thanks,
Shawn





Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts

Thanks.  I am looking at the RSS DIH example right now.


On 1/21/15, 3:15 PM, Alexandre Rafalovitch wrote:

Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts wrote:

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements in
each node that I have to index, so I was planning to parse the XML with Stax
and extract the data from each node and add it to Solr.  There will always
be only one file to start with and then a second file as the RSS feed
supplies updates.  I want to return certain fields of each node when I
search certain fields of the same node.  Is Solr overkill in this case?
Should I just use Lucene instead?

Regards,

Joe




Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Shawn Heisey
On 1/21/2015 12:53 PM, Carl Roberts wrote:
> Is Solr a good candidate to index 100s of nodes in one XML file?
>
> I have an RSS feed XML file that has 100s of nodes with several
> elements in each node that I have to index, so I was planning to parse
> the XML with Stax and extract the data from each node and add it to
> Solr.  There will always be only one file to start with and then a
> second file as the RSS feed supplies updates.  I want to return
> certain fields of each node when I search certain fields of the same
> node.  Is Solr overkill in this case?  Should I just use Lucene instead?

Effectively, Solr *is* Lucene.  You edit configuration files instead of
writing Lucene code, because Solr is a fully customizable search server,
not a programming API.  That also means that it's not as flexible as
Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be
able to write an application that is highly tailored to your situation
that will have better performance than Solr ... but you'll be writing
the entire program yourself.  Solr lets you install an existing program
and just change the configuration.

Thanks,
Shawn



Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Alexandre Rafalovitch
Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
  Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts wrote:
> Hi,
>
> Is Solr a good candidate to index 100s of nodes in one XML file?
>
> I have an RSS feed XML file that has 100s of nodes with several elements in
> each node that I have to index, so I was planning to parse the XML with Stax
> and extract the data from each node and add it to Solr.  There will always
> be only one file to start with and then a second file as the RSS feed
> supplies updates.  I want to return certain fields of each node when I
> search certain fields of the same node.  Is Solr overkill in this case?
> Should I just use Lucene instead?
>
> Regards,
>
> Joe


Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-21 Thread Carl Roberts

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements 
in each node that I have to index, so I was planning to parse the XML 
with Stax and extract the data from each node and add it to Solr.  There 
will always be only one file to start with and then a second file as
the RSS feed supplies updates.  I want to return certain fields of each
node when I search certain fields of the same node.  Is Solr overkill in 
this case?  Should I just use Lucene instead?
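
For illustration only, the "parse each node and extract its elements" step described above might look roughly like this in Python, using the standard library's ElementTree in place of Stax. The sample feed, the element names (title, link, description) and the helper name rss_items_to_docs are all assumptions made up for the sketch; the real feed's structure is not shown in this message.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real feed would be read from a file or URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
      <description>Body of the first entry</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2</link>
      <description>Body of the second entry</description>
    </item>
  </channel>
</rss>"""

# Which child elements to pull from each <item> (an assumption about the feed).
FIELDS = ("title", "link", "description")


def rss_items_to_docs(xml_text, fields=FIELDS):
    """Return one dict per <item>, keeping only the requested child elements."""
    root = ET.fromstring(xml_text)
    docs = []
    for item in root.iter("item"):
        doc = {}
        for name in fields:
            value = item.findtext(name)
            if value is not None:
                doc[name] = value.strip()
        docs.append(doc)
    return docs


docs = rss_items_to_docs(SAMPLE_RSS)
# Serialize as a JSON array of flat documents, one per node.
payload = json.dumps(docs)
print(payload)
```

Each dict corresponds to one node of the feed, so the fields that are searched and the fields that are returned both come from the same document.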


Regards,

Joe