where ignoreerrors=break means that an error in Inner
#2 would prevent Inner #3.
Lance
-----Original Message-----
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2008 8:39 PM
To: solr-user@lucene.apache.org
Subject: Re: Large Data Set Suggestions
Hi
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Thu 11/6/2008 11:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Large Data Set Suggestions
Hi Lance,
This is one area we left open in DIH. What is the best way to handle
this: on error, should it give up or continue with the next document?
--
--Noble Paul
Ideally, it would be a configuration option.
Also, it would be great to have a hook to log or process an exception.
Steve
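For what it's worth, a per-entity switch along these lines did eventually land in DIH as the onError attribute (values abort, skip, and continue, if memory serves). A hypothetical data-config.xml sketch — the file path and field names below are invented for illustration:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- onError="continue" logs the exception and moves on;
         "skip" drops the offending document; "abort" stops the import -->
    <entity name="docs"
            processor="XPathEntityProcessor"
            url="/data/feed.xml"
            forEach="/add/doc"
            onError="continue">
      <field column="id"    xpath="/add/doc/field[@name='id']"/>
      <field column="title" xpath="/add/doc/field[@name='title']"/>
    </entity>
  </document>
</dataConfig>
```

A custom transformer on the entity could also serve as the hook for logging or post-processing an exception.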
The performance of DIH is likely to be faster than SolrJ, because it
does not have the overhead of an HTTP request.
Understood. However, we may not have the option of co-locating the data
to be ingested with the Solr server.
What is your data source? I am assuming it is xml.
Yes.
In that case you may put the file in a mounted NFS directory,
or you can serve it out with an Apache server.
That's one option although someone else on the list mentioned that
performance was 10x slower in their NFS experience.
Another option is to serve up the files via Apache and pull them
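A hedged sketch of what the serve-it-over-HTTP option can look like on the Solr side: DIH pulling the feed with HttpDataSource (the Solr 1.3 name; later releases renamed it URLDataSource). The host name, path, and field below are invented for illustration:

```xml
<dataConfig>
  <!-- pulls the XML over HTTP instead of reading it from an NFS mount -->
  <dataSource type="HttpDataSource" encoding="UTF-8"/>
  <document>
    <entity name="docs"
            processor="XPathEntityProcessor"
            url="http://files.example.com/exports/feed.xml"
            forEach="/add/doc">
      <field column="id" xpath="/add/doc/field[@name='id']"/>
    </entity>
  </document>
</dataConfig>
```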
100X, not 10X. And with the index on NFS. Reading the input data from
NFS would be slower than local, but probably not 10X. --wunder
On 11/6/08 5:56 AM, Steven Anderson [EMAIL PROTECTED] wrote:
That's one option although someone else on the list mentioned that
performance was 10x slower in
-----Original Message-----
From: Steven Anderson [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2008 5:57 AM
To: solr-user@lucene.apache.org
Subject: RE: Large Data Set Suggestions
In that case you may put the file in a mounted NFS directory or you
can serve it out with an Apache server.
Greetings!
I've been asked to do some indexing performance testing on Solr 1.3
using large XML document data sets (10M-60M docs) with DIH versus SolrJ.
Does anyone have any suggestions where I might find a good data set this
size?
I saw the wikipedia dump reference in the DIH wiki, but
The performance of DIH is likely to be faster than SolrJ, because it
does not have the overhead of an HTTP request.
What is your data source? I am assuming it is XML. SolrJ cannot
directly index XML; you may need to parse documents out of the XML
before SolrJ can index them.
--Noble
On Wed, Nov 5, 2008
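Noble's point that SolrJ cannot take raw XML directly can be made concrete. Below is a minimal sketch of the parsing step using only the JDK's DOM parser; it assumes Solr's own <add><doc><field name="...">...</field></doc></add> update format, and the resulting field maps are what you would copy into SolrInputDocuments for a SolrJ add call (SolrJ types are deliberately left out to keep the sketch self-contained):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParseSolrXml {
    // Parse Solr-style <add><doc><field name="..">..</field></doc></add>
    // XML into plain field maps. With SolrJ, each map would be turned into
    // a SolrInputDocument and sent to the server (that step not shown).
    public static List<Map<String, String>> parse(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        List<Map<String, String>> docs = new ArrayList<>();
        NodeList docNodes = dom.getElementsByTagName("doc");
        for (int i = 0; i < docNodes.getLength(); i++) {
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList fieldNodes =
                    ((Element) docNodes.item(i)).getElementsByTagName("field");
            for (int j = 0; j < fieldNodes.getLength(); j++) {
                Element f = (Element) fieldNodes.item(j);
                fields.put(f.getAttribute("name"), f.getTextContent());
            }
            docs.add(fields);
        }
        return docs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add><doc><field name=\"id\">1</field>"
                   + "<field name=\"title\">hello</field></doc></add>";
        List<Map<String, String>> docs = parse(xml);
        System.out.println(docs.size() + " " + docs.get(0).get("title")); // prints "1 hello"
    }
}
```

For a 10M-60M document run you would want a streaming parser (StAX) rather than DOM, so the whole file never sits in memory at once.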