Steve,
It seems to me your task has a lot in common with mine. I'll talk about
several approaches next week:
http://www.apachecon.eu/schedule/presentation/18/ .
Thanks
Hi,
By we I meant Sematext, not LW.
You'll have to ask LW about open-sourcing their implementation.
Otis
--
Performance Monitoring - http://sematext.com/spm
On Nov 3, 2012 10:43 PM, SR r.steve@gmail.com wrote:
Thanks Otis.
By "we" do you mean LucidWorks?
Is there a chance to get it
On Fri, Nov 2, 2012 at 4:32 PM, Erick Erickson erickerick...@gmail.com wrote:
Well, I'm at my wits end. I tried your field definitions (using the
exampledocs XML) and they work just fine. As for messing up the date
on the way in, you should be seeing stack traces in your log files.
On Sat, Nov 3, 2012 at 4:23 AM, Lance Norskog goks...@gmail.com wrote:
If any value is in a bogus format, the entire document batch in that HTTP
request fails. That is the right timestamp format.
The index may be corrupted somehow. Can you try removing all of the files in
data/ and trying
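Lance's point about the timestamp format can be sanity-checked from the shell. A minimal sketch (the pattern below is the canonical ISO-8601/UTC form Solr date fields expect):

```shell
# Solr date fields expect ISO-8601 in UTC with a trailing 'Z',
# e.g. 1995-12-31T23:59:59Z. One malformed value fails the whole batch.
ts=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
echo "timestamp: $ts"
# check the value against the canonical pattern before sending it
if echo "$ts" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'; then
  echo "format OK"
else
  echo "format BAD"
fi
```

Validating values like this before posting avoids losing a whole batch to one bad field.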
Correct. There was a good thread on this topic on the ElasticSearch ML.
Search for oversharding and my name. Same ideas apply to SolrCloud.
Neither server offers automatic rebalancing yet, though ES lets you move
shards around on demand.
Otis
--
Performance Monitoring - http://sematext.com/spm
On
Thanks for the replies, both, and apologies if this is a recurring question.
From the sounds of it I am sure an optimize overnight when app traffic is
low will suffice. This will massively help with server performance, I am
sure.
Or, don't optimize (force merge) at all. Really. This is a manual override
for an automatic process, merging.
I can only think of one case where a forced merge makes sense:
1. All documents are reindexed.
2. Traditional Solr replication is used (not SolrCloud).
3. Replication is manually timed
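For reference, an explicit optimize in that setup is just a request to the update handler. A minimal sketch, assuming a default single-core Solr at localhost:8983 (adjust the URL for your install):

```shell
# Force-merge the index down to one segment. Only worth doing when the
# conditions above hold; otherwise let background merging do its job.
SOLR_URL="http://localhost:8983/solr"
curl -s --max-time 5 "$SOLR_URL/update?optimize=true&maxSegments=1" \
  || echo "Solr not reachable at $SOLR_URL"
```

Scheduling this overnight (e.g. from cron) matches the low-traffic window discussed above.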
Measure / monitor first :)
You may not need to optimize at all, especially if your index is always
being modified.
Otis
--
Performance Monitoring - http://sematext.com/spm
On Nov 4, 2012 3:03 PM, tictacs hed...@tactics.co.uk wrote:
Thanks for the replies, both, and apologies if this is a recurring
Yes. I can guarantee that a force merge will not massively help. It might not
even measurably help.
wunder
On Nov 4, 2012, at 1:05 PM, Otis Gospodnetic wrote:
Measure / monitor first :)
You may not need to optimize at all, especially if your index is always
being modified.
Otis
--
Thanks Everyone.
As Shawn mentioned, it was a memory issue. I reduced the amount allocated to
Java to 6 GB, and it's been working pretty well.
I am re-indexing one of the SolrCloud. I was having trouble with optimizing
the data when I indexed last time
I am hoping optimizing will not be an
Hi All,
I was testing my Solr on MMapDirectory, and while indexing I get these error
lines in the log:
10:27:41.003 [commitScheduler-4-thread-1] ERROR
org.apache.solr.update.CommitTracker - auto commit
error...:org.apache.solr.common.SolrException: Error opening new searcher
at
Otis,
I believe I found the thread which contains a link about elasticsearch
and big data.
http://www.elasticsearch.org/videos/2012/06/05/big-data-search-and-analytics.html
We are dealing with data that is searched using time ranges. Does the
time data flow concept work in Solr? Does it
Depends what you really need. Index aliases are very handy for having a
sliding last-N-days type search. Solr doesn't have that yet, but it may
be in JIRA.
Otis
--
Performance Monitoring - http://sematext.com/spm
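For concreteness, the sliding last-N-days pattern uses the Elasticsearch _aliases API; a sketch with made-up index and alias names (the curl call is commented out since it needs a running node):

```shell
# Keep an alias pointing at the last 7 daily indices: each day, atomically
# add the newest index to the alias and drop the oldest.
cat > alias_update.json <<'EOF'
{
  "actions": [
    { "remove": { "index": "logs-2012-10-29", "alias": "last_7_days" } },
    { "add":    { "index": "logs-2012-11-05", "alias": "last_7_days" } }
  ]
}
EOF
# curl -s -XPOST 'http://localhost:9200/_aliases' -d @alias_update.json
echo "prepared $(grep -c '"index"' alias_update.json) alias actions"
```

Queries then hit the `last_7_days` alias and never need to know which daily indices back it.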
On Nov 4, 2012 11:34 PM, Nathan Findley nat...@zenlok.com wrote:
Otis,
I
Hello all,
I have a CSV file of size 10 GB which I have to index using Solr.
My question is how to index the CSV in such a way that
I can get two separate index files, of which one is the index
for the first half of the CSV and the second is the index for the
second half of
Thanks Eric for the explanation. It helps me a lot :).
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-AutoSharding-In-enterprise-environment-tp4017036p4018194.html
Sent from the Solr - User mailing list archive at Nabble.com.
On 5 November 2012 11:11, mitra mitra.re...@ornext.com wrote:
Hello all,
I have a CSV file of size 10 GB which I have to index using Solr.
My question is how to index the CSV in such a way that
I can get two separate index files, of which one is the index
for the first half of
I would use the Unix split command. You can give it a line count.
% split -l 1400 myfile.csv
You can use wc -l to count the lines.
wunder
On Nov 4, 2012, at 10:23 PM, Gora Mohanty wrote:
On 5 November 2012 11:11, mitra mitra.re...@ornext.com wrote:
Hello all,
I have a CSV file of
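Putting Walter's split/wc suggestion together for the exact two-halves case; a sketch using a tiny stand-in file (the real input would be the 10 GB CSV):

```shell
# demo input (stand-in for the real 10 GB file)
printf '%s\n' row1 row2 row3 row4 row5 > myfile.csv
total=$(wc -l < myfile.csv)          # total line count
half=$(( (total + 1) / 2 ))          # first half rounds up so nothing is lost
split -l "$half" myfile.csv part_    # writes part_aa (first half), part_ab (rest)
wc -l part_aa part_ab
```

Each half can then be posted to its own Solr core to get the two separate indexes.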
Michael Della Bitta-2 wrote
No, RAMDirectory doesn't work for replication. Use MMapDirectory... it
ends up storing the index in RAM and more efficiently so, plus it's
backed by disk.
Just be sure to not set a big heap because MMapDirectory works outside of
heap.
For my tests, I don't think
Hello all,
I have a situation with Solr grouping where I want to group my products into
top categories for an e-commerce application. The number of groups here is
less than 10 and the total number of docs in the index is 10 million. Will Solr
grouping be an issue here? We have seen OOM issues when we
On 11/4/2012 11:41 PM, deniz wrote:
Michael Della Bitta-2 wrote
No, RAMDirectory doesn't work for replication. Use MMapDirectory... it
ends up storing the index in RAM and more efficiently so, plus it's
backed by disk.
Just be sure to not set a big heap because MMapDirectory works outside of