that there are plans to have Zoie use Lucene 2.9. How long would you
say before it's available?
Thanks,
E
-Original Message-
From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
Sent: Sat 10/10/2009 12:16 PM
To: java-user@lucene.apache.org
Subject: Re: Realtime distributed
John,
Actually everyone is entitled to their technical opinion and
none of the comments were misleading. Jake and yourself
validated that they are true in your comments. I'm simply trying
to create better technology as is everyone on here. The process
takes time
Ok nevermind actually - the simultaneous indexing was something done in zoie 1.3,
and was changed in 1.4 to addIndexesNoOptimize() on the RAMDirectory indexes
as soon as they are big enough.
It's still true that you can throw away the RAMDirectory once the disk index is
reopened though.
-jake
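The hand-off described above (buffer in RAM, merge to disk once the RAM index is big enough, discard the RAM index after the disk index is reopened) can be pictured with a small stand-alone sketch. This is not Zoie's code and uses no Lucene classes: plain lists stand in for the RAMDirectory and the disk index, and the class name, method names, and the flush threshold are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model only: lists stand in for a RAM index and a disk index.
class RamThenDiskSketch {
    private final int flushThreshold;                        // assumed "big enough" size
    private List<String> ramDocs = new ArrayList<>();        // stands in for the RAMDirectory index
    private final List<String> diskDocs = new ArrayList<>(); // stands in for the disk index
    private List<String> diskView = new ArrayList<>();       // what the last "reopened" disk reader sees

    RamThenDiskSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // New documents always land in RAM first.
    void add(String doc) {
        ramDocs.add(doc);
        if (ramDocs.size() >= flushThreshold) {
            flush();
        }
    }

    private void flush() {
        diskDocs.addAll(ramDocs);               // stands in for addIndexesNoOptimize()
        diskView = new ArrayList<>(diskDocs);   // stands in for reopening the disk reader
        ramDocs = new ArrayList<>();            // disk view is current, so the RAM index can go
    }

    // A search sees the reopened disk view plus whatever is still in RAM.
    List<String> searchable() {
        List<String> all = new ArrayList<>(diskView);
        all.addAll(ramDocs);
        return all;
    }

    int ramSize() { return ramDocs.size(); }

    int diskSize() { return diskView.size(); }
}
```

The point of the ordering in `flush()` is that the RAM copy is dropped only after the disk view has been refreshed, so no document disappears from search during the hand-off.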
John,
Actually everyone is entitled to their technical opinion and
none of the comments were misleading. Jake and yourself
validated that they are true in your comments. I'm simply trying
to create better technology as is everyone on here. The process
takes time and coordination between many
Jake and John,
It would be interesting and enlightening to see NRT performance
numbers in a variety of configurations. The best way to go about
this is to post benchmarks that others may run in their
environment which can then be tweaked for their unique edge
cases. I wish I had more time to work
Jason,
We've been running some perf/load/stress tests lately, but on a suggestion
from Ted Dunning, I've been trying to come up with a more realistic set of stress
tests and indexing rates to see where NRT performs well and where it does not,
instead of just indexing at maximum rate, looping
The dimensions sound good. It's unclear if you're going to post a
chart again, numbers, or code? There's a LUCENE-1577 Jira issue for
code.
On Fri, Oct 9, 2009 at 12:37 PM, Jake Mannix jake.man...@gmail.com wrote:
Jason,
We've been running some perf/load/stress tests lately, but on a
I can provide some preliminary numbers (we will need to do some detailed
analysis and post it somewhere):
Dataset: medline
starting index: empty.
add only, no update, for 30 min.
maximum indexing load, 1000 docs/sec
Under stress, we take indexing events (add only) and stream into both
systems:
Hey Eric,
My consulting company specializes in scalable, real-time search with
distributed Lucene. I'm more than happy to chat, if you'd like! :)
Cheers,
Bradford
On Thu, Oct 8, 2009 at 7:00 PM, Angel, Eric ean...@business.com wrote:
Does anyone have any recommendations? I've looked at
My deepest apologies for the spam, everyone. I slipped on my G-mail button :)
On Fri, Oct 9, 2009 at 9:09 PM, Bradford Stephens
bradfordsteph...@gmail.com wrote:
Hey Eric,
My consulting company specializes in scalable, real-time search with
distributed Lucene. I'm more than happy to chat, if
Hi Jake,
Zoie looks like a really cool project. I'd like to learn more about
the distributed part of the setup. Any way you could describe that
here or on the wiki?
-Mike
On Thu, Oct 8, 2009 at 9:24 PM, Jake Mannix jake.man...@gmail.com wrote:
On Thu, Oct 8, 2009 at 7:00 PM, Angel, Eric
Hi Mike,
Zoie itself doesn't do anything with the distributed
side of things - it just plays nicely with it. Zoie, at its core,
exposes a couple of primary interfaces (well, this is a slightly
simplified form of them) :
interface IndexReaderFactory { List getIndexReaders();
Does anyone have any recommendations? I've looked at Katta, but it
doesn't seem to support realtime searching. It also uses hdfs, which
I've heard can be slow. I'm looking to serve 40gb of indexes and
support about 1 million updates per day.
Thx
Eric,
Katta doesn't require HDFS which would be slow to search on,
though Katta can be used to copy indexes out of HDFS onto local
servers. The best bet is hardware that uses SSDs because merges
and update latency will greatly decrease and there won't be a
synchronous IO issue as there is with
Jason,
On Thu, Oct 8, 2009 at 7:56 PM, Jason Rutherglen jason.rutherg...@gmail.com
wrote:
Today near realtime search (with or without SSDs) comes at a
price: reduced indexing speed due to continued in-RAM
merging. People typically hack something together where indexes
are held in a
On Thu, Oct 8, 2009 at 7:00 PM, Angel, Eric ean...@business.com wrote:
Does anyone have any recommendations? I've looked at Katta, but it doesn't
seem to support realtime searching. It also uses hdfs, which I've heard can
be slow. I'm looking to serve 40gb of indexes and support about 1
On Thu, Oct 8, 2009 at 7:56 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote:
There is the Zoie system which uses the RAMDir solution,
Also, to clarify: zoie does not index into a RAMDir and then periodically merge that
down to disk, as for one thing, this has a bad failure mode when
Jason:
I would really appreciate it if you would stop making false
statements and spreading misinformation. Everyone is entitled to his/her opinions on
technologies, but deliberately posting misleading and false information on
such a distribution is just unethical, and you'll end up just discrediting
19 matches