On Tue, Jan 12, 2010 at 10:14 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> Jake,
>
> I wonder how often people need reliable transactions for
> realtime search? Maybe MySQL's t-log could be used sans the
> database part?
>
A reliable message queue - I'd imagine all the time! Trans
Jake,
I wonder how often people need reliable transactions for
realtime search? Maybe MySQL's t-log could be used sans the
database part?
The created_at column for near realtime seems like it could hurt
the database due to excessive polling? Has anyone tried it yet?
> I wrote up a simple file-ba
On Tue, Jan 12, 2010 at 10:46:29PM -0500, DM Smith wrote:
> So starting at 0, the size is 0.
> 0 => 0
> 0 + 1 => 4
> 4 + 1 => 8
> 8 + 1 => 16
> 16 + 1 => 25
> 25 + 1 => 35
> ...
>
> So I think the copied python comment is correct but not obviously correct.
So those numbers are supposed to be whe
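A minimal sketch of the walk-through above, for anyone who wants to check the arithmetic. The formula in the quoted message is cut off, so the one used here is an assumption: it is the CPython list over-allocation rule that the comment is said to have been copied from, not a confirmed copy of ArrayUtil.getNextSize(). The class and method names are made up for illustration. Starting at 0 and asking for one more element than the current size each time reproduces the sequence in the comment.

```java
import java.util.Arrays;

public class GrowthPatternSketch {
    // ASSUMPTION: the truncated formula follows CPython's list over-allocation
    // rule (targetSize/8 plus a small constant, plus targetSize itself).
    static int nextSize(int targetSize) {
        return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize;
    }

    // Records the current size, then grows to hold one more element than the
    // current size, as in the "0 + 1 => 4, 4 + 1 => 8" walk-through.
    static int[] pattern(int steps) {
        int[] sizes = new int[steps];
        int size = 0;
        for (int i = 0; i < steps; i++) {
            sizes[i] = size;
            size = nextSize(size + 1);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Prints [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
        System.out.println(Arrays.toString(pattern(10)));
    }
}
```

So the numbers in the comment are the successive capacities you get when each resize is triggered by a single extra element, which is why plugging a listed value straight into the formula does not yield the next one.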
On Tue, Jan 12, 2010 at 8:55 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> > Zoie keeps track of an "index version" on disk alongside the Lucene index
> which it uses to decide where it must reindex from to "catch up" if there
> have been incoming indexing events while the server
Actually, unless IW.commit is called, all changes after the last
commit will be lost (because the segment infos file will not have been
written).
On Tue, Jan 12, 2010 at 3:37 PM, Jason Rutherglen
wrote:
> Greetin's John,
>
> 2.9 and 3.0 don't use a RAMDir... Deletes are held in RAM however so
> o
> Zoie keeps track of an "index version" on disk alongside the Lucene index
> which it uses to decide where it must reindex from to "catch up" if there
> have been incoming indexing events while the server was out of commission.
This begs a little more clarity... Sounds like a transaction log
On Tue, Jan 12, 2010 at 8:15 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> John, you should have a look at Zoie. I just finished adding LinkedIn's
> case study about Zoie to Lucene in Action 2, so this is fresh in my mind. :)
>
Yep, Zoie ( http://zoie.googlecode.com ) will handle
Heh, yeah, I forgot about that. Pick the lesser evil? I like speedier
defaults.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
> From: Mark Miller
> To: java-dev@lucene.apache.org
> Sent: Tue, January 12, 2010 5:35:49 PM
> Subject: Re: Compoun
John, you should have a look at Zoie. I just finished adding LinkedIn's case
study about Zoie to Lucene in Action 2, so this is fresh in my mind. :)
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
> From: jchang
> To: java-dev@lucene.apache.org
John,
Yes, you should get 2.9.0 or 3.0.0, their indexing is faster. Still, even with
2.4.0 you shouldn't run into problems if you are really using just 1
IndexWriter. Still, I'd try upgrading first. Oh, and java-user is the place
to ask.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lu
On Jan 12, 2010, at 6:27 PM, Marvin Humphrey wrote:
> Greets,
>
> I've been trying to understand this comment regarding ArrayUtil.getNextSize():
>
> * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
>
> Maybe I'm missing something, but I can't see how the formula yields su
Hi,
I am trying to optimize the index, which would merge different segments together.
Let's say the index folder is 1Gb in total; I need each segment to be no
larger than 200Mb. I tried to use LogByteSizeMergePolicy and setMaxMergeMB(100)
to ensure no segment after merging would be 200Mb. Howe
> And generally, if anybody has any advice on high-throughput indexing with
> Lucene and what kind of numbers I can achieve, I'd welcome the feedback.
I believe it's directly related to how often IW.getReader is called.
The longer the duration between calls, the larger the resultant new
segment is (wh
Greetin's John,
2.9 and 3.0 don't use a RAMDir... Deletes are held in RAM however so
on power off, those would be lost.
Jason
On Tue, Jan 12, 2010 at 3:10 PM, jchang wrote:
>
> Lucene 2.9.0 has near real time indexing, writing to a RAMDir which gets
> flushed to disk when you do a search.
>
> D
I always turn CFS off because it's extra work (no payoff), how's it
possible to run into an out of fd limit with a merge factor of 10?
On Tue, Jan 12, 2010 at 2:35 PM, Mark Miller wrote:
> Otis Gospodnetic wrote:
>> At the same time, seeing how some people benchmark systems without tuning
>> the
Greets,
I've been trying to understand this comment regarding ArrayUtil.getNextSize():
* The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
Maybe I'm missing something, but I can't see how the formula yields such a
growth pattern:
return (targetSize >> 3) + (targetSize <
Hello,
I am using Lucene 2.4.0 and am getting
org.apache.lucene.store.LockObtainFailedException's when I have a backed up
queue of items to index (with multiple concurrent writers). Of course, if I
throttle all my writer threads to 1, I don't get the exception, but I'm
hoping to write faster th
[
https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799459#action_12799459
]
Uwe Schindler commented on LUCENE-2193:
---
Two opinions:
- the top-level "/backwards/
Lucene 2.9.0 has near real time indexing, writing to a RAMDir which gets
flushed to disk when you do a search.
Does anybody know how this works out with service restarts (both orderly
shutdown and a crash)? If the service goes down while indexed items are in
RAMDir but not on disk, are they lost
[
https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799433#action_12799433
]
Uwe Schindler edited comment on LUCENE-2193 at 1/12/10 10:50 PM:
---
Otis Gospodnetic wrote:
> At the same time, seeing how some people benchmark systems without tuning
> them and then publish their results, cfs may be safer.
>
>
Though at the same time you get nailed with a 10-15% indexing speed hit.
--
- Mark
http://www.lucidimagination.com
I think what has changed is that a lot more people hit this problem, and a
number of people provided answers, so it's much easier now for a new person to
learn what to do when this limit is hit.
At the same time, seeing how some people benchmark systems without tuning them
and then publish thei
[
https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2193:
--
Attachment: LUCENE-2193.patch
Here is a patch that implements the above. It contains a lot variab
Can you check the same for the benchmark package as it also downloads external
resources, that must appear in svn:ignore?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Simon Willnauer [mailto:simon.wil
Thanks uwe - this was itching me for a while now.
simon
On Tue, Jan 12, 2010 at 9:57 PM, wrote:
> Author: uschindler
> Date: Tue Jan 12 20:57:56 2010
> New Revision: 898510
>
> URL: http://svn.apache.org/viewvc?rev=898510&view=rev
> Log:
> add missing svn props as preparation for LUCENE-2193
>
[
https://issues.apache.org/jira/browse/LUCENE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799387#action_12799387
]
Simon Willnauer commented on LUCENE-2188:
-
good stuff uwe, I will fix LUCENE-2183
[
https://issues.apache.org/jira/browse/LUCENE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796027#action_12796027
]
Uwe Schindler edited comment on LUCENE-2183 at 1/12/10 9:05 PM:
[
https://issues.apache.org/jira/browse/LUCENE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler resolved LUCENE-2188.
---
Resolution: Fixed
Committed revision: 898507
> A handy utility class for tracking deprecate
[
https://issues.apache.org/jira/browse/LUCENE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2188:
--
Attachment: LUCENE-2188.patch
Changed javadocs and changes.txt.
Will commit this now.
> A ha
[
https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir resolved LUCENE-2181.
-
Resolution: Fixed
Fix Version/s: 3.1
Committed revision 898491.
Thanks Steven!
> benchm
[
https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-2201:
Attachment: LUCENE-2201.patch
Here is a patch showing what I mean, it seems almost silly but appea
[
https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799335#action_12799335
]
Robert Muir commented on LUCENE-2201:
-
Hello, I was looking at this, and it causes pro
[
https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir reassigned LUCENE-2201:
---
Assignee: Robert Muir
> more performance improvements for snowball
> ---
On Tue, Jan 12, 2010 at 11:05:13AM -0500, Grant Ingersoll wrote:
> At any rate, I feel pretty safe assuming no one is running a production
> system on a MBP...
I don't really care whether Lucene defaults to the compound file format or not
(KS does, Lucy will, and that's good enough for me), but i
Dear Developers
We are looking for Java/Lucene/Nutch developers with over 2-3 years of
experience for a project we are currently working on.
The location is Zurich, Switzerland, onsite, and the job is as employee or
contractor.
Please reply to me privately with your contact details and experienc
[
https://issues.apache.org/jira/browse/LUCENE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799261#action_12799261
]
Grant Ingersoll commented on LUCENE-2127:
-
So, my patch seems to be faster than or
I'm not sure that it's safe to assume that production use of Lucene is
not on a laptop or that it is always on big iron.
It makes sense that Lucene is embedded in all sorts of desktop
applications that might run on small machines. That certainly describes
the application that I work on.
I'm
My MBP has 7168.
Maybe something like MySQL or other tools modify it, but I'm pretty positive I
didn't.
At any rate, I feel pretty safe assuming no one is running a production system
on a MBP...
I suppose if we wanted to get really fancy, we could, on *NIX systems, exec
ulimit and parse the
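A rough sketch of what "exec ulimit and parse" could look like from Java on a *NIX system. This is only an illustration of the idea in the message above, not anything Lucene does; the class and method names are hypothetical, and it assumes `sh` is on the path. Because `ulimit` is a shell builtin, it has to be run through a shell rather than executed directly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class FdLimitSketch {
    // Parses the output of `ulimit -n`. Returns -1 for "unlimited" or for
    // anything that is not a plain number, so callers can treat it as unknown.
    static long parseLimit(String output) {
        if (output == null) {
            return -1;
        }
        String s = output.trim();
        if (s.isEmpty() || s.equals("unlimited")) {
            return -1;
        }
        try {
            return Long.parseLong(s);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Runs `ulimit -n` through sh (assumed to exist) and parses the first line.
    static long queryLimit() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", "-c", "ulimit -n")
                .redirectErrorStream(true)
                .start();
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = r.readLine();
        }
        p.waitFor();
        return parseLimit(line);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("open file limit: " + queryLimit());
    }
}
```

A caller could compare the result against some threshold to pick a default, falling back to the conservative choice whenever the result is -1.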
On Tue, Jan 12, 2010 at 09:49:09AM -0500, Grant Ingersoll wrote:
> My Mac (non-laptop) reports:
> ulimit -n
> 2560
>
> And I know I didn't change it.
Before I posted, I had a few officemates corroborate. 4 people had 256 --
three on 10.6 and me on 10.5. I think these were all MacBook Pros. T
[
https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799225#action_12799225
]
Robert Muir commented on LUCENE-2203:
-
here is a link to the bug with an example:
htt
[
https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799221#action_12799221
]
Robert Muir commented on LUCENE-2203:
-
Hello, I ran some tests today, and the problems
256 here (MBP)
On Tue, Jan 12, 2010 at 17:49, Grant Ingersoll wrote:
>
> On Jan 11, 2010, at 4:25 PM, Marvin Humphrey wrote:
>
>> On Mon, Jan 11, 2010 at 03:20:17PM -0500, Grant Ingersoll wrote:
>>> Should we really still be defaulting to true for setUseCompoundFile? Do
>>> people still run out
On Jan 11, 2010, at 4:25 PM, Marvin Humphrey wrote:
> On Mon, Jan 11, 2010 at 03:20:17PM -0500, Grant Ingersoll wrote:
>> Should we really still be defaulting to true for setUseCompoundFile? Do
>> people still run out of file handles?
>
> Yep. You're going to smack up against that limit pretty
[
https://issues.apache.org/jira/browse/LUCENE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Koji Sekiguchi resolved LUCENE-2204.
Resolution: Fixed
Committed revision 898323.
> FastVectorHighlighter: some classes and me
[
https://issues.apache.org/jira/browse/LUCENE-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799167#action_12799167
]
Uwe Schindler edited comment on LUCENE-2191 at 1/12/10 12:04 PM:
---
[
https://issues.apache.org/jira/browse/LUCENE-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799167#action_12799167
]
Uwe Schindler commented on LUCENE-2191:
---
The transition to the new name still has pr
[
https://issues.apache.org/jira/browse/LUCENE-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2188:
--
Attachment: LUCENE-2188.patch
I renamed the class to VirtualMethod as suggested by Simon and a