[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-09-14 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take8.patch

Attached take8, incorporating Ning's feedback plus some small
refactoring and fixing one case where optimize() would do an
unnecessary merge.

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, 
> LUCENE-847.take5.patch, LUCENE-847.take6.patch, LUCENE-847.take7.patch, 
> LUCENE-847.take8.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-09-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take7.patch

New patch (take 7).

I folded in Ning's comments (above) and Yonik's comments from
LUCENE-845, added javadocs & fixed Javadoc warnings and fixed two
other small issues.  All tests pass on Linux, OS X, win32, with either
SerialMergeScheduler or ConcurrentMergeScheduler as the default.

I plan to commit in a few days' time.


> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, 
> LUCENE-847.take5.patch, LUCENE-847.take6.patch, LUCENE-847.take7.patch, 
> LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-09-10 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take6.patch

OK, another rev of the patch (take6).  I think it's close!

This patch passes all unit tests with SerialMergeScheduler (left as
the default for now) and also passes all unit tests once you switch
the default to ConcurrentMergeScheduler instead.

I made one simplification to the approach: IndexWriter now keeps track
of "pendingMerges" (merges that mergePolicy has declared are necessary
but have not yet been started), and "runningMerges" (merges currently
in flight).  Then MergeScheduler just asks IndexWriter for the next
pending merge when it's ready to run it.  This also cleaned up how
cascading works.
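
The pendingMerges/runningMerges bookkeeping described above could be sketched roughly as follows. This is only an illustration, not the code in the patch: the class MergeQueueSketch, its method names, and the use of plain strings to stand for merges are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the writer's merge bookkeeping; not Lucene's actual code.
class MergeQueueSketch {
    // Merges the policy has declared necessary but that have not started yet.
    private final Deque<String> pendingMerges = new ArrayDeque<>();
    // Merges currently in flight.
    private final Set<String> runningMerges = new HashSet<>();

    // Called when the merge policy declares that a merge is necessary.
    synchronized void registerMerge(String merge) {
        pendingMerges.addLast(merge);
    }

    // Called by the scheduler when it is ready to run a merge; returns null if none is pending.
    synchronized String getNextMerge() {
        String merge = pendingMerges.pollFirst();
        if (merge != null) {
            runningMerges.add(merge);
        }
        return merge;
    }

    // Called once a merge has completed (or failed).
    synchronized void mergeFinished(String merge) {
        runningMerges.remove(merge);
    }

    synchronized int pendingCount() { return pendingMerges.size(); }

    synchronized int runningCount() { return runningMerges.size(); }

    // A serial scheduler: drain the pending merges on the calling thread.
    static List<String> runSerially(MergeQueueSketch writer) {
        List<String> completed = new ArrayList<>();
        String merge;
        while ((merge = writer.getNextMerge()) != null) {
            completed.add(merge); // the real merge work would happen here
            writer.mergeFinished(merge);
        }
        return completed;
    }
}
```

Because the scheduler only pulls the next pending merge, a cascaded merge is just another registerMerge call that the same loop picks up later.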

Other changes:

  * Optimize: optimize is now fully concurrent (it can run multiple
merges at once, new segments can be flushed during an optimize,
etc).  Optimize will optimize only those segments present when it
started (newly flushed segments may remain separate).

  * New API: optimize(boolean doWait) allows you to not wait for
optimize to complete (it runs in background).  This only works
when MergeScheduler uses threads.

  * New API: close(boolean doWait) allows you to not wait for running
merges if you want to "close in a hurry".  Also only works when
MergeScheduler uses threads.

  * I fixed LogMergePolicy to expose merge concurrency during optimize
by first calling the "normal" merge policy to see if it requires
merges and returning those merges if so, and then falling back to
the normal "merge the tail <= mergeFactor segments until there is
only 1 left".

  * Because IndexModifier synchronizes on directory, it can't use
ConcurrentMergeScheduler since this quickly leads to deadlock at
least during IndexWriter.close.  So I set it back to
SerialMergeScheduler (it is deprecated anyway).

  * Added private IndexWriter.message(...) that prints message to the
infoStream prefixed by the thread name and changed all
infoStream.print*'s to message(...).  Also added more messages in
the exceptional cases to aid future diagnostics.

  * Added more unit tests
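
The fallback rule for optimize mentioned above ("merge the tail <= mergeFactor segments until there is only 1 left") can be illustrated with a small sketch. It models only the fallback step and ignores the first step of consulting the normal merge policy; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the "merge the tail" fallback; not the patch's code.
class OptimizeTailSketch {
    // Returns the width (number of segments merged) of each successive tail merge.
    static List<Integer> tailMerges(int segmentCount, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Integer> merges = new ArrayList<>();
        int remaining = segmentCount;
        while (remaining > 1) {
            // Merge at most mergeFactor segments from the tail of the index.
            int width = Math.min(mergeFactor, remaining);
            merges.add(width);
            // The merged segments are replaced by a single new segment.
            remaining = remaining - width + 1;
        }
        return merges;
    }
}
```

For example, 25 segments with a mergeFactor of 10 go 25 -> 16 -> 7 -> 1, so tailMerges(25, 10) yields the widths 10, 10 and 7.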


> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, 
> LUCENE-847.take5.patch, LUCENE-847.take6.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-09-07 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take5.patch


Attached new patch (take5) incorporating Ning's feedback.

This patch includes LUCENE-845 (a new default merge policy plus
a "merge by size in bytes of segment" merge policy), LUCENE-847
(factor merge policy/scheduling out of IndexWriter) and LUCENE-870
(ConcurrentMergeScheduler).

The one thing remaining after these are done, that I'll open a
separate issue for and commit separately, is to switch IndexWriter to
flush by RAM usage by default (instead of by docCount == 10) as well
as merge by size-in-bytes by default.

I broke out a separate MergeScheduler interface.  SerialMergeScheduler
is the default (matches how merges are executed today: sequentially,
using the calling thread).  ConcurrentMergeScheduler runs the merges
as separate threads (up to a max number at which point the extras are
done sequentially).
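
The difference between the two schedulers can be sketched as below. This is a simplified model, not the patch's implementation: the names MergeSchedulerSketch, SerialSketch and ConcurrentSketch are hypothetical, a merge is modeled as a plain Runnable, and "extras are done sequentially" is assumed here to mean that they run on the calling thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scheduler abstraction; a merge is modeled as a Runnable.
interface MergeSchedulerSketch {
    void merge(Runnable mergeWork);
}

// Runs each merge sequentially, using the calling thread.
class SerialSketch implements MergeSchedulerSketch {
    @Override
    public void merge(Runnable mergeWork) {
        mergeWork.run();
    }
}

// Runs merges in separate threads up to maxThreads; extras run on the caller.
class ConcurrentSketch implements MergeSchedulerSketch {
    private final int maxThreads;
    private final List<Thread> started = new ArrayList<>();
    private int active = 0;

    ConcurrentSketch(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    @Override
    public void merge(Runnable mergeWork) {
        Thread worker = null;
        synchronized (this) {
            if (active < maxThreads) {
                active++;
                worker = new Thread(() -> {
                    try {
                        mergeWork.run();
                    } finally {
                        synchronized (this) { active--; }
                    }
                });
                started.add(worker);
            }
        }
        if (worker != null) {
            worker.start();
        } else {
            mergeWork.run(); // over the limit: do the merge in the calling thread
        }
    }

    // Waits for every background merge started so far to finish.
    void sync() {
        List<Thread> snapshot;
        synchronized (this) { snapshot = new ArrayList<>(started); }
        for (Thread t : snapshot) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```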

Other changes:

  - Allow multiple threads to call optimize().  I added a unit test
for this.

  - Tightened calls to deleter.refresh(), which removes partially
created files on an exception, to remove only those files that the
given piece of code would create.  This is very important because
otherwise refresh() could remove the files being created by a
background merge.

  - Added some unit tests


> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, 
> LUCENE-847.take5.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-27 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take4.patch

OK new patch:

  - Added the missing MergePolicy.java from last time that Ning caught
(thanks!)

  - Fixed some javadocs

  - Relaxed synchronization of merging so that merges can run
concurrently with flushing if you are using multiple threads to do
indexing.  This gains concurrency of merging even if you are not
using CMPW (ConcurrentMergePolicyWrapper).  But I left flushing as
synchronized; I think we can relax this at some point in the future.

  - Fixed some concurrency issues

  - Added "minMergeDocs" to LogDocMergePolicy and "minMergeMB" to
LogByteSizeMergePolicy; set their defaults as described in
LUCENE-845.

Still a few small things to do.  I think it's getting close.


> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, 
> LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




Re: [jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-27 Thread Michael McCandless

Woops, sorry!  I will post a new patch.

Mike

"Ning Li" <[EMAIL PROTECTED]> wrote:
> Hi Mike,
> 
> I cannot apply the patch cleanly. MergePolicy.java, e.g., seems to be
> missing from the patch.
> 
> 
> On 8/24/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
> >
> >  [ 
> > https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> >  ]
> >
> > Michael McCandless updated LUCENE-847:
> > --
> >
> > Attachment: LUCENE-847.take3.patch
> >
> 
> 




Re: [jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-27 Thread Ning Li
Hi Mike,

I cannot apply the patch cleanly. MergePolicy.java, e.g., seems to be
missing from the patch.


On 8/24/07, Michael McCandless (JIRA) <[EMAIL PROTECTED]> wrote:
>
>  [ 
> https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>  ]
>
> Michael McCandless updated LUCENE-847:
> --
>
> Attachment: LUCENE-847.take3.patch
>




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Fix Version/s: 2.3

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Fix For: 2.3
>
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-847:
--

Attachment: LUCENE-847.take3.patch

OK I started from the original patch and made the changes described
below.

This is still a work in progress, but I think the new
stateless approach works very well.

All unit tests pass (one assert had to be changed in
TestAddIndexesNoOptimize).

I created a ConcurrentMergePolicyWrapper along with this (I'll post
patch to LUCENE-870).

I've also included the two merge policies from LUCENE-845 (still
defaulting to LogDocMergePolicy).

Here are the changes:

  - Renamed merge -> maybeMerge

  - Changed the API to be "stateless" meaning the merge policy is no
longer responsible for running the merges itself.  Instead, it
quickly returns the specification, which describes which merges
are needed, back to the writer and the writer then runs them.  I
also changed MergeSpecification to contain a list of OneMerge
instances.

  - Removed IndexMerger interface (just use IndexWriter instead)

  - Put isOptimized() logic into LogMergePolicy: on thinking about
this more (and seeing response to a thread on java-dev), I now
agree with Steve that this logically belongs in LogMergePolicy
because each MergePolicy is free to define just what it considers
"optimized" to mean.  Then I removed the MergePolicyBase.

  - Un-deprecated {get/set}{UseCompoundFile,MergeFactor,MaxMergeDocs}.
But I did leave the static constants deprecated.

  - IndexWriter keeps track of which segments are involved in running
merges and throws a MergeException if it's asked to initiate a
merge that involves a segment that's already being merged.

  - Fixed LogMergePolicy to return all possible merges (exposes
concurrency).

  - Implemented the "merge deletes when committing the merge" algorithm
that Ning suggested (this is in commitMerge).

  - Assert that the merge request is in fact contiguous (at start &
finish of merge) & throw MergeException if not.

  - Fixed a number of sneaky concurrency issues so that CMPW would
work.  Broke "merge" into mergeInit, mergeMiddle and mergeFinish.
The first & last are carefully synchronized.

  - I put copyDirFiles in IW and call this in addIndexesNoOptimize
before committing new segments file: we can't let mergePolicy
leave the index inconsistent.

  - I reverted the changes to addIndexes(IndexReader[]): I think the
change here wasn't valid: you can't assume that you can re-create
any IndexReader instance by loading from its directory; I put the
original back for this method.

  - I'm not sure the changes to addIndexes are good.

  - Fixed LogMergePolicy to return more than 1 merge

  - Made CMPW

  - Renamed replace -> commitMerge; made it private.
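
To make the "stateless" idea concrete, here is a rough sketch: the policy only describes which merges are needed, and the writer tracks which segments are already being merged. The names below are hypothetical and simplified; segments are plain indices, the toy policy simply groups adjacent segments, and IllegalStateException stands in for the MergeException mentioned above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a stateless merge policy plus writer-side checks.
class StatelessPolicySketch {
    // A merge of the contiguous segment range [start, end).
    static final class OneMerge {
        final int start;
        final int end;

        OneMerge(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // The policy only returns a specification (a list of merges); it never runs them.
    // Toy rule: every full group of mergeFactor adjacent segments is one merge.
    static List<OneMerge> findMerges(int segmentCount, int mergeFactor) {
        List<OneMerge> spec = new ArrayList<>();
        for (int start = 0; start + mergeFactor <= segmentCount; start += mergeFactor) {
            spec.add(new OneMerge(start, start + mergeFactor));
        }
        return spec;
    }

    // Writer-side bookkeeping: refuse a merge that touches a segment already merging.
    static final class Writer {
        private final Set<Integer> merging = new HashSet<>();

        synchronized void registerMerge(OneMerge m) {
            for (int i = m.start; i < m.end; i++) {
                if (merging.contains(i)) {
                    // Stand-in for the MergeException described in the list above.
                    throw new IllegalStateException("segment " + i + " is already being merged");
                }
            }
            for (int i = m.start; i < m.end; i++) {
                merging.add(i);
            }
        }

        synchronized int mergingCount() {
            return merging.size();
        }
    }
}
```

Since the toy policy returns every possible non-overlapping merge in one call, a concurrent scheduler could run all of them at once.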



> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-15 Thread Steven Parkes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Parkes updated LUCENE-847:
-

Attachment: LUCENE-847.patch.txt

Updated patch:

* Don't call deprecated methods
  - note: currently renamed with "_" prepended to make easy to find; don't
    commit those
* Factor MergePolicyBase
* comments to remind to delete before commit (though might still have missed
  some)
* Make LDMP casts not throw bad cast
* Get rid of releaseMergePolicy and add doClose parameter on set

* Didn't factor copy from other dirs: requires compound file choices
* Didn't (yet) rename merge -> maybeMerge
   - Does this mean optimize -> maybeOptimize, too?

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.patch.txt, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-08-06 Thread Steven Parkes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Parkes updated LUCENE-847:
-

Attachment: LUCENE-847.patch.txt

Here's an update to the patch. I wouldn't say it's ready to be committed, but I 
think it's significantly closer than it was.

The concurrent and other misc. stuff have been pulled out (that part still 
needs work, figuring out how to get the concurrency right.)

The new patch works against trunk, which means it handles docswriter and is 
more compatible with merging by # of docs or merging by ram (or size, to be 
more accurate?)

My take on the migration path here was that we could well be going towards 
merging by size but need to keep merging by # docs for parallel index cases. 
The current patch still only does merging by # docs.

I think I commented on a couple of other things on dev, but to reiterate:

There's a small change in the test results because the new merge policy 
simplifies the treatment of addIndexes operations. The change is understood 
and shouldn't be a problem.

useCompoundFile is delegated to the merge policy so a smart merge policy could 
make decisions looking at the state of all segments rather than all-or-nothing. 
There are a couple of fixme's in IndexWriter related to this and the segments 
being created by the docswriter.

I'm going to look at that, plus the concurrent stuff: Ning's stuff plus my old 
approach (which has to change, given the new docswriter stuff).

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, 
> LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-05-28 Thread Michael Busch (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-847:
-

Component/s: Index

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Steven Parkes
>Assignee: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-03-28 Thread Ning Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Li updated LUCENE-847:
---

Attachment: concurrentMerge.patch

Here is a patch for concurrent merge as discussed in:
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651

I put it under this issue because it helps design and verify a factored merge 
policy which would provide good support for concurrent merge.

As described before, a merge thread is started when a writer is created and 
stopped when the writer is closed. The merge process consists of three steps: 
first, create a merge task/spec; then, carry out the actual merge; finally, 
"commit" the merged segment (replace segments it merged in segmentInfos), but 
only after appropriate deletes are applied. The first and last steps are fast 
and synchronous. The second step is where concurrency is achieved. Does it make 
sense to capture them as separate steps in the factored merge policy?
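
The three steps could be modeled roughly like this. It is a toy sketch, not the patch: segments are plain strings, the class and method names are hypothetical, and the handling of deletes before the commit is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical three-step merge: fast synchronized spec, slow unsynchronized merge,
// fast synchronized commit.
class ThreeStepMergeSketch {
    private final List<String> segmentInfos = new ArrayList<>();

    ThreeStepMergeSketch(List<String> initial) {
        segmentInfos.addAll(initial);
    }

    // Step 1 (fast, synchronized): choose the first `count` segments to merge.
    synchronized List<String> createSpec(int count) {
        int n = Math.min(count, segmentInfos.size());
        return new ArrayList<>(segmentInfos.subList(0, n));
    }

    // Step 2 (slow, no lock held): the actual merge; here it just builds a name.
    static String doMerge(List<String> spec) {
        return String.join("+", spec);
    }

    // Step 3 (fast, synchronized): replace the merged segments in segmentInfos.
    synchronized void commit(List<String> spec, String merged) {
        if (spec.isEmpty()) {
            return;
        }
        int at = segmentInfos.indexOf(spec.get(0));
        if (at < 0) {
            throw new IllegalStateException("segments to replace are no longer present");
        }
        segmentInfos.removeAll(spec);
        segmentInfos.add(at, merged);
    }

    // New segments can still be flushed while step 2 is running.
    synchronized void addSegment(String name) {
        segmentInfos.add(name);
    }

    synchronized List<String> segments() {
        return new ArrayList<>(segmentInfos);
    }
}
```

Only the first and last steps hold the lock, so a segment flushed during the merge simply shows up after the merged segment once the commit runs.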

As discussed in 
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651:
 documents can be buffered while segments are merged, but no more than 
maxBufferedDocs can be buffered at any time. So this version provides limited 
concurrency. The main goal is to achieve short ingestion hiccups, especially 
when the ingestion rate is low. After the factored merge policy, we could 
provide different versions of concurrent merge policies which provide different 
levels of concurrency. :-)

All unit tests pass. If IndexWriter is replaced with 
IndexWriterConcurrentMerge, all unit tests pass except the following:
  - TestAddIndexesNoOptimize and TestIndexWriter*
    This is because they check segment sizes expecting all merges are done.
    These tests pass if these checks are performed after the concurrent merges
    finish. The modified tests (with waits for concurrent merges to finish) are in
    TestIndexWriterConcurrentMerge*.
  - testExactFieldNames in TestBackwardCompatibility and
    testDeleteLeftoverFiles in TestIndexFileDeleter
    In both cases, file name segments_a is expected, but the actual is
    segments_7. This is because with concurrent merge, if compound file is used,
    only the compound version is "committed" (added to segmentInfos), not the
    non-compound version, thus the lower segments generation number.
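
As a small illustration of the segments_a versus segments_7 difference: assuming the suffix of the segments file name is the commit generation written in base 36 (which is my understanding of the naming scheme), segments_a is generation 10 and segments_7 is generation 7. The helper class below is hypothetical.

```java
// Hypothetical helper showing the assumed base-36 encoding of the generation.
class SegmentsFileNameSketch {
    private static final String PREFIX = "segments_";

    // Builds the file name for a given generation.
    static String fileNameFromGeneration(long generation) {
        return PREFIX + Long.toString(generation, Character.MAX_RADIX);
    }

    // Recovers the generation from a file name such as "segments_a".
    static long generationFromFileName(String fileName) {
        return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
    }
}
```

Under that assumption the two tests saw three fewer commits than they expected, which fits the explanation that only the compound version of each merged segment is committed.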

Cheers,
Ning


> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Steven Parkes
> Assigned To: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.




[jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter

2007-03-23 Thread Steven Parkes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Parkes updated LUCENE-847:
-

Attachment: LUCENE-847.txt

Here's a first cut at a factored merge policy.

It's not polished. Sparsely commented and there are probably a few changes that 
should be backed out.

It factors a merge policy interface out of IndexWriter and creates an 
implementation of the existing merge policy.

Actually, it's a tweak on the existing merge policy. Currently the merge policy 
is implemented in ways that assume certain things about the existing list of 
segments. The factored version doesn't make these assumptions. It simplifies 
the interface but I'm not yet sure if there are bad side effects. Among other 
things I want to run performance tests.

There is part of a pass at a concurrent version of the current merge policy. 
It's not complete. I've been pushing it to see if I understand the issues 
around concurrent merges. Interesting topics are 1) how to control the merges 
2) how/when to cascade merges if they are happening in parallel and 3) how to 
handle synchronization of IndexWriter#segmentInfos. That last one in particular 
is a bit touchy.

I did a quick implementation of KS's fib merge policy but it's incomplete in 
that IndexWriter won't merge non-contiguous segment lists, but I think I can 
fix that fairly easily with no major side effects. The factored merge policy 
makes this plug in pretty clean ...

> Factor merge policy out of IndexWriter
> --
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Steven Parkes
> Assigned To: Steven Parkes
> Attachments: LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.
