[ 
https://issues.apache.org/jira/browse/LUCENE-9406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242858#comment-17242858
 ] 

Zach Chen commented on LUCENE-9406:
-----------------------------------

Hi [~mikemccand], I'm trying to find a new task to work on and see this. I took 
a look at the comments in the PR and see that you already have some work in 
progress code such as the interface. Just curious, has the discussion been 
carried out further in any way after that PR, and if this task is ready to be 
picked up again at this point (following your original approach to use 
*IndexWriterEvents* class maybe) ?

> Make it simpler to track IndexWriter's events
> ---------------------------------------------
>
>                 Key: LUCENE-9406
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9406
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Priority: Major
>
> This is the second spinoff from a [controversial PR to add a new index-time 
> feature to Lucene to merge small segments during 
> commit|https://github.com/apache/lucene-solr/pull/1552].  That change can 
> substantially reduce the number of small index segments to search.
> In that PR, there was a new proposed interface, {{IndexWriterEvents}}, giving 
> the application a chance to track when {{IndexWriter}} kicked off merges 
> during commit, how many, how long it waited, how often it gave up waiting, 
> etc.
> Such telemetry from production usage is really helpful when tuning settings 
> like which merges (e.g. a size threshold) to attempt on commit, and how long 
> to wait during commit, etc.
> I am splitting out this issue to explore possible approaches to do this.  
> E.g. [~simonw] proposed using a statistics class instead, but if I understood 
> that correctly, I think that would put the role of aggregation inside 
> {{IndexWriter}}, which is not ideal.
> Many interesting events, e.g. how many merges are being requested, how large 
> are they, how long did they take to complete or fail, etc., can be gleaned by 
> wrapping expert Lucene classes like {{MergePolicy}} and {{MergeScheduler}}.  
> But for those events that cannot (e.g. {{IndexWriter}} stopped waiting for 
> merges during commit), it would be very helpful to have some simple way to 
> track so applications can better tune.
> It is also possible to subclass {{IndexWriter}} and override key methods, but 
> I think that is inherently risky as {{IndexWriter}}'s protected methods are 
> not considered to be a stable API, and the synchronization used by 
> {{IndexWriter}} is confusing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to