[ 
https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2573:
------------------------------------

    Attachment: LUCENE-2573.patch

Here is my current state on this issue. I didn't add all the JDocs needed (by 
far); I will wait until we have settled on the API for FlushPolicy.

* I removed the complex TieredFlushPolicy entirely and added one 
DefaultFlushPolicy that flushes at IWC.getRAMBufferSizeMB() by setting the 
biggest DWPT pending.
* DW will stall threads if we reach 2 x maxNetRam, which is retrieved from 
FlushPolicy, so folks can lower that depending on their env.

* DWFlushControl checks if a single DWPT grows too large and sets it forcefully 
pending once its ram consumption is > 1.9 GB. That should be enough buffer to 
not reach the 2048MB limit. We should consider making this configurable.
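
The three rules above can be sketched roughly as follows. This is not the code 
from the patch: WriterSketch and FlushControlSketch are hypothetical stand-ins 
for DWPT and DWFlushControl, and using one value for both the flush trigger and 
the stall limit is an assumption. Only the thresholds (the RAM buffer size, 2 x 
the max RAM, and about 1.9 GB per DWPT) come from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DWPT: it only tracks RAM use and a pending flag.
final class WriterSketch {
    long bytesUsed;
    boolean flushPending;

    WriterSketch(long bytesUsed) {
        this.bytesUsed = bytesUsed;
    }
}

// Sketch of the flush rules described above; not the code from the patch.
final class FlushControlSketch {
    // Assumed hard per-writer limit of 1.9 GB, below the 2048 MB limit.
    static final long HARD_LIMIT_BYTES = (long) (1.9 * 1024 * 1024 * 1024);

    final long maxRamBytes; // plays the role of IWC.getRAMBufferSizeMB()
    final List<WriterSketch> writers = new ArrayList<>();

    FlushControlSketch(double ramBufferSizeMB) {
        this.maxRamBytes = (long) (ramBufferSizeMB * 1024 * 1024);
    }

    // RAM held by writers that are not yet marked for flushing.
    long activeBytes() {
        long sum = 0;
        for (WriterSketch w : writers) {
            if (!w.flushPending) sum += w.bytesUsed;
        }
        return sum;
    }

    // RAM held by all writers, including those waiting to be flushed.
    long totalBytes() {
        long sum = 0;
        for (WriterSketch w : writers) sum += w.bytesUsed;
        return sum;
    }

    // Called after a document was added to writer w.
    void afterInsert(WriterSketch w) {
        // Force a single oversized writer to flush.
        if (w.bytesUsed > HARD_LIMIT_BYTES) {
            w.flushPending = true;
        }
        // Once the RAM buffer is full, mark the biggest active writer pending.
        if (activeBytes() >= maxRamBytes) {
            WriterSketch biggest = null;
            for (WriterSketch c : writers) {
                if (!c.flushPending
                        && (biggest == null || c.bytesUsed > biggest.bytesUsed)) {
                    biggest = c;
                }
            }
            if (biggest != null) biggest.flushPending = true;
        }
    }

    // Stall indexing threads at twice the configured RAM.
    boolean shouldStall() {
        return totalBytes() >= 2 * maxRamBytes;
    }
}
```

In this sketch a pending writer still counts towards totalBytes() until it is 
removed from the list, which is why stalling can kick in while flushes are slow.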

* FlushPolicy now has three methods: onInsert, onUpdate and onDelete. 
DefaultFlushPolicy only implements onInsert and onDelete; the abstract base 
class just calls those on an update.
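
A minimal sketch of that delegation, assuming simplified signatures: the real 
FlushPolicy methods in the patch take other arguments, and both the String 
parameter and the order of the two calls in onUpdate are assumptions made for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real FlushPolicy signatures in the patch may differ.
abstract class FlushPolicySketch {
    abstract void onInsert(String writerName);

    abstract void onDelete(String writerName);

    // Default: treat an update as both events. The call order is an assumption.
    void onUpdate(String writerName) {
        onInsert(writerName);
        onDelete(writerName);
    }
}

// Example subclass that only records which callbacks fired.
final class RecordingFlushPolicy extends FlushPolicySketch {
    final List<String> calls = new ArrayList<>();

    @Override
    void onInsert(String writerName) {
        calls.add("insert:" + writerName);
    }

    @Override
    void onDelete(String writerName) {
        calls.add("delete:" + writerName);
    }
}
```

A policy that needs special handling for updates can still override onUpdate; 
one that doesn't only has to implement the other two.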

* I removed FlushControl from IW
* added documentation on IWC for FlushPolicy and removed the jdocs for the RAM 
limit. I think we should add some lines about how RAM is now used and that 
users should balance the RAM with the number of threads they are using. Will do 
that later on though.

* For testing I added a ThrottledIndexOutput that makes flushing slow so I can 
test whether we are stalled and/or blocked. This is passed to 
MockDirectoryWrapper. It's currently under util, but it should rather go under 
store, no?
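
To illustrate the throttling idea only: the sketch below wraps a plain 
java.io.OutputStream rather than a Lucene IndexOutput, and the class name, the 
bytesPerPause / pauseMillis parameters and the pause counter are all 
hypothetical. The ThrottledIndexOutput in the patch may work differently.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;

// Hypothetical sketch: slows down writes by sleeping every few bytes.
final class ThrottledOutputStream extends FilterOutputStream {
    private final int bytesPerPause;
    private final long pauseMillis;
    private int sincePause;
    int pauses; // number of pauses taken so far, handy for tests

    ThrottledOutputStream(OutputStream out, int bytesPerPause, long pauseMillis) {
        super(out);
        this.bytesPerPause = bytesPerPause;
        this.pauseMillis = pauseMillis;
    }

    // FilterOutputStream routes the array-based writes through this method,
    // so overriding it is enough to throttle every byte.
    @Override
    public void write(int b) throws IOException {
        out.write(b);
        if (++sincePause >= bytesPerPause) {
            sincePause = 0;
            pauses++;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

Counting pauses instead of measuring elapsed time keeps a test on top of this 
independent of how fast the machine happens to be.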

* Byte consumption is now committed before FlushPolicy is called, since we no 
longer have the multi-tier flush which required that to reliably proceed across 
tier boundaries (not strictly required, but it was easier to test). So FP 
doesn't need to take care of the delta.

* FlushPolicy now also flushes on maxBufferedDeleteTerms, but the buffered 
delete terms count is not yet connected to DW#getNumBufferedDeleteTerms(), 
which causes some test failures. I added //nocommit & @Ignore to those tests.

* This patch also contains an @Ignore on TestPersistentSnapshotDeletionPolicy. 
I couldn't figure out why it is failing, but it could be due to an old version 
of LUCENE-2881 on this branch. I will see if it still fails once we have 
merged.

* Healthiness now doesn't stall if we are not flushing on RAM consumption to 
ensure we don't lock in threads. 


Overall this seems much closer now. I will start writing jdocs. Flush on 
buffered delete terms might need some tests, and I should also write a more 
reliable test for Healthiness. Currently it relies on the ThrottledIndexOutput 
slowing down indexing enough to block, which might not be true all the time. 
It hasn't failed yet, though.



> Tiered flushing of DWPTs by RAM with low/high water marks
> ---------------------------------------------------------
>
>                 Key: LUCENE-2573
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2573
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, 
> LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, 
> LUCENE-2573.patch
>
>
> Now that we have DocumentsWriterPerThreads we need to track total consumed 
> RAM across all DWPTs.
> A flushing strategy idea that was discussed in LUCENE-2324 was to use a 
> tiered approach:  
> - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
> - Flush all DWPTs at a high water mark (e.g. at 110%)
> - Use linear steps in between high and low watermark:  E.g. when 5 DWPTs are 
> used, flush at 90%, 95%, 100%, 105% and 110%.
> Should we allow the user to configure the low and high water mark values 
> explicitly using total values (e.g. low water mark at 120MB, high water mark 
> at 140MB)?  Or shall we keep for simplicity the single setRAMBufferSizeMB() 
> config method and use something like 90% and 110% for the water marks?
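
For reference, the linear steps in the quoted issue description work out as in 
the sketch below. The class and method names are hypothetical, the water marks 
are expressed as fractions of the allowed RAM, and returning just the low water 
mark for a single DWPT is an assumption.

```java
// Hypothetical helper: per-DWPT flush thresholds as fractions of allowed RAM.
final class TieredWaterMarks {
    static double[] thresholds(int numWriters, double low, double high) {
        if (numWriters < 1) {
            throw new IllegalArgumentException("numWriters must be >= 1");
        }
        double[] result = new double[numWriters];
        if (numWriters == 1) {
            result[0] = low; // assumption: a single writer flushes at the low mark
            return result;
        }
        // Spread the thresholds evenly from the low to the high water mark.
        double step = (high - low) / (numWriters - 1);
        for (int i = 0; i < numWriters; i++) {
            result[i] = low + i * step;
        }
        return result;
    }
}
```

With 5 DWPTs, a low mark of 0.90 and a high mark of 1.10, this yields the 90%, 
95%, 100%, 105% and 110% steps from the description, up to floating-point 
rounding.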

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
