[ 
https://issues.apache.org/jira/browse/SOLR-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renaud Delbru updated SOLR-6460:
--------------------------------
    Attachment: SOLR-6460.patch

Here is the latest patch, which includes an optimisation to reduce the number 
of open files and some code cleanup. To summarise, the current patch provides 
the following:

h4. Cleaning of Old Transaction Logs

The CdcrUpdateLog removes old tlogs based on pointers instead of a fixed size 
limit.
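
As a rough illustration of pointer-based cleaning (a minimal sketch, not the 
code in the patch): the class name PointerBasedCleaner, the numeric tlog ids and 
the rule that no pointer means nothing is retained are all assumptions made for 
this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the CdcrUpdateLog code: tlogs are identified by an
// increasing id and each pointer records the id of the tlog it is reading.
class PointerBasedCleaner {
  private final Deque<Long> oldTlogs = new ArrayDeque<>(); // oldest first
  private final Map<String, Long> pointers = new HashMap<>();

  void addOldTlog(long tlogId) {
    oldTlogs.addLast(tlogId);
  }

  void movePointer(String reader, long tlogId) {
    pointers.put(reader, tlogId);
  }

  // Removes and returns every tlog older than the oldest tlog a pointer still
  // references. Assumption: with no pointers registered, nothing is retained.
  List<Long> clean() {
    long oldestNeeded = Long.MAX_VALUE;
    for (long id : pointers.values()) {
      oldestNeeded = Math.min(oldestNeeded, id);
    }
    List<Long> removed = new ArrayList<>();
    while (!oldTlogs.isEmpty() && oldTlogs.peekFirst() < oldestNeeded) {
      removed.add(oldTlogs.pollFirst());
    }
    return removed;
  }
}
```

In this sketch the amount of retained history depends only on the slowest 
pointer, not on a fixed number of tlogs.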

h4. Log Reader

The CdcrUpdateLog provides a log reader with scan and seek operations. A log 
reader is associated with a log pointer and manages the life cycle of that 
pointer.

h4. Log Index

To improve the efficiency of the seek operation of the log reader, an index of 
transaction log files has been added. This index makes it possible to quickly 
look up a tlog file based on a version number. The index is implemented by 
adding a version number to the tlog filename and by leveraging the file system 
index. This solution was chosen as it is simpler and more robust than managing 
a separate disk-based index.
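
To make the lookup concrete, here is a minimal sketch of how a version encoded 
in the filename could serve as an index. The filename format 
"tlog.<sequence>.<startVersion>", the class TlogNameIndex and its methods are 
assumptions for illustration only (the patch may name files differently), and 
versions are assumed to be positive.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: file names are assumed to end with the version of the
// first update they contain, e.g. "tlog.0000000002.250".
class TlogNameIndex {

  // Parses the version that follows the last '.' in the file name.
  static long startVersion(String fileName) {
    return Long.parseLong(fileName.substring(fileName.lastIndexOf('.') + 1));
  }

  // Returns the file with the greatest start version <= targetVersion,
  // or null when every file starts after the target.
  static String lookup(Collection<String> fileNames, long targetVersion) {
    TreeMap<Long, String> byVersion = new TreeMap<>();
    for (String name : fileNames) {
      byVersion.put(startVersion(name), name);
    }
    Map.Entry<Long, String> entry = byVersion.floorEntry(targetVersion);
    return entry == null ? null : entry.getValue();
  }
}
```

With such a scheme, a seek only needs a directory listing to pick the tlog to 
open; no separate index file has to be kept in sync with the logs.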

h4. Number of Open Files

TransactionLog has been extended to automatically (1) close the output stream 
when its reference count reaches 0, and (2) reopen the output stream on demand. 
The new tlog (the current tlog being written) is kept open at all times. When a 
transaction log is pushed to the old tlog list, its reference count is 
decremented, which might trigger the closing of the output stream. 
The output stream is reopened in two cases:
* during recovery, to write a commit to the end of an uncapped tlog file;
* when a log reader is accessing it.

At the moment, the logic is split across two classes (TransactionLog and 
CdcrTransactionLog). We should probably merge the two in the final version.
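
The close-on-zero and reopen-on-demand behaviour described in this section can 
be sketched as follows. This is an illustration under stated assumptions, not 
the TransactionLog implementation: the class RefCountedLog and its methods are 
hypothetical, and reopening in append mode is an assumption.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Path;

// Hypothetical sketch: the output stream is only held open while at least one
// reference exists, and is reopened when the first new reference is taken.
class RefCountedLog {
  private final Path path;
  private OutputStream out; // null while the stream is closed
  private int refCount;

  RefCountedLog(Path path) {
    this.path = path;
  }

  // Takes a reference; reopens the stream in append mode if it was closed.
  synchronized void incref() throws IOException {
    if (refCount == 0) {
      out = new FileOutputStream(path.toFile(), true);
    }
    refCount++;
  }

  // Releases a reference; closes the stream when the count reaches 0.
  synchronized void decref() throws IOException {
    if (refCount <= 0) {
      throw new IllegalStateException("decref without matching incref");
    }
    if (--refCount == 0) {
      out.close();
      out = null;
    }
  }

  synchronized void write(byte[] data) throws IOException {
    if (out == null) {
      throw new IllegalStateException("log is closed; call incref() first");
    }
    out.write(data);
  }

  synchronized boolean isOpen() {
    return out != null;
  }
}
```

In this sketch, an old tlog that no reader is using holds no file handle, so 
the number of open files follows the number of active references rather than 
the number of retained tlogs.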

h4. Integration within the UpdateHandler

There is a nocommit in the UpdateHandler to force the instantiation of the 
CdcrUpdateLog instead of the UpdateLog. We need to decide how users will 
configure this and modify the UpdateHandler accordingly.


> Keep transaction logs around longer
> -----------------------------------
>
>                 Key: SOLR-6460
>                 URL: https://issues.apache.org/jira/browse/SOLR-6460
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
>         Attachments: SOLR-6460.patch, SOLR-6460.patch, SOLR-6460.patch
>
>
> Transaction logs are currently deleted relatively quickly... but we need to 
> keep them around much longer to be used as a source for cross-datacenter 
> recovery.  This will also be useful in the future for enabling peer-sync to 
> use more historical updates before falling back to replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
