[ https://issues.apache.org/jira/browse/SOLR-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Renaud Delbru updated SOLR-6460:
--------------------------------
    Attachment: SOLR-6460.patch

Here is the latest patch, which includes an optimisation to reduce the number of open files and some code cleanup. To summarise, the current patch provides the following:

h4. Cleaning of Old Transaction Logs

The CdcrUpdateLog removes old tlogs based on pointers instead of a fixed size limit.

h4. Log Reader

The CdcrUpdateLog provides a log reader with scan and seek operations. A log reader is associated with a log pointer and manages the life cycle of that pointer.

h4. Log Index

To improve the efficiency of the log reader's seek operation, an index of transaction log files has been added. This index makes it possible to quickly look up a tlog file by version number. It is implemented by adding a version number to the tlog filename and by leveraging the file system index. This solution was chosen because it is simpler and more robust than managing a separate disk-based index.

h4. Number of Open Files

TransactionLog has been extended to automatically (1) close the output stream when its reference count reaches 0, and (2) reopen the output stream on demand. The new tlog (the current tlog being written) is kept open at all times. When a transaction log is pushed to the old tlog list, its reference count is decremented, which might trigger the closing of the output stream. The output stream is reopened in two cases:
* during recovery, to write a commit to the end of an uncapped tlog file;
* when a log reader is accessing it.

At the moment, the logic is split across two classes (TransactionLog and CdcrTransactionLog). We should probably merge the two in the final version.

h4. Integration within the UpdateHandler

There is a nocommit in the UpdateHandler that forces the instantiation of the CdcrUpdateLog instead of the UpdateLog. We need to decide how users will configure this and modify the UpdateHandler accordingly.
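To make the Log Index idea more concrete, here is a minimal sketch of a lookup that relies only on file names. It is not taken from the patch: the class name {{TlogNameIndex}}, the naming scheme {{tlog.<id>.<version>}}, and the assumption that the embedded number is the version of the first update in the file are all hypothetical and used for illustration only. In practice the list of names would come from a listing of the tlog directory.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration only; names and file layout are assumptions,
// not the classes or naming scheme used by the attached patch.
class TlogNameIndex {
    static final String PREFIX = "tlog.";

    // Assumed naming scheme: tlog.<zero-padded id>.<version of first update>.
    // The version is taken from the text after the last dot.
    static long startVersion(String fileName) {
        return Long.parseLong(fileName.substring(fileName.lastIndexOf('.') + 1));
    }

    // Returns the file whose starting version is the greatest one that is
    // less than or equal to the requested version, if such a file exists.
    static Optional<String> lookup(List<String> fileNames, long version) {
        TreeMap<Long, String> byStartVersion = new TreeMap<>();
        for (String name : fileNames) {
            if (name.startsWith(PREFIX)) {
                byStartVersion.put(startVersion(name), name);
            }
        }
        Map.Entry<Long, String> entry = byStartVersion.floorEntry(version);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }

    public static void main(String[] args) {
        // In a real setting these names would come from listing the tlog directory.
        List<String> names = Arrays.asList(
            "tlog.0000000000000000001.100",
            "tlog.0000000000000000002.200",
            "tlog.0000000000000000003.300");
        System.out.println(lookup(names, 250L));
    }
}
```

Because the version is part of the file name, a directory listing is enough to rebuild the mapping, so no separate index file has to be kept consistent with the logs. The sketch assumes positive, increasing versions across files.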
> Keep transaction logs around longer
> -----------------------------------
>
>                 Key: SOLR-6460
>                 URL: https://issues.apache.org/jira/browse/SOLR-6460
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
>         Attachments: SOLR-6460.patch, SOLR-6460.patch, SOLR-6460.patch
>
>
> Transaction logs are currently deleted relatively quickly... but we need to
> keep them around much longer to be used as a source for cross-datacenter
> recovery. This will also be useful in the future for enabling peer-sync to
> use more historical updates before falling back to replication.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)