Hi,

AFAIK the journal works this way:

Each change operation is stored in the journal with an increasing identifier (GLOBAL_REVISION contains the current highest value). The cluster nodes periodically check the journal (see the journal config) for new change operations made to the repository. If other nodes have made changes, this cluster node processes the difference in the journal, which brings its indexes up to date. If a new node is set up in this scenario, it simply processes all journal entries to build up the correct index for the new node.

If the janitor is set up, it deletes all journal entries that are no longer needed, meaning entries that all cluster nodes have already processed (this is checked via local_revisions). So if you just added a new cluster node, its index would not be complete.

The solution would be to shut down an existing cluster node (so the node doesn't change during the copy and setup of the new node), copy its local files (especially the index), and insert a correct entry for the new node in local_revisions.

I've never tested this, but that seems the way to go.

Regards,
Robert

-----Original Message-----
From: liang cheng [mailto:[email protected]]
Sent: Wednesday, 29 May 2013 09:27
To: [email protected]
Cc: [email protected]
Subject: about removing Old Revisions from journal table.

Hi, all

In our production environment, the Jackrabbit journal table grows large (more than 100,000 records) after running for 2 weeks. As a result, we plan to use the janitor thread to remove old revisions, as mentioned in http://wiki.apache.org/jackrabbit/Clustering#Removing Old Revisions. After enabling it, there are several caveats, as mentioned in the wiki page too.

1. If the janitor is enabled then you lose the possibility to easily add cluster nodes. (It is still possible but takes detailed knowledge of Jackrabbit.)
2. 
You must make sure that all cluster nodes have written their local revision to the database before the clean-up task runs for the first time, because otherwise cluster nodes might miss updates (because they have been purged) and their local caches and search indexes get out of sync.
3. If a cluster node is removed permanently from the cluster, then its entry in the LOCAL_REVISIONS table should be removed manually. Otherwise, the clean-up thread will not be effective.

I can understand point #3, but I am not quite sure about #1 and #2.

#1 is our biggest concern. In our production environment, there are cases where we need to add new cluster node(s), e.g. if system capacity cannot handle the current workload, or if a running node needs to be stopped for a while for maintenance and then a new node needs to be added.

In #1, you only say that "you lose the possibility to easily add cluster nodes", but don't give more explanation of the reason. As far as I know, when a new node is added to the JR cluster, it has no Lucene index, so Jackrabbit builds the index for all current repository nodes (starting from the root node). After this step, Jackrabbit then processes the revisions generated by other nodes. *I wonder what the possible issue is when processing old revisions with the latest repository content in cache and indexes?*

For #2, *does it mean any manual work is needed to keep the consistency?*

Although the wiki page gives one approach to add a new cluster node manually (i.e. clone indexes and local revision number from an existing node), we still hope there is some safe programmatic way to avoid the manual work, because our production is deployed in an Amazon EC2 environment and adding a new node needs to be as easy as possible.

Could you please give some comments on my concerns?

Thanks.

Regards,
-Liang
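
To make the journal and janitor behaviour discussed in this thread a bit more concrete, here is a small toy model in Python. It is only a sketch under assumptions, not Jackrabbit code: the `Journal` class, its methods, the node names and the `change-N` strings are all hypothetical, the "index" is just a list built purely from journal entries (as in Robert's description), and the janitor is assumed to purge every revision up to the lowest recorded local revision.

```python
class Journal:
    """Toy model of a shared cluster journal (NOT Jackrabbit's implementation)."""

    def __init__(self):
        self.entries = {}          # revision number -> change description
        self.global_revision = 0   # highest revision handed out so far
        self.local_revisions = {}  # node id -> last revision that node processed

    def append(self, change):
        """Record a change under the next revision number."""
        self.global_revision += 1
        self.entries[self.global_revision] = change
        return self.global_revision

    def sync(self, node_id, index):
        """Replay every revision the node has not seen yet into its index."""
        last_seen = self.local_revisions.get(node_id, 0)
        for revision in range(last_seen + 1, self.global_revision + 1):
            if revision in self.entries:  # purged revisions are simply missing
                index.append(self.entries[revision])
        self.local_revisions[node_id] = self.global_revision

    def run_janitor(self):
        """Assumed rule: purge revisions that every registered node has processed."""
        if not self.local_revisions:
            return
        floor = min(self.local_revisions.values())
        for revision in [r for r in self.entries if r <= floor]:
            del self.entries[revision]


journal = Journal()
index_a, index_b = [], []
for n in (1, 2, 3):
    journal.append(f"change-{n}")
journal.sync("a", index_a)
journal.sync("b", index_b)

journal.run_janitor()        # revisions 1-3 are purged: both nodes processed them
journal.append("change-4")

# Naive approach: a brand-new node replays the journal from the start,
# but the purged revisions are gone, so its index is incomplete.
index_c = []
journal.sync("c", index_c)

# Clone approach: copy the index of a stopped node and register the new node
# with that node's local revision, then replay only what came afterwards.
index_d = list(index_b)
journal.local_revisions["d"] = journal.local_revisions["b"]
journal.sync("d", index_d)

print(index_c)
print(index_d)
```

In this model node "c" ends up with only the change made after the purge, while node "d", cloned from "b" together with its local revision, ends up with all four changes. The model also shows caveat #3: an entry left behind in `local_revisions` by a removed node keeps the minimum low, so the janitor stops purging anything newer.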
