Re: about removing Old Revisions from journal table.

Seidel. Robert Wed, 29 May 2013 01:00:21 -0700

Hi,

#2 - you can ignore this one. The janitor only deletes entries that are older 
than the lowest entry in the local revisions table. So only a newly set-up 
cluster node that has never written its local revision entry at all would be 
affected.
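
To illustrate the rule, here is a minimal Python sketch of the clean-up logic 
as described above. It is not Jackrabbit's actual code: the function name 
`purge_old_revisions`, the list standing in for the journal table, and the dict 
standing in for the local revisions table are all assumptions made for the 
example.

```python
def purge_old_revisions(journal, local_revisions):
    """Return the journal entries that survive one janitor run.

    journal:         list of revision ids (hypothetical stand-in for the
                     journal table)
    local_revisions: dict of node id -> last revision that node processed
                     (hypothetical stand-in for the local revisions table)
    """
    if not local_revisions:
        # Assumption of this sketch: with no registered node there is no
        # safe threshold, so nothing is deleted.
        return list(journal)
    threshold = min(local_revisions.values())
    # Only entries older than the lowest local revision are deleted.
    return [rev for rev in journal if rev >= threshold]


journal = [10, 20, 30, 40, 50]
# node-a and node-b have both written their local revision.
kept = purge_old_revisions(journal, {"node-a": 30, "node-b": 50})
print(kept)  # [30, 40, 50]
```

A node that has never written its entry simply does not appear in the dict, so 
it cannot hold the threshold back. That is the only case in which a node could 
miss purged updates.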


Regards, Robert

-----Original Message-----
From: liang cheng [mailto:[email protected]]
Sent: Wednesday, 29 May 2013 09:27
To: [email protected]
Cc: [email protected]
Subject: about removing Old Revisions from journal table.

 Hi, all
   In our production environment, the Jackrabbit journal table grows large 
(more than 100,000 records) after running for two weeks. As a result, we plan 
to use the janitor thread to remove old revisions, as described in 
http://wiki.apache.org/jackrabbit/Clustering#Removing Old Revisions.
  Enabling it comes with several caveats, which are also listed on the wiki 
page:
       1. If the janitor is enabled then you lose the possibility to easily 
add cluster nodes. (It is still possible but takes detailed knowledge of 
Jackrabbit.)
       2. You must make sure that all cluster nodes have written their local 
revision to the database before the clean-up task runs for the first 
time, because otherwise cluster nodes might miss updates (because they 
have been purged) and their local caches and search indexes get out of sync.
       3. If a cluster node is removed permanently from the cluster, then its 
entry in the LOCAL_REVISIONS table should be removed manually. 
Otherwise, the clean-up thread will not be effective.
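
Caveat 3 can be pictured with a minimal Python sketch. It assumes that the 
janitor purges only journal entries below the lowest value stored in 
LOCAL_REVISIONS. The dict standing in for that table, the node names, and the 
function `cleanup_threshold` are hypothetical and not part of Jackrabbit.

```python
def cleanup_threshold(local_revisions):
    """Lowest revision any registered node still needs (None if no entries)."""
    return min(local_revisions.values(), default=None)


# Hypothetical table contents: "retired-node" was shut down for good long ago,
# but its row was never deleted.
revisions = {"node-a": 900, "node-b": 950, "retired-node": 120}
print(cleanup_threshold(revisions))  # 120: everything from 120 on is kept

# Removing the stale row by hand lets the threshold advance again.
del revisions["retired-node"]
print(cleanup_threshold(revisions))  # 900
```

As long as the stale row exists, the threshold stays at the retired node's last 
revision, so the journal keeps growing even though the janitor is running.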

  I understand point #3, but I am not quite sure about #1 and #2.

  #1 is our biggest concern. In our production environment, there are cases 
where we need to add new cluster node(s), e.g. if system capacity cannot handle 
the current workload, or if a running node has to be stopped for a while for 
maintenance and a new node has to be added. #1 only says that "you 
lose the possibility to easily add cluster nodes", but doesn't explain 
the reason. As far as I know, when a new node is added to the JR 
cluster, it has no Lucene index, so Jackrabbit builds the index for 
all nodes of the current repository (starting from the root node). After this 
step, Jackrabbit processes the revisions generated by the other nodes. *I wonder 
what the possible issue is when old revisions are processed against the latest 
repository content in the cache and indexes.*

  For #2, *does it mean that manual work is needed to maintain consistency?*



  Although the wiki page gives one approach for adding a new cluster node 
manually (i.e. cloning the indexes and local revision number from an existing 
node), we still hope there is a safe programmatic way to avoid the manual work, 
because our production system is deployed on Amazon EC2 and adding a new node 
needs to be as easy as possible.

  Could you please comment on my concerns? Thanks.


Regards,

-Liang