Sergiy Shyrkov created JCR-3560:
-----------------------------------

             Summary: CachingHierarchyManager.stateDiscarded() call to 
provider.hasItemState(discarded.getId())
                 Key: JCR-3560
                 URL: https://issues.apache.org/jira/browse/JCR-3560
             Project: Jackrabbit Content Repository
          Issue Type: Bug
          Components: jackrabbit-core
    Affects Versions: 2.6, 2.5.3, 2.4.3, 2.2.13
            Reporter: Sergiy Shyrkov
            Priority: Minor


As a follow-up to the discussion on the users mailing list: 
http://markmail.org/thread/w4ubczmddkqdzoac

Hello guys,

I would need your help in understanding the code in the 
CachingHierarchyManager.stateDiscarded() method (Jackrabbit 2.2.4). A short 
background: I am working on a tool to purge orphaned version history from the 
repository (version history for nodes, which are no longer present in the 
repository). The version store is quite large (the row count in the version 
bundle table is around 12 000 000 entries). When profiling the tool, I see in 
the snapshot that CachingHierarchyManager.stateDiscarded() calls 
provider.hasItemState(discarded.getId()). In my case this happens quite often, 
and each provider.hasItemState() call goes to the DB and loads the bundle. The 
thing is that the item is no longer present, so provider.hasItemState() 
evaluates to false.

In my case the call comes from 
org.apache.jackrabbit.core.state.ChangeLog.persisted(), which contains the 
following lines:

...
for (ItemState state : deletedStates()) {
    state.setStatus(ItemState.STATUS_EXISTING_REMOVED);
    state.notifyStateDestroyed();
    state.discard();
}
...

state.discard() in turn calls notifyStateDiscarded().
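
To make the sequence concrete, here is a minimal, self-contained sketch of the notification flow as I understand it. It does not use the real Jackrabbit classes: DiscardFlowSketch, State, Listener and persistDeleted() are names made up for illustration, and the model only mirrors the three calls quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the notification flow; NOT the real Jackrabbit classes.
class DiscardFlowSketch {
    /** Hypothetical listener with the two callbacks discussed above. */
    interface Listener {
        void stateDestroyed(String id);
        void stateDiscarded(String id);
    }

    /** Minimal model of an item state that notifies its listeners. */
    static class State {
        final String id;
        final List<Listener> listeners = new ArrayList<>();

        State(String id) { this.id = id; }

        void notifyStateDestroyed() {
            for (Listener l : listeners) l.stateDestroyed(id);
        }

        void notifyStateDiscarded() {
            for (Listener l : listeners) l.stateDiscarded(id);
        }

        // As described above, discard() triggers the "discarded" notification.
        void discard() { notifyStateDiscarded(); }
    }

    /** Mirrors the loop body in persisted() and returns the events one listener sees. */
    static List<String> persistDeleted(State state) {
        List<String> events = new ArrayList<>();
        state.listeners.add(new Listener() {
            public void stateDestroyed(String id) { events.add("destroyed:" + id); }
            public void stateDiscarded(String id) { events.add("discarded:" + id); }
        });
        state.notifyStateDestroyed();
        state.discard();
        return events;
    }
}
```

In this model every listener hears about the same deleted item twice, first as destroyed and then as discarded, which is the pattern I see in the profiler.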

Is it really needed here after we have already called 
state.notifyStateDestroyed()?

If yes, is there any way we can optimize the check in the 
CachingHierarchyManager.stateDiscarded() to not call provider.hasItemState() in 
some cases? Perhaps check for discarded.getStatus() != 
ItemState.STATUS_EXISTING_REMOVED first or similar?
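
A rough sketch of the kind of guard I have in mind, using simplified stand-ins rather than the real Jackrabbit classes. GuardSketch, CountingProvider and existsAfterDiscard() are hypothetical names, the status constants are placeholder values, and the sketch assumes that a state with status EXISTING_REMOVED can never be found by the provider (which matches what I observe, but I have not verified it for all cases):

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-ins; NOT the real Jackrabbit classes.
class GuardSketch {
    // Placeholder status values, used for illustration only.
    static final int STATUS_EXISTING = 1;
    static final int STATUS_EXISTING_REMOVED = 3;

    /** Stand-in for the item state provider; counts the expensive lookups. */
    static class CountingProvider {
        final Set<String> persisted = new HashSet<>();
        int lookups = 0;

        boolean hasItemState(String id) {
            lookups++; // in the real code this is where the DB may be hit
            return persisted.contains(id);
        }
    }

    /**
     * Returns whether the discarded item still exists, skipping the provider
     * lookup when the status already says the item was removed.
     */
    static boolean existsAfterDiscard(CountingProvider provider, int status, String id) {
        if (status == STATUS_EXISTING_REMOVED) {
            return false; // assumption: a removed item is never found by the provider
        }
        return provider.hasItemState(id);
    }
}
```

With such a check the deleted states coming from ChangeLog.persisted() would not reach the persistence manager at all, while all other states would go through provider.hasItemState() as before.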

I am attaching the profiler snapshot 
(CachingHierarchyManager-stateDiscarded.png). Also provided here: 
http://img580.imageshack.us/img580/7179/cachinghierarchymanager.png
And just in case the image won't get through the mailing list, here is a text 
representation of the method call trace: 

org.apache.jackrabbit.core.state.ChangeLog.persisted()
 - org.apache.jackrabbit.core.state.ItemState.discard()
  - org.apache.jackrabbit.core.state.ItemState.notifyStateDiscarded()
   - org.apache.jackrabbit.core.state.SharedItemStateManager.stateDiscarded(ItemState)
    - org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyStateDiscarded(ItemState)
     - org.apache.jackrabbit.core.state.SharedItemStateManager.stateDiscarded(ItemState)
      - org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyStateDiscarded(ItemState)
       - org.apache.jackrabbit.core.CachingHierarchyManager.stateDiscarded(ItemState)
        - org.apache.jackrabbit.core.state.SharedItemStateManager.hasItemState(ItemId)
         - org.apache.jackrabbit.core.state.SharedItemStateManager.hasNonVirtualItemState(ItemId)
          - org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.exists(PropertyId)
           - org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.getBundle(NodeId)
            - org.apache.jackrabbit.core.persistence.pool.BundleDbPersistenceManager.loadBundle(NodeId)
             - org.apache.jackrabbit.core.util.db.ConnectionHelper.exec(String, Object[], boolean, int)


Thank you in advance for any hints!
Sergiy Shyrkov

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
