PageViewPlugin improvements, store statistics in JCR
----------------------------------------------------

                 Key: JSPWIKI-592
                 URL: https://issues.apache.org/jira/browse/JSPWIKI-592
             Project: JSPWiki
          Issue Type: Improvement
          Components: Plugins
    Affects Versions: 3.0
         Environment: All/NA
            Reporter: Harry Metske
            Priority: Minor


The current PageViewPlugin implementation stores the pageview counts in a file 
in the work directory.
This is fine for 2.8, but for 3.0 we would like to keep them in the 
repository, for the following reasons:
* the counts don't get lost when we clear the work directory (it is called 
*work* directory after all :-) )
* in a clustered environment every instance keeps its own counts in its own 
work directory, so none of them shows the correct total

There are a couple of options :

* store each count as an attribute of its own page Node
* store them all together (as a binary blob?) in one special page Node
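
To make the second option concrete, here is a minimal sketch of packing all 
counts into one byte array that could be stored as a single binary value. The 
class name PageCountBlob and the choice of java.util.Properties as the 
serialization format are assumptions made for illustration only; how the bytes 
would be written to a Node is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: packs all page counts into one blob and back. */
final class PageCountBlob {
    private PageCountBlob() {}

    /** Serializes "page name -> count" into a single byte array. */
    static byte[] encode(Map<String, Long> counts) {
        Properties props = new Properties();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            props.setProperty(e.getKey(), Long.toString(e.getValue()));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // Properties.store escapes spaces and non-Latin-1 characters in keys.
            props.store(out, null);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    /** Reads a blob produced by encode() back into a sorted map. */
    static Map<String, Long> decode(byte[] blob) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(blob));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        Map<String, Long> counts = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            counts.put(name, Long.parseLong(props.getProperty(name)));
        }
        return counts;
    }
}
```

With this layout one flush is a single write no matter how many pages have 
counts, whereas the attribute-per-page option needs one update per page.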

Considerations:

*Performance*

There is already a memory cache in the PageViewPlugin; the pageview statistics 
in this cache are saved every 5 minutes to the file in the workdir.
If the stats are stored in a single special page, performance would be roughly 
the same, since each flush is still one write.
If the pagecount is stored as an attribute of each page, performance will 
probably be worse, because each flush has to update every node that has a 
statistic entry in the memory cache.

*Cluster awareness*

When running in "scalable mode", i.e. multiple wiki instances sharing the same 
repo, we have two options to achieve "correct" pageview counts:
* update the repo (special page or attribute of each page) on each pageview 
(dramatic performance penalty)
* maintain a memory cache like we currently do, but keep only delta values in 
it: at each interval, add the deltas to the values in the repo, reset the 
memory counters to zero and start counting from zero again. Displayed pageview 
counts can then be out of date by up to one flush interval (you don't see the 
pageviews from other wiki members in the cluster until the flush interval 
expires and you reread the total value from the repo)
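
The delta approach in the second bullet could look roughly like the sketch 
below. Everything in it is hypothetical: the class DeltaPageViewCounter and 
its methods are invented for illustration, and a plain Map guarded by 
synchronized stands in for the shared repo and its lock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-instance counter that keeps only deltas in memory. */
final class DeltaPageViewCounter {
    // Views recorded on this instance since its last flush.
    private final ConcurrentHashMap<String, AtomicLong> deltas =
            new ConcurrentHashMap<>();
    // Totals as read from the shared store at the last flush.
    private volatile Map<String, Long> lastTotals = Collections.emptyMap();

    /** Called on every pageview; touches memory only. */
    void recordView(String page) {
        deltas.computeIfAbsent(page, k -> new AtomicLong()).incrementAndGet();
    }

    /**
     * Adds the local deltas to the shared totals, resets them to zero and
     * rereads the totals. The Map is a stand-in for the repo; synchronizing
     * on it is a stand-in for locking the shared data.
     */
    void flush(Map<String, Long> sharedTotals) {
        synchronized (sharedTotals) {
            for (Map.Entry<String, AtomicLong> e : deltas.entrySet()) {
                // getAndSet(0) reads and resets atomically, so a view that
                // arrives during the flush is kept for the next interval.
                long delta = e.getValue().getAndSet(0);
                if (delta != 0) {
                    sharedTotals.merge(e.getKey(), delta, Long::sum);
                }
            }
            lastTotals = new HashMap<>(sharedTotals);
        }
    }

    /** Count as of the last flush; can lag by up to one flush interval. */
    long displayedCount(String page) {
        return lastTotals.getOrDefault(page, 0L);
    }
}
```

In a real instance flush() would run on a timer at the flush interval. Two 
instances that flush into the same store converge on the same total, but each 
one only sees the other's views after its own next flush.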


My personal feeling is that the best option (trade-off between accuracy, 
complexity and performance) is to keep an in-memory cache and store the values 
in a single special non-versioned page that can be locked to guarantee serial 
access to the data.

Any other comments, suggestions, options are welcome here....

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
