[ 
https://jira.duraspace.org/browse/DS-892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20325#action_20325
 ] 

Bram Luyten (@mire) commented on DS-892:
----------------------------------------

At the risk of going out of scope on this issue, I was wondering what the 
desired behavior would be with regard to updates/changes of statistics after 
moves or deletes of items.

On the one hand, you could argue that statistics from the past should be 
immutable to ensure consistency with previous reports. E.g., if I'm a repository 
manager making monthly reports to my director about bitstream downloads for 
each of the collections, I don't want the current interface to contradict 
figures I reported in the past. In that respect, item pageviews and 
downloads should only start counting for the new collection or community they 
are added to after the move.

However, I can imagine there are use cases that call for the opposite. For 
example, if a collection represents a department, and the department is split 
into two new departments (i.e. two new collections), you do want to transfer 
those historical downloads and pageviews to the new collections.
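
To make the two options concrete, here is a minimal sketch in Python. It is 
only an illustration under stated assumptions: the Hit record and the 
move_item and hits_per_collection functions are hypothetical names, not DSpace 
or Solr APIs, and real statistics records carry many more fields.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One recorded view or download (hypothetical, not the DSpace Solr schema)."""
    item_id: int
    collection_id: int  # owning collection at the time the hit was logged


def move_item(hits, item_id, new_collection, transfer_history):
    """Return the hit list as it should look after an item has moved.

    transfer_history=False: past hits keep their original collection, so
    earlier reports stay reproducible.
    transfer_history=True: past hits of the item are re-attributed to the
    new collection (the "department split" case).
    """
    if not transfer_history:
        return list(hits)
    return [
        Hit(h.item_id, new_collection) if h.item_id == item_id else h
        for h in hits
    ]


def hits_per_collection(hits):
    """Count hits per collection id."""
    return Counter(h.collection_id for h in hits)


# Item 7 was downloaded three times while it lived in collection 1;
# item 8 was downloaded once and never moves.
history = [Hit(7, 1), Hit(7, 1), Hit(7, 1), Hit(8, 1)]

immutable = hits_per_collection(
    move_item(history, item_id=7, new_collection=2, transfer_history=False))
transferred = hits_per_collection(
    move_item(history, item_id=7, new_collection=2, transfer_history=True))
```

With immutable history, collection 1 keeps all four hits and collection 2 
starts from zero; with transfer, three hits follow item 7 to collection 2. 
Which of the two is right probably depends on the reporting use case, so it 
may need to be a configurable choice.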



> Performance issues in update enabling the StatisticsLoggingConsumer
> -------------------------------------------------------------------
>
>                 Key: DS-892
>                 URL: https://jira.duraspace.org/browse/DS-892
>             Project: DSpace
>          Issue Type: Bug
>          Components: Solr
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1
>            Reporter: Andrea Bollini
>            Priority: Critical
>
> We have found that when the StatisticsLoggingConsumer is enabled to keep 
> statistics data up-to-date after item changes (metadata edit or collection 
> moving/mapping), item update operations become slow and the system becomes 
> unusable.
> NOTE: the StatisticsLoggingConsumer is NOT enabled out of the box in 
> dspace.cfg; this implies that your statistics data could be inconsistent 
> (item accesses assigned to incorrect communities/collections).
> We noticed problems when there is a large amount of statistics data (> 20M 
> records); for a small repository (< 1M statistics records) the overhead is 
> acceptable.
> Finally, after the introduction of the autocommit patch, the 
> StatisticsLoggingConsumer is no longer able to ensure data consistency 
> because the statistics data collected between two auto-commits are not 
> processed by the class.
> Our current idea is to discard the consumer approach in favour of 
> implementing a batch tool to periodically analyze the statistics data and 
> fix it as appropriate.
> This issue is a placeholder for such a feature and the discussion around it.

-- 
This message is automatically generated by JIRA.

        

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel
