[ https://jira.duraspace.org/browse/DS-892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20325#action_20325 ]
Bram Luyten (@mire) commented on DS-892:
----------------------------------------

At the risk of going out of scope on this issue, I was wondering what the desired behavior would be with regard to updates/changes of statistics after moves or deletes of items.

On the one hand, you could argue that statistics information from the past should be immutable, to ensure consistency with previous reports. E.g., if I'm a repository manager making monthly reports to my director about bitstream downloads for each of the collections, I don't want the current interface to contradict figures I reported in the past. In that respect, item pageviews and downloads should only start counting for the new collection or community they are added to after the move.

I can imagine there are use cases that totally contradict this. For example, if a collection represents a department, and the department is split into two new departments (e.g. two new collections), you do want to transfer those historical downloads and pageviews to the new collections.

> Performance issues in update enabling the StatisticsLoggingConsumer
> -------------------------------------------------------------------
>
> Key: DS-892
> URL: https://jira.duraspace.org/browse/DS-892
> Project: DSpace
> Issue Type: Bug
> Components: Solr
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1
> Reporter: Andrea Bollini
> Priority: Critical
>
> We have found that when the StatisticsLoggingConsumer is enabled to keep statistics
> data up-to-date after item changes (metadata edit or collection
> moving/mapping), item update operations become slow and the system
> becomes unusable.
> NOTE: the StatisticsLoggingConsumer is NOT enabled out of the box in
> dspace.cfg; this implies that your statistics data could be incongruous (item
> accesses assigned to incorrect communities/collections).
> We noticed problems when there is a large amount of statistics data (> 20M
> records); for small repositories (< 1M statistics records) the overhead is
> acceptable.
> Finally, after the introduction of the autocommit patch, the
> StatisticsLoggingConsumer is no longer able to ensure data consistency,
> because the statistics data collected between two auto-commits are not
> processed by the class.
> Our current idea is to discard the consumer approach in favour of implementing a
> batch tool to periodically analyze the statistics data and fix it as
> appropriate.
> This issue is a placeholder for such a feature and the discussion around it.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.duraspace.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel