Hi Andrea,

Just a wee point about GA stats, with apologies if I am stating the obvious. You
can present data going as far back as you have been collecting it, not just
from the moment you enable the DSpace GA Stats XMLUI aspect.

Cheers, Robin.

Robin Taylor
Main Library
University of Edinburgh
________________________________________
From: Andrea Schweer <schw...@waikato.ac.nz>
Sent: 12 March 2015 20:30
To: Peter Dietz
Cc: DSpace Developers
Subject: Re: [Dspace-devel] We need to think a bit more about how we use the 
'statistics' Solr core

Hi Peter, all,

On 13/03/15 07:35, Peter Dietz wrote:
> ES is equally guilty of being a statistics data source, by storing
> the original/raw data. So, statistics is something that complicates
> DSpace's role in preserving assets, since stats are a value-add, and
> not a core repository function. But, since repo managers enjoy
> statistics, we can't not offer statistics. I would, however, like to
> offload the role of stats to a third party, such as Google Analytics.

I mentioned the new GA integration / pushing bitstream downloads to GA
functionality to my repository managers. Some of them are still quite
concerned since their repositories have stats going back 5+ years. They
were not happy with losing historical stats data (even keeping in mind
how inaccurate it probably is).

> Back to the relevant discussion. Both SOLR and ES prefer to be just
> indexes, something that you could rebuild if necessary. If you have
> all the dspace.log files you could potentially rebuild, but it's very
> laborious. I've considered having an alternative log file,
> logs/usage-stats.<date>.log, that was similar to the output of
> stats-log-exporter|convertor, and input of stats-log-importer. Thus,
> that would be the source of record, and the stats engines could
> rebuild from this. Currently more information is being stored in the
> stats engines than gets logged to dspace.log (useragent, hostname, ...).
>
> I've added the ability for SOLR to export its data to csv:
> https://github.com/DSpace/DSpace/commit/f57619d726c07535ce786a3f79e9c39d56fd9031
> So, potentially, one could run that regularly to have backup data
> points...

That's a good start, but your code only stores some of the data, similar
to what is in the dspace.log files (actually, less than that, since your
code discards information about the currently logged in user -- not that
this is necessarily bad since this isn't shown in the current stats
interface). Is that because this is the format expected by the legacy
stats loader? If so, perhaps both of those could be improved to not
discard information? Which still leaves the issue of someone wanting to
switch between ElasticSearch and Solr stats without data loss, if these
two store different information.
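
To make the "don't discard information" point concrete, here is a minimal
Python sketch of an engine-neutral usage record. Everything in it is an
assumption for illustration: the field names, the JSON-lines format and the
function names are hypothetical and are not existing DSpace code. It only
shows that a full record round-trips, while a projection onto a narrower
column set (as a lossy CSV export would do) drops fields such as the user
agent and the logged-in user.

```python
import csv
import io
import json

# Hypothetical field set: roughly what dspace.log carries, plus the extra
# fields mentioned in this thread (hostname, user agent, logged-in user).
FIELDS = ["time", "type", "id", "ip", "hostname", "user_agent", "user"]


def to_record(event):
    """Serialise one usage event as a JSON line, keeping every known field."""
    # Missing fields are written as null rather than dropped, so a later
    # import can tell "not recorded" apart from "empty".
    return json.dumps({f: event.get(f) for f in FIELDS}, sort_keys=True)


def from_record(line):
    """Parse one JSON line back into an event dict."""
    return json.loads(line)


def to_narrow_csv(events, columns=("time", "type", "id", "ip")):
    """Project events onto a narrower column set; this step is lossy."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(columns))
    writer.writeheader()
    for e in events:
        writer.writerow({c: e.get(c) for c in columns})
    return out.getvalue()
```

If both stats engines imported from and exported to one record format like
this, switching between them would not depend on what each engine happens to
store internally.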

cheers,
Andrea

--
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand


------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

