Re: [Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?

2010-08-15 Thread Andrea Bollini

Hi Reinhard,
I am working on this too.
You can use the metadataStorage facility already in place in the DSpace
Solr statistics engine.

You need to add some configuration to your dspace.cfg:
event.dispatcher.default.consumers = solrstat, search, browse, eperson

# consumer to maintain the solr statistics metadata storage and parents mapping
event.consumer.solrstat.class = org.dspace.statistics.StatisticsLoggingConsumer
event.consumer.solrstat.filters = Item+Delete|Modify|Modify_Metadata:Collection+Add|Remove


solr.metadata.item.1 = contributor:dc.contributor.*
solr.metadata.item.2 = dctype:dc.type.*

The first part (before the colon) is the name of the Solr field where
the metadata values will be stored; the second part is the metadata
pattern to store.
You need to add the new fields to schema.xml (and, for each one, an
additional _search field), i.e.:


<field name="contributor" type="string" indexed="true" stored="true" required="false"/>
<field name="contributor_search" type="string" indexed="true" stored="true" required="false"/>
<field name="dctype" type="string" indexed="true" stored="true" required="false"/>
<field name="dctype_search" type="string" indexed="true" stored="true" required="false"/>
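To keep dspace.cfg and schema.xml in step, the pairing described above can be generated rather than typed by hand. The following is only a minimal sketch, not part of DSpace: the helper names parse_metadata_mappings and schema_fields are hypothetical, and the script assumes every mapped field uses the same "string" field definition shown above.

```python
# Hypothetical helper (not part of DSpace): derive the schema.xml <field>
# declarations implied by the solr.metadata.item.N lines in dspace.cfg.

CFG = """\
solr.metadata.item.1 = contributor:dc.contributor.*
solr.metadata.item.2 = dctype:dc.type.*
"""

FIELD_TEMPLATE = (
    '<field name="{name}" type="string" indexed="true" '
    'stored="true" required="false"/>'
)


def parse_metadata_mappings(cfg_text):
    """Return {solr_field: metadata_pattern} for solr.metadata.item.N lines."""
    mappings = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep or not key.strip().startswith("solr.metadata.item."):
            continue  # not a metadata mapping line
        # The part before the first colon is the Solr field name.
        solr_field, colon, pattern = value.strip().partition(":")
        if colon:
            mappings[solr_field.strip()] = pattern.strip()
    return mappings


def schema_fields(mappings):
    """Return one <field> line per Solr field plus its _search companion."""
    lines = []
    for name in mappings:
        lines.append(FIELD_TEMPLATE.format(name=name))
        lines.append(FIELD_TEMPLATE.format(name=name + "_search"))
    return lines


if __name__ == "__main__":
    for field_line in schema_fields(parse_metadata_mappings(CFG)):
        print(field_line)
```

Running it against the two mappings above prints the four field declarations, which can then be pasted into schema.xml.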


Restart Tomcat and start collecting statistics.
If you edit the metadata of an item that already has usage data
collected, its usage records will be updated with the new fields.
If you want to update all the Solr usage data documents, take a look at
the StatisticsLoggingConsumer class in org.dspace.statistics (write a
script that simulates a Modify_Metadata event on all items).
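Once the new fields are populated, download counts by author could be read back with an ordinary Solr facet query. The sketch below is an illustration under stated assumptions rather than a DSpace API: the statistics core URL is a guess at a typical local install, the filter type:0 assumes bitstream views are stored with the DSpace bitstream type constant, and the sample response is invented data used only to show the flat value/count layout of Solr's JSON facet output.

```python
from urllib.parse import urlencode

# Assumed location of the statistics core; adjust for your installation.
SOLR_STATS_URL = "http://localhost:8080/solr/statistics/select"


def downloads_by_field_url(field, limit=10, base_url=SOLR_STATS_URL):
    """Build a facet query that counts bitstream hits per value of `field`."""
    params = [
        ("q", "*:*"),
        ("fq", "type:0"),          # assumption: 0 = bitstream in the stats core
        ("rows", "0"),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


def facet_counts(response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    print(downloads_by_field_url("contributor"))
    # Invented sample response, for illustration only.
    sample = {
        "facet_counts": {
            "facet_fields": {"contributor": ["Doe, Jane", 12, "Roe, Richard", 3]}
        }
    }
    for value, count in facet_counts(sample, "contributor"):
        print(value, count)
```

The same query with facet.field=dctype would give counts by type; only usage records written or refreshed after the fields were configured will carry these values.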


I know there are some issues with metadataStorage: for instance, it
doesn't work with unqualified metadata (e.g. dc.type), and it ignores
the authority key if present.
I will file a bug/improvement report in JIRA as soon as possible with
the needed patch (I already have it applied to my source; I only need to
extract and post it... sorry to be so lazy).

Andrea





On 04/08/2010 21:21, Reinhard Engels wrote:

Hi all,

We recently upgraded to 1.6 -- so far so good -- but still have a few
questions about how the SOLR usage statistics work and can be
extended.

We'd like to generate bitstream download counts by author, keyword,
and other fields.

But it doesn't look like any of that information is currently included
in the SOLR index (at least, it certainly isn't for the historical
dspace.log data).

I'm assuming it's going to require some custom coding to massage
SOLR/generate these reports, but just wanted to double check before I
start reinventing wheels.

Also, I'd initially assumed that the community and collection counts
were aggregations of the number of item hits within each community or
collection, but it looks like they're just hit counts to the
community/collections home pages. Is that accurate?

Thanks in advance for any light you can shed on this!

Reinhard Engels

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech





--
Dott. Andrea Bollini
Project Manager, IT Architect & Systems Integrator
Sezione Servizi per le Biblioteche e l'Editoria Elettronica
CILEA, http://www.cilea.it
tel. +39 06-59292853
cel. +39 348-8277525

---

Disclaimer: the content of this email is confidential and may be privileged, 
and it must not be disclosed or copied without the sender's consent. If you 
have received this message in error, please notify the sender and remove it 
from your system. The content of this email does not constitute legal advice, 
nor any responsibility is accepted for loss or damage incurred as a result of 
acting upon its contents or attachments.
The statements and opinions expressed in this email are those of the author and 
do not necessarily reflect those of the employer.



Re: [Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?

2010-08-15 Thread Reinhard Engels
Hi Andrea,

Please don't apologize for solving my problems! I'm thrilled -- and
let me know if there's any way I can help.

I'm on vacation this week, but when I get back later this month I'll
be diving back in hard and perhaps something will occur to me.

Best,

Reinhard




[Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?

2010-08-04 Thread Reinhard Engels
Hi all,

We recently upgraded to 1.6 -- so far so good -- but still have a few
questions about how the SOLR usage statistics work and can be
extended.

We'd like to generate bitstream download counts by author, keyword,
and other fields.

But it doesn't look like any of that information is currently included
in the SOLR index (at least, it certainly isn't for the historical
dspace.log data).

I'm assuming it's going to require some custom coding to massage
SOLR/generate these reports, but just wanted to double check before I
start reinventing wheels.

Also, I'd initially assumed that the community and collection counts
were aggregations of the number of item hits within each community or
collection, but it looks like they're just hit counts to the
community/collections home pages. Is that accurate?

Thanks in advance for any light you can shed on this!

Reinhard Engels
