Re: [Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?
Hi Reinhard,

I am working on this too. You can use the metadataStorage facility that is already in place in the DSpace Solr statistics engine. You need to add some configuration to your dspace.cfg:

    event.dispatcher.default.consumers = solrstat, search, browse, eperson

    # consumer to maintain the solr statistics metadata storage and parents mapping
    event.consumer.solrstat.class = org.dspace.statistics.StatisticsLoggingConsumer
    event.consumer.solrstat.filters = Item+Delete|Modify|Modify_Metadata:Collection+Add|Remove

    solr.metadata.item.1 = contributor:dc.contributor.*
    solr.metadata.item.2 = dctype:dc.type.*

The first part of each value (before the ":") is the name of the Solr field where the metadata values will be stored. You also need to add the new fields to schema.xml (and, for each one, an additional _search field), i.e.:

    <field name="contributor" type="string" indexed="true" stored="true" required="false"/>
    <field name="contributor_search" type="string" indexed="true" stored="true" required="false"/>
    <field name="dctype" type="string" indexed="true" stored="true" required="false"/>
    <field name="dctype_search" type="string" indexed="true" stored="true" required="false"/>

Restart Tomcat and start collecting statistics. If you edit the metadata of an item that already has usage data collected, that usage data will be updated with the new fields. If you want to update all the Solr usage data documents, take a look at the StatisticsLoggingConsumer class in org.dspace.statistics (write a script that simulates a Modify_Metadata event on all items).

I know there are some issues in the metadataStorage; in particular, it doesn't work with unqualified metadata (e.g. dc.type) and it doesn't take the authority key into account if one is present. I will file a bug/improvement report in JIRA as soon as possible with the needed patch (I already have it applied to my source, I only need to extract and post it... sorry to be so lazy).

Andrea

On 04/08/2010 21:21, Reinhard Engels wrote:
> Hi all, We recently upgraded to 1.6 -- so far so good -- but still have a few questions about how the SOLR usage statistics work and can be extended. [...]

--
Dott. Andrea Bollini
Project Manager, IT Architect Systems Integrator
Sezione Servizi per le Biblioteche e l'Editoria Elettronica
CILEA, http://www.cilea.it

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
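As a rough illustration of the mapping described in the reply above (Solr field name before the colon, metadata pattern after it), here is a minimal Python sketch that reads the two `solr.metadata.item.N` lines and prints the matching schema.xml entries, including the `_search` twin for each field. It is not DSpace code: the helper names `parse_metadata_mappings` and `schema_fields` are made up for this example, and the field attributes are simply copied from the message.

```python
import re

# The two mapping lines quoted in the message; everything else is illustrative.
CONFIG = """
solr.metadata.item.1 = contributor:dc.contributor.*
solr.metadata.item.2 = dctype:dc.type.*
"""


def parse_metadata_mappings(text):
    """Return {solr_field: metadata_pattern} from solr.metadata.item.N lines."""
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        if not re.fullmatch(r"solr\.metadata\.item\.\d+", key.strip()):
            continue  # not one of the mapping properties
        # The part before the first ":" is the Solr field name.
        solr_field, _, pattern = value.strip().partition(":")
        mappings[solr_field] = pattern
    return mappings


def schema_fields(mappings):
    """Build one <field/> line per mapped field plus its _search twin."""
    template = ('<field name="{name}" type="string" indexed="true" '
                'stored="true" required="false"/>')
    lines = []
    for solr_field in mappings:
        lines.append(template.format(name=solr_field))
        lines.append(template.format(name=solr_field + "_search"))
    return lines


mappings = parse_metadata_mappings(CONFIG)
for field_line in schema_fields(mappings):
    print(field_line)
```

Generating the schema lines from the same mapping keeps dspace.cfg and schema.xml in step, which matters because a mapped field that is missing from the schema would presumably be rejected by Solr at index time.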
Re: [Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?
Hi Andrea,

Please don't apologize for solving my problems! I'm thrilled -- and let me know if there's any way I can help. I'm on vacation this week, but when I get back later this month I'll be diving back in hard and perhaps something will occur to me.

Best,
Reinhard

On Sun, Aug 15, 2010 at 9:47 AM, Andrea Bollini boll...@cilea.it wrote:
> Hi Reinhard, I too am working on this. You can use the metadataStorage facility already in place in the Solr statistics DSpace engine. [...]
[Dspace-tech] dspace SOLR stats aggregation by author, keyword, etc?
Hi all,

We recently upgraded to 1.6 -- so far so good -- but still have a few questions about how the SOLR usage statistics work and can be extended.

We'd like to generate bitstream download counts by author, keyword, and other fields. But it doesn't look like any of that information is currently included in the SOLR index (at least, it certainly isn't for the historical dspace.log data). I'm assuming it's going to require some custom coding to massage SOLR and generate these reports, but just wanted to double check before I start reinventing wheels.

Also, I'd initially assumed that the community and collection counts were aggregations of the number of item hits within each community or collection, but it looks like they're just hit counts to the community/collection home pages. Is that accurate?

Thanks in advance for any light you can shed on this!

Reinhard Engels
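For what a "download counts by author" report might look like once an author field is present on the usage records, here is a hedged Python sketch that only builds a standard Solr facet query URL. Several things are assumptions rather than facts from this thread: the base URL `http://localhost:8080/solr/statistics`, the use of `type:0` to select bitstream records, and the presence of a `contributor` field (the name proposed in the reply in this thread) on those records. The helper name `facet_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: bitstream usage records are marked with type 0 in the statistics core.
BITSTREAM_TYPE = 0


def facet_query_url(base_url, facet_field, object_type=BITSTREAM_TYPE, limit=20):
    """Build a Solr select URL that counts usage records per facet_field value."""
    params = [
        ("q", "type:%d" % object_type),  # restrict to one kind of usage record
        ("rows", 0),                     # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", limit),
        ("facet.mincount", 1),           # hide values with no hits
        ("wt", "json"),
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed core location; adjust to the local installation.
url = facet_query_url("http://localhost:8080/solr/statistics", "contributor")
print(url)
```

The sketch stops at building the URL so that it runs without a Solr server. Fetching it and reading the `facet_counts` section of the JSON response would give one count per contributor value, provided the field is actually stored on the download records.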