[Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-04 Thread Sean Carte
I *think* I've followed the instructions, but I can't seem to get the
new solr statistics functioning.

After installing with `ant -Doverwrite=true update`, I converted the
old dspace logs (/dspace/bin/dspace stats-log-converter -i
/dspace/log/old/dspace.log -o /dspace/log/dspace.log -m). That went
well, but the next step, importing the new logs, fails:

dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
/dspace/log/dspace.log -m
dspace.log.1000
Processing file: /dspace/log/dspace.log.1000
Processed 0 log lines
 done!
dspace.log.1
Processing file: /dspace/log/dspace.log.1
Error seeking country while seeking 3251471352
[lots of these]
Error seeking country while seeking 1255147537
Processed 309 log lines
 - 0 entries added to solr: 0%
 - 152 errors: 49.191%
 - 157 search engine activity skipped: 50.809%
About to commit data to solr...Exception: java.lang.String cannot be cast to org.apache.solr.common.util.NamedList
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.solr.common.util.NamedList
at org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:85)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:74)
at org.dspace.statistics.util.StatisticsImporter.load(StatisticsImporter.java:387)
at org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:493)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:194)

(Incidentally, the help files refer to 'log-importer' and
'log-converter', where it should be 'stats-log-importer' and
'stats-log-converter'.)

And this is what I have in the current log file:

2010-03-04 10:56:32,035 INFO  org.dspace.statistics.SolrLogger @
solr.spidersfile:null
2010-03-04 10:56:32,036 INFO  org.dspace.statistics.SolrLogger @
solr.log.server:http://10.4.36.18/solr/statistics
2010-03-04 10:56:32,036 INFO  org.dspace.statistics.SolrLogger @
solr.dbfile:/dspace/config/GeoLiteCity.dat
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Statistics/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Statistics/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Submission/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Submission/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue: context://i18n/aspects/EPerson/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/EPerson/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Administrative/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Administrative/i18n/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/ArtifactBrowser/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/ArtifactBrowser/i18n/
2010-03-04 10:56:35,879 ERROR org.dspace.statistics.SolrLogger @ Error executing query
org.apache.solr.client.solrj.SolrServerException: Error executing query
at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:96)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:109)
at org.dspace.statistics.SolrLogger.(SolrLogger.java:81)
at org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:472)
at sun.reflect.NativeM

Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-04 Thread Stuart Lewis
Hi Sean,

> I *think* I've followed the instructions, but I can't seem to get the
> new solr statistics functioning.
> 
> After installing with `ant -Doverwrite=true update`, I converted the
> old dspace logs (/dspace/bin/dspace stats-log-converter -i
> /dspace/log/old/dspace.log -o /dspace/log/dspace.log -m). That went

The log files are only getting converted into an intermediate file format ready 
for import into solr. [dspace]/log/dspace.log* files still get created in the 
same format, so it might be worth saving the converted files somewhere else, 
and as a different file name, so as not to get confused between the log4j 
'dspace.log' files, and these intermediate files that contain less information 
ready to be imported into solr. 

> dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
> /dspace/log/dspace.log -m

Could you send me a copy of one of your original dspace.log files, and the 
corresponding version which stats-log-converter creates? I can then look at 
them, and try importing them on my machine to see what might be going wrong.

> Almost everything else seems to work as advertised.

That's good to hear :)

Cheers,


Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/


--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-05 Thread Stuart Lewis
Hi Sean,

It is probably safer not to put the intermediate files in [dspace]/log/ or to 
name them dspace.log.*, as the 'traditional' DSpace statistics reports still 
look in that directory for dspace.log* files, and DSpace will still be writing 
its dspace.log* log files there. Best to keep them somewhere separate if possible.

I've tried importing the dspace.log file you sent, and it seemed to work fine:

gendiglt02:~ stuartlewis$ /dspace/bin/dspace stats-log-importer -i dspace.log
Processing file: dspace.log
Processed 309 log lines
 - 4 entries added to solr: 1.294%
 - 144 errors: 46.602%
 - 161 search engine activity skipped: 52.104%
About to commit data to solr... done!

(It lists 144 errors, but that is because you have handles that don't exist in 
my development copy of DSpace. Otherwise no errors were thrown.)

The error you mentioned in your first email 'Error seeking country while 
seeking 3251471352' is thrown when the code tries to find out what country and 
city the request was made from. It looks this up from an IP address. Could you 
re-run the script, and when it throws one of these errors, look at the file it 
is importing, find the line that has that value on, and copy that line into an 
email? It is strange that there is a 10 digit number instead of an IP address, 
so we need to work out where it is getting that from. 

Thanks,


Stuart 

From: Sean Carte [sean.ca...@gmail.com]
Sent: Saturday, 6 March 2010 1:32 a.m.
To: Stuart Lewis
Cc: dspace-tech
Subject: Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

On 4 March 2010 23:29, Stuart Lewis wrote:
>
> The log files are only getting converted into an intermediate file format
> ready for import into solr. [dspace]/log/dspace.log* files still get created
> in the same format, so it might be worth saving the converted files somewhere
> else, and as a different file name, so as not to get confused between the
> log4j 'dspace.log' files, and these intermediate files that contain less
> information ready to be imported into solr.

I did something like that: I copied all the original dspace.log* files
to another directory, then converted them with the intermediate files
ending up in my /dspace/log/ directory, which I then tried to import:

dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-converter -i
/dspace/log/old/dspace.log -o /dspace/log/dspace.log -m
dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
/dspace/log/dspace.log -m

> Could you send me a copy of one of your original dspace.log files, and the 
> corresponding version which stats-log-converter creates? I can then look at 
> them, and try importing them on my machine to see what might be going wrong.
>

Attached.

Sean
--
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/


Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-08 Thread Sean Carte
On 6 March 2010 09:48, Stuart Lewis wrote:
> It is probably safer not to put the intermediate files in [dspace]/log/ and 
> called dspace.log.* as the 'traditional' DSpace statistics reports still look 
> in that directory for dspace.log* files, and DSpace will still be writing its 
> dspace.log* log files there. Best keep them somewhere separate if possible.

I've done so now.

> I've tried importing the dspace.log file you sent, and it seemed to work fine:

Your parallel discussion with Dale Poulter provided some of the
answers I needed: I had to add the following to my server.xml file to
get solr running:

reloadable="true" cachingAllowed="false"
allowLinking="true"/>

and then change the solr.log.server URL in dspace.cfg to:

solr.log.server = http://127.0.0.1/solr/statistics

Thank you for that.
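
Before re-running the importer, it can help to confirm that the URL in solr.log.server actually answers queries. The following is a minimal sketch, not part of DSpace: the function names are made up for illustration, and it assumes the statistics core exposes Solr's standard /select handler with the JSON response writer enabled.

```python
import json
import urllib.parse
import urllib.request


def solr_select_url(base_url: str) -> str:
    """Build a match-all query URL that asks for zero rows as JSON."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return base_url.rstrip("/") + "/select?" + params


def solr_is_answering(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the core answers with a JSON body containing 'response'."""
    try:
        with urllib.request.urlopen(solr_select_url(base_url), timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, HTTP error, timeout, or a non-JSON body
        # (for example an HTML error page from the servlet container).
        return False
    return isinstance(body, dict) and "response" in body


if __name__ == "__main__":
    # URL taken from the dspace.cfg setting quoted above.
    print(solr_is_answering("http://127.0.0.1/solr/statistics"))
```

If this prints False, the importer is unlikely to be able to commit either, so the webapp deployment and the URL are the first things to check.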

> The error you mentioned in your first email 'Error seeking country while 
> seeking 3251471352' is thrown when the code tries to find out what country 
> and city the request was made from. It looks this up from an IP address. 
> Could you re-run the script, and when it throws one of these errors, look at 
> the file it is importing, find the line that has that value on, and copy that 
> line into an email? It is strange that there is a 10 digit number instead of 
> an IP address, so we need to work out where it is getting that from.

Yes, this makes no sense to me:

Processing file: /dspace/log/conv/dspace.log.92
Error seeking country while seeking 3438073063
Error seeking country while seeking 1074558283
Error seeking country while seeking 2205575261
Error seeking country while seeking 1074558283
Error seeking country while seeking 1074558283
Error seeking country while seeking 1074558220
Error seeking country while seeking 2107905785

cat /dspace/log/conv/dspace.log.92
20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231
20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75
20100225223443114,view_bitstream,2160,2010-02-25T22:34:43,anonymous,131.118.104.93
20100225223524308,view_item,276,2010-02-25T22:35:24,anonymous,67.195.115.33
20100225223531000,view_community,5,2010-02-25T22:35:31,anonymous,207.46.204.188
20100225223600288,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
20100225223600631,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
20100225223638679,view_item,196,2010-02-25T22:36:38,anonymous,64.12.117.12
20100225223647730,view_bitstream,4564,2010-02-25T22:36:47,anonymous,125.164.22.249
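
For anyone wanting to inspect these intermediate files, here is a small sketch of a parser for the six comma-separated fields shown above. The field names (uid, action, object_id, when, user, ip) are guesses based on the sample lines, not names taken from the DSpace source, and the sketch assumes no field contains a comma.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEvent:
    uid: str          # first column, e.g. 20100225223409022 (assumed unique id)
    action: str       # view_item, view_bitstream, view_community, ...
    object_id: int    # numeric id of the object that was viewed
    when: datetime    # timestamp of the request
    user: str         # e.g. anonymous
    ip: str           # client IP address


def parse_line(line: str) -> LogEvent:
    """Split one intermediate-format line into its six fields."""
    uid, action, object_id, when, user, ip = line.strip().split(",")
    return LogEvent(
        uid=uid,
        action=action,
        object_id=int(object_id),
        when=datetime.strptime(when, "%Y-%m-%dT%H:%M:%S"),
        user=user,
        ip=ip,
    )


if __name__ == "__main__":
    sample = "20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75"
    print(parse_line(sample))
```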

Sean
-- 
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/



Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-08 Thread Stuart Lewis
Hi Sean,

Something else to try:

 - Try using the '-v' (verbose) parameter, as this will show you what it is 
reading for each line, and what it thinks each line represents

It would be interesting to see the output which that gives.

Thanks,


Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/


On 8/03/2010, at 10:11 PM, Sean Carte wrote:

> On 6 March 2010 09:48, Stuart Lewis wrote:
>> It is probably safer not to put the intermediate files in [dspace]/log/ and 
>> called dspace.log.* as the 'traditional' DSpace statistics reports still 
>> look in that directory for dspace.log* files, and DSpace will still be 
>> writing its dspace.log* log files there. Best keep them somewhere separate 
>> if possible.
> 
> I've done so now.
> 
>> I've tried importing the dspace.log file you sent, and it seemed to work 
>> fine:
> 
> Your parallel discussion with Dale Poulter provided some of the
> answers I needed: I had to add the following to my server.xml file to
> get solr running:
> 
> reloadable="true" cachingAllowed="false"
>allowLinking="true"/>
> 
> and then change the solr.log.server URL in dspace.cfg to:
> 
> solr.log.server = http://127.0.0.1/solr/statistics
> 
> Thank you for that.
> 
>> The error you mentioned in your first email 'Error seeking country while 
>> seeking 3251471352' is thrown when the code tries to find out what country 
>> and city the request was made from. It looks this up from an IP address. 
>> Could you re-run the script, and when it throws one of these errors, look at 
>> the file it is importing, find the line that has that value on, and copy 
>> that line into an email? It is strange that there is a 10 digit number 
>> instead of an IP address, so we need to work out where it is getting that 
>> from.
> 
> Yes, this makes no sense to me:
> 
> Processing file: /dspace/log/conv/dspace.log.92
> Error seeking country while seeking 3438073063
> Error seeking country while seeking 1074558283
> Error seeking country while seeking 2205575261
> Error seeking country while seeking 1074558283
> Error seeking country while seeking 1074558283
> Error seeking country while seeking 1074558220
> Error seeking country while seeking 2107905785
> 
> cat /dspace/log/conv/dspace.log.92
> 20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231
> 20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75
> 20100225223443114,view_bitstream,2160,2010-02-25T22:34:43,anonymous,131.118.104.93
> 20100225223524308,view_item,276,2010-02-25T22:35:24,anonymous,67.195.115.33
> 20100225223531000,view_community,5,2010-02-25T22:35:31,anonymous,207.46.204.188
> 20100225223600288,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
> 20100225223600631,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
> 20100225223638679,view_item,196,2010-02-25T22:36:38,anonymous,64.12.117.12
> 20100225223647730,view_bitstream,4564,2010-02-25T22:36:47,anonymous,125.164.22.249
> 
> Sean
> -- 
> Sean Carte
> esAL Library Systems Manager
> +27 72 898 8775
> +27 31 373 2490
> fax: 0866741254
> http://esal.dut.ac.za/





Re: [Dspace-tech] DSpace 1.6 Upgrade & Statistics

2010-03-08 Thread Sean Carte
On 8 March 2010 23:16, Stuart Lewis wrote:
>
>
> Something else to try:
>
>  - Try using the 'v' (verbose) parameter, as this will show you what it is 
> reading for each line, and what it thinks this represents
>
> It would be interesting to see the output which that gives.

It looks like it genuinely can't resolve the country from those IP
addresses. That 10-digit number must just be some internal
representation of the IP address:


$ /dspace/bin/dspace stats-log-importer -v -i /dspace/log/conv/dspace.log.92
Writing to solr server at: http://127.0.0.1/solr/statistics
Processing file: /dspace/log/conv/dspace.log.92
Line:20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231
ip addr = 204.236.212.231, dns name = , country = , city =
Error seeking country while seeking 3438073063
Unknown country code: --
Line:20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = , country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223443114,view_bitstream,2160,2010-02-25T22:34:43,anonymous,131.118.104.93
ip addr = 131.118.104.93, dns name = , country = N/A, city = null
Error seeking country while seeking 2205575261
Unknown country code: --
Line:20100225223524308,view_item,276,2010-02-25T22:35:24,anonymous,67.195.115.33
ip addr = 67.195.115.33, dns name = , country = N/A, city = null
Error seeking country while seeking 1136882465
Unknown country code: --
Line:20100225223531000,view_community,5,2010-02-25T22:35:31,anonymous,207.46.204.188
ip addr = 207.46.204.188, dns name =
msnbot-207-46-204-188.search.msn.com., country = N/A, city = null,
IGNORE (search engine)
Line:20100225223600288,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = cache-mtc-af11.proxy.aol.com.,
country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223600631,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = cache-mtc-af11.proxy.aol.com.,
country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223638679,view_item,196,2010-02-25T22:36:38,anonymous,64.12.117.12
ip addr = 64.12.117.12, dns name = , country = N/A, city = null
Error seeking country while seeking 1074558220
Unknown country code: --
Line:20100225223647730,view_bitstream,4564,2010-02-25T22:36:47,anonymous,125.164.22.249
ip addr = 125.164.22.249, dns name =
249.subnet125-164-22.speedy.telkom.net.id., country = N/A, city = null
Error seeking country while seeking 2107905785
Unknown country code: --
Processed 9 log lines
 - 0 entries added to solr: 0%
 - 8 errors: 88.889%
 - 1 search engine activity skipped: 11.111%
About to commit data to solr... done!
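
The 10-digit values in the verbose output are consistent with the usual 32-bit integer form of an IPv4 address (first octet times 256^3, plus second octet times 256^2, and so on). The sketch below checks that against three of the lines above using Python's standard ipaddress module. It only shows the conversion; it does not explain why the country lookup fails, and the helper name ip_to_int is made up for illustration.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Return the 32-bit integer form of a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


if __name__ == "__main__":
    # Address / value pairs copied from the verbose importer output.
    for ip in ("204.236.212.231", "64.12.117.75", "125.164.22.249"):
        print(ip, ip_to_int(ip))
    # 204.236.212.231 3438073063
    # 64.12.117.75 1074558283
    # 125.164.22.249 2107905785
```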



$ dig -x 64.12.117.75

; <<>> DiG 9.4.2-P2 <<>> -x 64.12.117.75
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12586
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;75.117.12.64.in-addr.arpa. IN  PTR

;; ANSWER SECTION:
75.117.12.64.in-addr.arpa. 3409 IN  PTR cache-mtc-af11.proxy.aol.com.

;; Query time: 0 msec
;; SERVER: 10.1.4.101#53(10.1.4.101)
;; WHEN: Tue Mar  9 08:32:23 2010
;; MSG SIZE  rcvd: 85



Given that there doesn't seem to be anything I can do about this, it's
probably safe to leave it be.

Thanks for your help with this, Stuart.

Sean
-- 
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/
