Re: [Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-08 Thread Sean Carte
On 6 March 2010 09:48, Stuart Lewis s.le...@auckland.ac.nz wrote:
 It is probably safer not to put the intermediate files in [dspace]/log/ or to 
 name them dspace.log.*, as the 'traditional' DSpace statistics reports still look 
 in that directory for dspace.log* files, and DSpace will still be writing its 
 dspace.log* log files there. Best keep them somewhere separate if possible.

I've done so now.

 I've tried importing the dspace.log file you sent, and it seemed to work fine:

Your parallel discussion with Dale Poulter provided some of the
answers I needed: I had to add the following to my server.xml file to
get solr running:

<Context path="/solr" docBase="/dspace/webapps/solr" debug="0"
reloadable="true" cachingAllowed="false"
allowLinking="true"/>

and then change the solr.log.server URL in dspace.cfg to:

solr.log.server = http://127.0.0.1/solr/statistics

Thank you for that.

 The error you mentioned in your first email 'Error seeking country while 
 seeking 3251471352' is thrown when the code tries to find out what country 
 and city the request was made from. It looks this up from an IP address. 
 Could you re-run the script, and when it throws one of these errors, look at 
 the file it is importing, find the line that has that value on, and copy that 
 line into an email? It is strange that there is a 10 digit number instead of 
 an IP address, so we need to work out where it is getting that from.

Yes, this makes no sense to me:

Processing file: /dspace/log/conv/dspace.log.92
Error seeking country while seeking 3438073063
Error seeking country while seeking 1074558283
Error seeking country while seeking 2205575261
Error seeking country while seeking 1074558283
Error seeking country while seeking 1074558283
Error seeking country while seeking 1074558220
Error seeking country while seeking 2107905785

cat /dspace/log/conv/dspace.log.92
20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231
20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75
20100225223443114,view_bitstream,2160,2010-02-25T22:34:43,anonymous,131.118.104.93
20100225223524308,view_item,276,2010-02-25T22:35:24,anonymous,67.195.115.33
20100225223531000,view_community,5,2010-02-25T22:35:31,anonymous,207.46.204.188
20100225223600288,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
20100225223600631,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
20100225223638679,view_item,196,2010-02-25T22:36:38,anonymous,64.12.117.12
20100225223647730,view_bitstream,4564,2010-02-25T22:36:47,anonymous,125.164.22.249
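
Each line of the converted file above appears to hold six comma-separated values. A minimal Python sketch of splitting one such line follows. The field names (entry_id, action, object_id, timestamp, user, ip) are assumptions made for illustration and are not taken from DSpace, and parse_converted_line is a hypothetical helper, not part of stats-log-importer.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ConvertedLogLine:
    # Field names are illustrative guesses; the thread does not name them.
    entry_id: str
    action: str
    object_id: int
    timestamp: datetime
    user: str
    ip: str


def parse_converted_line(line: str) -> ConvertedLogLine:
    """Split one comma-separated line of the converted log into typed fields."""
    parts = line.strip().split(",")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    entry_id, action, object_id, timestamp, user, ip = parts
    return ConvertedLogLine(
        entry_id=entry_id,
        action=action,
        object_id=int(object_id),
        timestamp=datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S"),
        user=user,
        ip=ip,
    )


# Sample line copied from the converted file shown in this message.
sample = "20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231"
record = parse_converted_line(sample)
print(record.action, record.object_id, record.ip)
```

Parsing a line this way makes it easy to confirm that the last field really is a dotted IP address and not the 10-digit number reported in the error.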

Sean
-- 
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-08 Thread Stuart Lewis
Hi Sean,

Something else to try:

 - Try using the 'v' (verbose) parameter, as this will show you what it is 
reading for each line, and what it thinks this represents

It would be interesting to see the output which that gives.

Thanks,


Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/




Re: [Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-08 Thread Sean Carte
On 8 March 2010 23:16, Stuart Lewis s.le...@auckland.ac.nz wrote:


 Something else to try:

  - Try using the 'v' (verbose) parameter, as this will show you what it is 
 reading for each line, and what it thinks this represents

 It would be interesting to see the output which that gives.

It looks like it genuinely can't resolve the country from those IP
addresses. That 10-digit number must just be some internal
representation of the IP address:


$ /dspace/bin/dspace stats-log-importer -v -i /dspace/log/conv/dspace.log.92
Writing to solr server at: http://127.0.0.1/solr/statistics
Processing file: /dspace/log/conv/dspace.log.92
Line:20100225223219087,view_bitstream,4278,2010-02-25T22:32:19,anonymous,204.236.212.231
ip addr = 204.236.212.231, dns name = , country = , city =
Error seeking country while seeking 3438073063
Unknown country code: --
Line:20100225223409022,view_item,196,2010-02-25T22:34:09,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = , country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223443114,view_bitstream,2160,2010-02-25T22:34:43,anonymous,131.118.104.93
ip addr = 131.118.104.93, dns name = , country = N/A, city = null
Error seeking country while seeking 2205575261
Unknown country code: --
Line:20100225223524308,view_item,276,2010-02-25T22:35:24,anonymous,67.195.115.33
ip addr = 67.195.115.33, dns name = , country = N/A, city = null
Error seeking country while seeking 1136882465
Unknown country code: --
Line:20100225223531000,view_community,5,2010-02-25T22:35:31,anonymous,207.46.204.188
ip addr = 207.46.204.188, dns name =
msnbot-207-46-204-188.search.msn.com., country = N/A, city = null,
IGNORE (search engine)
Line:20100225223600288,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = cache-mtc-af11.proxy.aol.com.,
country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223600631,view_item,196,2010-02-25T22:36:00,anonymous,64.12.117.75
ip addr = 64.12.117.75, dns name = cache-mtc-af11.proxy.aol.com.,
country = N/A, city = null
Error seeking country while seeking 1074558283
Unknown country code: --
Line:20100225223638679,view_item,196,2010-02-25T22:36:38,anonymous,64.12.117.12
ip addr = 64.12.117.12, dns name = , country = N/A, city = null
Error seeking country while seeking 1074558220
Unknown country code: --
Line:20100225223647730,view_bitstream,4564,2010-02-25T22:36:47,anonymous,125.164.22.249
ip addr = 125.164.22.249, dns name =
249.subnet125-164-22.speedy.telkom.net.id., country = N/A, city = null
Error seeking country while seeking 2107905785
Unknown country code: --
Processed 9 log lines
 - 0 entries added to solr: 0%
 - 8 errors: 88.889%
 - 1 search engine activity skipped: 11.111%
About to commit data to solr... done!
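
The 10-digit values in the error lines match the dotted IPv4 addresses read as single 32-bit integers, with the first octet most significant. The Python sketch below checks that against the address/number pairs printed in the verbose output above. It only illustrates the arithmetic; how DSpace or its GeoIP lookup actually derives the number, and why the lookup fails, is not shown in this thread. The helper names are hypothetical.

```python
import ipaddress


def ipv4_to_int(ip: str) -> int:
    """Return the 32-bit integer form of a dotted IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ipv4(value: int) -> str:
    """Return the dotted form of a 32-bit integer IPv4 address."""
    return str(ipaddress.IPv4Address(value))


# Pairs copied from the verbose importer output in this message:
# the "ip addr" field and the number in the following error line.
observed = {
    "204.236.212.231": 3438073063,
    "64.12.117.75": 1074558283,
    "131.118.104.93": 2205575261,
    "67.195.115.33": 1136882465,
    "64.12.117.12": 1074558220,
    "125.164.22.249": 2107905785,
}

for ip, number in observed.items():
    print(ip, ipv4_to_int(ip), ipv4_to_int(ip) == number)
```

Going the other way, int_to_ipv4 can be used to turn a number from an error line back into the address it came from.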



$ dig -x 64.12.117.75

; <<>> DiG 9.4.2-P2 <<>> -x 64.12.117.75
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12586
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;75.117.12.64.in-addr.arpa. IN  PTR

;; ANSWER SECTION:
75.117.12.64.in-addr.arpa. 3409 IN  PTR cache-mtc-af11.proxy.aol.com.

;; Query time: 0 msec
;; SERVER: 10.1.4.101#53(10.1.4.101)
;; WHEN: Tue Mar  9 08:32:23 2010
;; MSG SIZE  rcvd: 85



Given that there doesn't seem to be anything I can do about this, it's
probably safe to leave it be.

Thanks for your help with this Stuart.

Sean
-- 
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/



Re: [Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-05 Thread Stuart Lewis
Hi Sean,

It is probably safer not to put the intermediate files in [dspace]/log/ or to 
name them dspace.log.*, as the 'traditional' DSpace statistics reports still look 
in that directory for dspace.log* files, and DSpace will still be writing its 
dspace.log* log files there. Best keep them somewhere separate if possible.

I've tried importing the dspace.log file you sent, and it seemed to work fine:

gendiglt02:~ stuartlewis$ /dspace/bin/dspace stats-log-importer -i dspace.log
Processing file: dspace.log
Processed 309 log lines
 - 4 entries added to solr: 1.294%
 - 144 errors: 46.602%
 - 161 search engine activity skipped: 52.104%
About to commit data to solr... done!

(it lists 144 errors, but that is because you have handles that don't exist in 
my development copy of DSpace - otherwise no errors were thrown)

The error you mentioned in your first email 'Error seeking country while 
seeking 3251471352' is thrown when the code tries to find out what country and 
city the request was made from. It looks this up from an IP address. Could you 
re-run the script, and when it throws one of these errors, look at the file it 
is importing, find the line that has that value on, and copy that line into an 
email? It is strange that there is a 10 digit number instead of an IP address, 
so we need to work out where it is getting that from. 

Thanks,


Stuart 

From: Sean Carte [sean.ca...@gmail.com]
Sent: Saturday, 6 March 2010 1:32 a.m.
To: Stuart Lewis
Cc: dspace-tech
Subject: Re: [Dspace-tech] DSpace 1.6 Upgrade  Statistics

On 4 March 2010 23:29, Stuart Lewis s.le...@auckland.ac.nz wrote:

 The log files are only getting converted into an intermediate file format 
 ready for import into solr. [dspace]/log/dspace.log* files still get created 
 in the same format, so it might be worth saving the converted files somewhere 
 else, and as a different file name, so as not to get confused between the 
 log4j 'dspace.log' files, and these intermediate files that contain less 
 information ready to be imported into solr.

I did something like that: I copied all the original dspace.log* files
to another directory, then converted them with the intermediate files
ending up in my /dspace/log/ directory, which I then tried to import:

dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-converter -i
/dspace/log/old/dspace.log -o /dspace/log/dspace.log -m
dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
/dspace/log/dspace.log -m

 Could you send me a copy of one of your original dspace.log files, and the 
 corresponding version which stats-log-converter creates? I can then look at 
 them, and try importing them on my machine to see what might be going wrong.


Attached.

Sean
--
Sean Carte
esAL Library Systems Manager
+27 72 898 8775
+27 31 373 2490
fax: 0866741254
http://esal.dut.ac.za/


[Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-04 Thread Sean Carte
I *think* I've followed the instructions, but I can't seem to get the
new solr statistics functioning.

After installing with `ant -Doverwrite=true update`, I converted the
old dspace logs (/dspace/bin/dspace stats-log-converter -i
/dspace/log/old/dspace.log -o /dspace/log/dspace.log -m). That went
well, but the next step, importing the new logs, fails:

dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
/dspace/log/dspace.log -m
dspace.log.1000
Processing file: /dspace/log/dspace.log.1000
Processed 0 log lines
 done!
dspace.log.1
Processing file: /dspace/log/dspace.log.1
Error seeking country while seeking 3251471352
[lots of these]
Error seeking country while seeking 1255147537
Processed 309 log lines
 - 0 entries added to solr: 0%
 - 152 errors: 49.191%
 - 157 search engine activity skipped: 50.809%
About to commit data to solr...Exception: java.lang.String cannot be
cast to org.apache.solr.common.util.NamedList
java.lang.ClassCastException: java.lang.String cannot be cast to
org.apache.solr.common.util.NamedList
at 
org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89)
at 
org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
at 
org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:85)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:74)
at 
org.dspace.statistics.util.StatisticsImporter.load(StatisticsImporter.java:387)
at 
org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:493)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:194)

(Incidentally, the help files refer to 'log-importer' and
'log-converter', where it should be 'stats-log-importer' and
'stats-log-converter'.)

And this is what I have in the current log file:

2010-03-04 10:56:32,035 INFO  org.dspace.statistics.SolrLogger @
solr.spidersfile:null
2010-03-04 10:56:32,036 INFO  org.dspace.statistics.SolrLogger @
solr.log.server:http://10.4.36.18/solr/statistics
2010-03-04 10:56:32,036 INFO  org.dspace.statistics.SolrLogger @
solr.dbfile:/dspace/config/GeoLiteCity.dat
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Statistics/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Statistics/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Submission/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Submission/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue: context://i18n/aspects/EPerson/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/EPerson/i18n/
2010-03-04 10:56:35,475 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/Administrative/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/Administrative/i18n/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
context://i18n/aspects/ArtifactBrowser/
2010-03-04 10:56:35,476 INFO
org.dspace.app.xmlui.cocoon.DSpaceI18NTransformer @ Adding i18n
location path for 'default' catalogue:
resource://aspects/ArtifactBrowser/i18n/
2010-03-04 10:56:35,879 ERROR org.dspace.statistics.SolrLogger @ Error
executing query
org.apache.solr.client.solrj.SolrServerException: Error executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:96)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:109)
at org.dspace.statistics.SolrLogger.<clinit>(SolrLogger.java:81)
at 
org.dspace.statistics.util.StatisticsImporter.main(StatisticsImporter.java:472)
at 

Re: [Dspace-tech] DSpace 1.6 Upgrade Statistics

2010-03-04 Thread Stuart Lewis
Hi Sean,

 I *think* I've followed the instructions, but I can't seem to get the
 new solr statistics functioning.
 
 After installing with `ant -Doverwrite=true update`, I converted the
 old dspace logs (/dspace/bin/dspace stats-log-converter -i
 /dspace/log/old/dspace.log -o /dspace/log/dspace.log -m). That went

The log files are only getting converted into an intermediate file format ready 
for import into solr. [dspace]/log/dspace.log* files still get created in the 
same format, so it might be worth saving the converted files somewhere else, 
and as a different file name, so as not to get confused between the log4j 
'dspace.log' files, and these intermediate files that contain less information 
ready to be imported into solr. 
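
One way to follow this advice is to write the converted files to a directory and file name that cannot be mistaken for the log4j logs. The Python sketch below only builds the two command lines and prints them; it does not run anything. The subcommand names and the -i, -o, -m and -v flags are the ones used in this thread, the input path is the one Sean used, and the output location /dspace/stats-import/converted.log is a hypothetical choice made for illustration. Whether -m and -v can be combined is not confirmed in the thread, so the usage below keeps them apart.

```python
import shlex

DSPACE_BIN = "/dspace/bin/dspace"  # launcher path as used in this thread


def converter_command(original_log: str, converted_log: str) -> list[str]:
    """Build the stats-log-converter call with the flags seen in the thread."""
    return [DSPACE_BIN, "stats-log-converter",
            "-i", original_log, "-o", converted_log, "-m"]


def importer_command(converted_log: str, multiple: bool = True,
                     verbose: bool = False) -> list[str]:
    """Build the stats-log-importer call; -m and -v are optional here."""
    cmd = [DSPACE_BIN, "stats-log-importer", "-i", converted_log]
    if multiple:
        cmd.append("-m")
    if verbose:
        cmd.append("-v")
    return cmd


original = "/dspace/log/old/dspace.log"             # path from the thread
converted = "/dspace/stats-import/converted.log"    # hypothetical location

print(shlex.join(converter_command(original, converted)))
print(shlex.join(importer_command(converted)))
# Verbose run on a single file, without -m:
print(shlex.join(importer_command(converted, multiple=False, verbose=True)))
```

Keeping the converted files outside [dspace]/log/ means neither the traditional statistics reports nor DSpace's own logging will touch them.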

 dsp...@esal-lr:~$ /dspace/bin/dspace stats-log-importer -i
 /dspace/log/dspace.log -m

Could you send me a copy of one of your original dspace.log files, and the 
corresponding version which stats-log-converter creates? I can then look at 
them, and try importing them on my machine to see what might be going wrong.

 Almost everything else seems to work as advertised.

That's good to hear :)

Cheers,


Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: 64 9 373-7599 x81928
http://www.library.auckland.ac.nz/

