Hi Karol and all,
I have not had time to look into it. Most of the IP lists are no longer
free, so I wonder how we can clean up the statistics, for example by
replacing the lists with a new source, then flagging the bots and removing them.
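
To make the "flag the bots" step concrete, here is a minimal sketch of matching hit IPs against a list. It is not DSpace's own code. It assumes a list file with one address or CIDR network per line and '#' comments, which may not match every source, and the function names and the hit dictionaries with an "ip" key are hypothetical, purely for illustration.

```python
import ipaddress


def parse_ip_list(text):
    """Parse one address or CIDR network per line; skip blanks and '#' comments."""
    networks = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        try:
            # A bare address becomes a single-host network (/32 for IPv4).
            networks.append(ipaddress.ip_network(line, strict=False))
        except ValueError:
            # Skip lines that are neither an address nor a network.
            continue
    return networks


def is_bot(ip, networks):
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def split_hits(hits, networks):
    """Split hits (hypothetical dicts with an 'ip' key) into (humans, bots)."""
    humans, bots = [], []
    for hit in hits:
        (bots if is_bot(hit["ip"], networks) else humans).append(hit)
    return humans, bots
```

Whatever new source we pick, its format would have to be checked against this assumption before the flagged hits are removed from the statistics.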
Sunny greetings
Claudia
On 21.01.2023 at 17:28, Karol wrote:
Hi Claudia,
I have exactly the same problem. Bumping this thread.
Best,
Karol
On Tuesday, 17 January 2023 at 15:19:39 UTC+1, Claudia Jürgen wrote:
Hi all,
I noticed two things about the IP lists used by stats-util.
The lists are configured in:
https://github.com/DSpace/DSpace/blob/main/dspace/config/modules/solr-statistics.cfg
solr-statistics.spiderips.urls = http://iplists.com/google.txt, \
    http://iplists.com/inktomi.txt, \
    http://iplists.com/lycos.txt, \
    http://iplists.com/infoseek.txt, \
    http://iplists.com/altavista.txt, \
    http://iplists.com/excite.txt, \
    http://iplists.com/misc.txt
a) The lists are most likely obsolete, so the statistics are very
imprecise with regard to bot traffic.
https://iplists.com/
The last-revised dates on the site are from 2008 and 2014.
Maybe we need another source for IP lists, and a "cleanup".
b) stats-util -u (which should, in theory, fetch updated files) does not
work and throws an NPE:
Getting: http://iplists.com/google.txt
To: /opt/dspace/dspace63tu/config/spiders/iplists.com-google.txt
- Error: null
java.lang.NullPointerException
    at org.apache.tools.ant.taskdefs.Get.doGet(Get.java:221)
    at org.apache.tools.ant.taskdefs.Get.execute(Get.java:134)
    at org.dspace.statistics.util.StatisticsClient.updateSpiderFiles(StatisticsClient.java:152)
    at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:80)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
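
As a possible stopgap for fetching the lists outside of stats-util, here is a minimal sketch of a downloader that reports each failing URL instead of aborting. It is not the DSpace implementation and says nothing about the cause of the NPE. The local naming (host, a dash, then the file name) is inferred only from the "Getting/To" lines above, and all function names are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def spider_file_name(url):
    """Derive a local name such as 'iplists.com-google.txt' from the URL."""
    parsed = urlparse(url)
    return f"{parsed.netloc}-{os.path.basename(parsed.path)}"


def http_fetch(url, timeout=30):
    """Default fetcher: return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def update_spider_files(urls, dest_dir, fetch=http_fetch):
    """Download each list; return (saved, failed) without aborting on errors."""
    saved, failed = [], []
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        target = os.path.join(dest_dir, spider_file_name(url))
        try:
            body = fetch(url)
            if not body:
                raise ValueError("empty response")
            # Write only after a successful fetch, so a failed download
            # never truncates an existing list file.
            with open(target, "wb") as out:
                out.write(body)
            saved.append(target)
        except (OSError, ValueError) as exc:
            # urllib's URLError and HTTPError are subclasses of OSError.
            failed.append((url, str(exc)))
    return saved, failed
```

The fetch parameter is injectable so the logic can be tried with a stub before it is pointed at a real source.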
Sunny Greetings
Claudia
--
Claudia Juergen
Technische Universität Dortmund
Universitätsbibliothek
Bibliotheks-IT
Vogelpothsweg 76
44227 Dortmund
Tel.: +49 231-755 40 43
Fax: +49 231-755 40 32
claudia...@tu-dortmund.de
www.ub.tu-dortmund.de
Important note: The information included in this e-mail is
confidential. It is solely intended for the recipient. If you are
not the intended recipient of this e-mail please contact the sender
and delete this message. Thank you. Without prejudice of e-mail
correspondence, our statements are only legally binding when they
are made in the conventional written form (with personal signature)
or when such documents are sent by fax.
--
All messages to this mailing list should adhere to the Code of Conduct:
https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
---
You received this message because you are subscribed to the Google
Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to dspace-tech+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/dspace-tech/7492efa0-a1ab-49ad-a821-1cf5bd652846n%40googlegroups.com.