Hi Karol and all,

I have not had time to look into it. Most of the IP lists are no longer
free, so I wonder how we can clean up the statistics, for example by
replacing the lists with a new source, then flagging the bots and removing them.
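
To make the "flag the bots" step concrete, here is a minimal sketch in Python. It is not DSpace code. It assumes a replacement list that provides one IP address or CIDR range per line; that file format, the helper names, and the sample addresses (documentation ranges, not real crawlers) are all assumptions for illustration.

```python
import ipaddress

def load_networks(lines):
    """Parse one IP or CIDR range per line; skip blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False also accepts entries with host bits set
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_bot(ip, networks):
    """Return True if the address lies in any of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Hypothetical list content, standing in for a downloaded spider list
sample_list = ["# example spider list", "192.0.2.0/24", "198.51.100.7", ""]
networks = load_networks(sample_list)

# Hypothetical client IPs taken from usage records
hits = ["192.0.2.15", "203.0.113.9", "198.51.100.7"]
flagged = [ip for ip in hits if is_bot(ip, networks)]
```

How the flagged records are then marked or deleted in the statistics core is left out here, since that depends on the DSpace version in use.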

Sunny greetings

Claudia




On 21.01.2023 at 17:28, Karol wrote:
Hi Claudia,

I have exactly the same problem. Bumping this thread.

Best,

Karol

On Tuesday, 17 January 2023 at 15:19:39 UTC+1, Claudia Jürgen wrote:

    Hi all,

    I noticed two things about the IP lists used by stats-util.

    The lists are configured in:

    
    https://github.com/DSpace/DSpace/blob/main/dspace/config/modules/solr-statistics.cfg

    solr-statistics.spiderips.urls = http://iplists.com/google.txt, \
    http://iplists.com/inktomi.txt, \
    http://iplists.com/lycos.txt, \
    http://iplists.com/infoseek.txt, \
    http://iplists.com/altavista.txt, \
    http://iplists.com/excite.txt, \
    http://iplists.com/misc.txt


    a) The lists are most likely obsolete, so the statistics are very
    imprecise with regard to bot traffic.
    See https://iplists.com/
    The last revised dates on the site are from 2008 and 2014.
    Maybe we need another source for IP lists and a "cleanup".

    b) stats-util -u (which should, in theory, fetch updated files) does not
    work and throws an NPE:
    Getting: http://iplists.com/google.txt
    To: /opt/dspace/dspace63tu/config/spiders/iplists.com-google.txt
    - Error: null
    java.lang.NullPointerException
    at org.apache.tools.ant.taskdefs.Get.doGet(Get.java:221)
    at org.apache.tools.ant.taskdefs.Get.execute(Get.java:134)
    at org.dspace.statistics.util.StatisticsClient.updateSpiderFiles(StatisticsClient.java:152)
    at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:80)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
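
When trying out other list sources, it can help to see which URLs a given property value actually contains and which local files they would map to. The Python sketch below splits a `solr-statistics.spiderips.urls` value into individual URLs and derives a local file name following the pattern in the "To:" line of the log (host, a dash, then the remote file name). That naming rule is inferred from that single log line, so treat it as an assumption; the helper names are hypothetical and not part of DSpace.

```python
from urllib.parse import urlparse
import posixpath

def split_urls(value):
    """Split a comma-separated property value, dropping line continuations."""
    joined = value.replace("\\\n", " ")
    return [u.strip() for u in joined.split(",") if u.strip()]

def local_name(url):
    """Assumed naming rule: '<host>-<remote file name>'."""
    parsed = urlparse(url)
    return f"{parsed.netloc}-{posixpath.basename(parsed.path)}"

# Shortened property value with a backslash line continuation
value = "http://iplists.com/google.txt, \\\n    http://iplists.com/misc.txt"
urls = split_urls(value)
names = [local_name(u) for u in urls]
```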

    Sunny Greetings

    Claudia

    --
    Claudia Juergen

    Technische Universität Dortmund
    Universitätsbibliothek
    Bibliotheks-IT
    Vogelpothsweg 76
    44227 Dortmund

    Tel.: +49 231-755 40 43
    Fax: +49 231-755 40 32
    claudia...@tu-dortmund.de
    www.ub.tu-dortmund.de


    Important note: The information included in this e-mail is
    confidential. It is solely intended for the recipient. If you are
    not the intended recipient of this e-mail please contact the sender
    and delete this message. Thank you. Without prejudice of e-mail
    correspondence, our statements are only legally binding when they
    are made in the conventional written form (with personal signature)
    or when such documents are sent by fax.

--
All messages to this mailing list should adhere to the Code of Conduct:
https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
---
You received this message because you are subscribed to the Google
Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to dspace-tech+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/dspace-tech/7492efa0-a1ab-49ad-a821-1cf5bd652846n%40googlegroups.com.


