SOLR Statistics: Better detection & avoidance of abusive traffic (including a 
bot trap) 
----------------------------------------------------------------------------------------

                 Key: DS-919
                 URL: https://jira.duraspace.org/browse/DS-919
             Project: DSpace
          Issue Type: New Feature
          Components: Solr
            Reporter: Bram Luyten (@mire)


The current implementation of bot traffic filtering relies on IP lists. Even 
though using hostnames (as suggested in DS-790: 
https://jira.duraspace.org/browse/DS-790) could improve the situation, there 
are still forms of abusive traffic we might want to detect and exclude from 
the statistics.

The most obvious example is repeated hits or downloads coming from the same 
source. Another example is traffic from spiders that aren't included in the 
lists. One way to catch such spiders would be a bot trap: a link hidden 
behind a single pixel that a human user would never click, but that bots 
might follow. The agents that request the resource at this link could be 
listed and dynamically removed from the hit/download counts.
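
As a rough illustration of both ideas, here is a minimal sketch in plain 
Python. It is not DSpace or Solr code. The trap path "/bot-trap", the Hit 
record, the function names and the repeat threshold are all hypothetical, 
chosen only to show the filtering logic. It also assumes that a source 
exceeding the threshold for an item is dropped from that item's count 
entirely, which is only one possible policy.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical target of the hidden one-pixel link (an assumption, not a DSpace URL).
TRAP_PATH = "/bot-trap"


@dataclass(frozen=True)
class Hit:
    """One logged request (hypothetical record layout)."""
    ip: str
    user_agent: str
    path: str


def trapped_agents(hits):
    """Return the (ip, user_agent) pairs that requested the trap URL."""
    return {(h.ip, h.user_agent) for h in hits if h.path == TRAP_PATH}


def repeat_offenders(hits, max_hits_per_path):
    """Return the (ip, path) pairs seen more than max_hits_per_path times."""
    counts = Counter((h.ip, h.path) for h in hits if h.path != TRAP_PATH)
    return {key for key, n in counts.items() if n > max_hits_per_path}


def filtered_counts(hits, max_hits_per_path=3):
    """Count hits per path, skipping trapped agents and repeat offenders."""
    trapped = trapped_agents(hits)
    repeats = repeat_offenders(hits, max_hits_per_path)
    counts = Counter()
    for h in hits:
        if h.path == TRAP_PATH:
            continue  # the trap itself is never a real view
        if (h.ip, h.user_agent) in trapped:
            continue  # agent followed the hidden link at some point
        if (h.ip, h.path) in repeats:
            continue  # same source hammering the same resource
        counts[h.path] += 1
    return counts
```

Because the trapped set is computed over the whole log before counting, hits 
made by an agent before it reached the trap are excluded too, which matches 
the "dynamically removed" behaviour described above.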

Some related links:
http://www.affiliatebeginnersguide.com/sitelogs/bots_hunt.html
http://www.elxsy.com/2009/06/how-to-identify-and-ban-bots-spiders-crawlers/
