[ https://jira.duraspace.org/browse/DS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=24505#comment-24505 ]
Bram Luyten (@mire) commented on DS-919:
----------------------------------------
Two new observations:
- Not all bots follow every link on the pages they hit, so some may never
follow the bot trap link.
- Almost no human users hide their user agent, but many bots do. A simple way
to catch more bots would be to flag all hits that have no user agent.
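
The second observation could be sketched roughly as follows. This is only an
illustration, not DSpace code: the hit records are assumed to be plain
dictionaries with a hypothetical "user_agent" key, and the function names are
made up for this example.

```python
from typing import Iterable, List, Mapping, Optional, Tuple

Hit = Mapping[str, str]


def has_no_user_agent(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent value is missing or blank."""
    return user_agent is None or user_agent.strip() == ""


def split_hits(hits: Iterable[Hit]) -> Tuple[List[Hit], List[Hit]]:
    """Separate hits into (counted, flagged).

    A hit is flagged when it carries no user agent. The "user_agent" key
    is an assumption made for this sketch.
    """
    counted: List[Hit] = []
    flagged: List[Hit] = []
    for hit in hits:
        if has_no_user_agent(hit.get("user_agent")):
            flagged.append(hit)
        else:
            counted.append(hit)
    return counted, flagged
```

Flagging rather than silently dropping these hits keeps them available for
review, in case a legitimate client turns out to send no user agent.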
> SOLR Statistics: Better detection & avoidance of abusive traffic (including a
> bot trap)
> ----------------------------------------------------------------------------------------
>
> Key: DS-919
> URL: https://jira.duraspace.org/browse/DS-919
> Project: DSpace
> Issue Type: New Feature
> Components: Solr
> Reporter: Bram Luyten (@mire)
>
> The current implementation of bot traffic filtering relies on IP lists. Even
> though using hostnames (as suggested here:
> https://jira.duraspace.org/browse/DS-790 ) could improve the situation, there
> are still forms of abusive traffic we might want to detect and exclude from
> stats.
> The most obvious example here would be repeated hits or downloads coming from
> the same unique source. Another example could be traffic from spiders that
> aren't included in the lists. One way to catch these would be to create a bot
> trap: a link hidden behind a single pixel that a human user would never click
> but that bots might follow. The agents that reach the resource at this link
> could be listed and dynamically removed from the hit/download counts.
> Some related links:
> http://www.affiliatebeginnersguide.com/sitelogs/bots_hunt.html
> http://www.elxsy.com/2009/06/how-to-identify-and-ban-bots-spiders-crawlers/
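
The two ideas in the description above (a bot trap, and repeated hits from the
same source) could be sketched like this. It is a minimal illustration under
stated assumptions, not the DSpace implementation: hits are assumed to be
dictionaries with hypothetical "ip" and "path" keys, and the trap path and the
threshold are invented values.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

Hit = Dict[str, str]

TRAP_PATH = "/bot-trap"      # hypothetical URL of the hidden trap link
MAX_HITS_PER_RESOURCE = 100  # hypothetical threshold, to be tuned per site


def trapped_sources(hits: Iterable[Hit]) -> Set[str]:
    """IPs that requested the trap link at least once."""
    return {h["ip"] for h in hits if h["path"] == TRAP_PATH}


def repeat_offenders(hits: Iterable[Hit], max_hits: int) -> Set[str]:
    """IPs that requested any single resource more than max_hits times."""
    counts = Counter((h["ip"], h["path"]) for h in hits)
    return {ip for (ip, _path), n in counts.items() if n > max_hits}


def filter_hits(hits: List[Hit],
                max_hits: int = MAX_HITS_PER_RESOURCE) -> List[Hit]:
    """Drop every hit from a source that was trapped or hit too often."""
    excluded = trapped_sources(hits) | repeat_offenders(hits, max_hits)
    return [h for h in hits if h["ip"] not in excluded]
```

Note that this removes all hits from an excluded source, including those made
before it followed the trap link, which matches the idea of removing such
agents from the counts after the fact.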
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel