[ https://jira.duraspace.org/browse/DS-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Dietz reassigned DS-1008:
-------------------------------

    Assignee: Peter Dietz

> Solr Statistics markRobotsByIP can mark too many IP addresses, including IP's not on the IP list
> ------------------------------------------------------------------------------------------------
>
>                 Key: DS-1008
>                 URL: https://jira.duraspace.org/browse/DS-1008
>             Project: DSpace
>          Issue Type: Bug
>          Components: Solr
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2
>            Reporter: Peter Dietz
>            Assignee: Peter Dietz
>         Attachments: DS-1008-fix-robot-overcounting.patch
>
>
> The function markRobotsByIP marks too many IP addresses as bots, potentially by a factor of 9.
> https://github.com/DSpace/DSpace/blob/5366d237afa07005ec485831c9bca1f1c992f01d/dspace-stats/src/main/java/org/dspace/statistics/SolrLogger.java#L473
>
>     /* query for ip, exclude results previously set as bots. */
>     processor.execute("ip:"+ip+ "* AND -isBot:true");
>
> Because the wildcard is appended directly to the entry, ip* also matches
> longer octets that merely begin with the same digits:
>     10.10.10*    expands to 10.10.[10, 100-109].*
>     10.10.10.10* expands to 10.10.10.[10, 100-109]
>
> My co-worker Brian Stamper suggested:
>
>     // Note: String.matches() must match the whole string, and backslashes
>     // have to be doubled inside a Java string literal.
>     if (ip.matches("[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+")) {
>         // Full 4-octet string, run as-is.
>         processor.execute("ip:" + ip + " AND -isBot:true");
>     } else if (ip.matches(".*\\.")) {
>         // Didn't match a full 4-octet address, but ends in a period; we assume
>         // it was something like #.#.#. or #.#. -- I don't expect this in the
>         // "stock" input from ip-list.com
>         processor.execute("ip:" + ip + "* AND -isBot:true");
>     } else if (ip.matches(".*[0-9]")) {
>         // Ends with a number and is not a full 4-octet address as in the
>         // first case, so we append .*
>         processor.execute("ip:" + ip + ".* AND -isBot:true");
>     } else {
>         log.error("Unexpected IP value: " + ip);
>     }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel
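
To illustrate the issue described above, the three cases in the suggested fix can be sketched as a small standalone Java class. This is only a sketch under stated assumptions: the class name RobotIpQuery and the methods buildQuery and trailingWildcardMatches are hypothetical and are not part of DSpace or the attached patch, the patterns are somewhat stricter than the ones quoted in the issue, and the prefix check only approximates how a trailing wildcard behaves on a plain string field.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of DSpace. */
final class RobotIpQuery {
    // Four dot-separated groups of digits, e.g. "10.10.10.10".
    private static final Pattern FULL = Pattern.compile("[0-9]+(\\.[0-9]+){3}");
    // One to three groups, each followed by a dot, e.g. "10.10." or "10.10.10.".
    private static final Pattern PARTIAL_WITH_DOT = Pattern.compile("([0-9]+\\.){1,3}");
    // One to three groups with no trailing dot, e.g. "10.10.10".
    private static final Pattern PARTIAL_NO_DOT = Pattern.compile("[0-9]+(\\.[0-9]+){0,2}");

    /** Builds the query string for one IP list entry, or empty if the entry is unexpected. */
    static Optional<String> buildQuery(String ip) {
        String term;
        if (FULL.matcher(ip).matches()) {
            term = ip;            // complete address: exact match, no wildcard
        } else if (PARTIAL_WITH_DOT.matcher(ip).matches()) {
            term = ip + "*";      // already ends at an octet boundary
        } else if (PARTIAL_NO_DOT.matcher(ip).matches()) {
            term = ip + ".*";     // close the octet before adding the wildcard
        } else {
            return Optional.empty();
        }
        return Optional.of("ip:" + term + " AND -isBot:true");
    }

    /** Approximates a trailing-wildcard match as a plain prefix test (an assumption). */
    static boolean trailingWildcardMatches(String pattern, String address) {
        if (pattern.endsWith("*")) {
            return address.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return address.equals(pattern);
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("10.10.10").orElse("unexpected"));
        // prints: ip:10.10.10.* AND -isBot:true
        System.out.println(trailingWildcardMatches("10.10.10*", "10.10.100.5"));  // prints: true
        System.out.println(trailingWildcardMatches("10.10.10.*", "10.10.100.5")); // prints: false
    }
}
```

The second and third lines of output show the over-matching: with the wildcard appended directly to 10.10.10, an address in 10.10.100.x is caught, while closing the octet with a period first keeps the match inside 10.10.10.x.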