[ https://issues.apache.org/jira/browse/LABS-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Javier Puerto updated LABS-190:
-------------------------------
Attachment: NoRobotClient.diff
Well, I revised the code and you are right. If the link points to robots.txt,
the check should return true, because robots.txt itself is always allowed.
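For illustration, the robots.txt rule could be sketched roughly as below. This is not the code from the attached NoRobotClient.diff: the class RobotsTxtShortCircuit, the PathRules interface, and the method names are hypothetical, and the real rule evaluation is left out.

```java
import java.net.URI;

// Hypothetical sketch; not the actual NoRobotClient implementation.
class RobotsTxtShortCircuit {

    // Stand-in for whatever evaluates the parsed robots.txt rules (assumption).
    interface PathRules {
        boolean allows(String path);
    }

    // True when the URL's path is exactly "/robots.txt".
    // URI.create throws IllegalArgumentException for a malformed URL.
    static boolean pointsToRobotsTxt(String url) {
        String path = URI.create(url).getPath();
        return "/robots.txt".equals(path);
    }

    // robots.txt itself is always allowed; any other URL is
    // decided by the supplied rules.
    static boolean isUrlAllowed(String url, PathRules rules) {
        if (pointsToRobotsTxt(url)) {
            return true;
        }
        String path = URI.create(url).getPath();
        return rules.allows(path == null ? "" : path);
    }

    public static void main(String[] args) {
        // Rules that deny everything, to show the short-circuit.
        PathRules denyAll = new PathRules() {
            public boolean allows(String path) {
                return false;
            }
        };
        System.out.println(isUrlAllowed("http://example.com/robots.txt", denyAll));
        System.out.println(isUrlAllowed("http://example.com/private/page.html", denyAll));
    }
}
```

With rules that deny every path, only the robots.txt URL gets through, which is the behaviour described above.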
> concurrency error in NoRobotClient
> ----------------------------------
>
> Key: LABS-190
> URL: https://issues.apache.org/jira/browse/LABS-190
> Project: Labs
> Issue Type: Bug
> Components: Droids
> Environment: Ubuntu 8.04, JDK 1.6
> Reporter: Javier Puerto
> Priority: Blocker
> Attachments: NoRobotClient.diff
>
>
> Testing with droids, when the number of workers rises, the NoRobotClient throws
> an exception. I searched for the cause without success, but it seems to be
> caused by concurrent access to the base url.
> This is the error:
> pool-1-thread-3: Starting org.apache.droids.crawler.CrawlingWorker
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> at java.lang.String.substring(String.java:1938)
> at java.lang.String.substring(String.java:1905)
> at org.apache.http.norobots.NoRobotClient.isUrlAllowed(NoRobotClient.java:202)
> at org.apache.droids.protocol.http.Http.isAllowed(Http.java:87)
> at org.apache.droids.crawler.CrawlingWorker.execute(CrawlingWorker.java:49)
> at org.apache.droids.crawler.CrawlingWorker.execute(CrawlingWorker.java:1)
> at org.apache.droids.impl.MultiThreadedTaskMaster$WorkerRunner.run(MultiThreadedTaskMaster.java:186)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> pool-1-thread-1: Worker "76" has finished.
> pool-1-thread-2: Url is allowed
> java.lang.StringIndexOutOfBoundsException: String index out of range: -2
> at java.lang.String.substring(String.java:1938)
> at java.lang.String.substring(String.java:1905)
> at org.apache.http.norobots.NoRobotClient.isUrlAllowed(NoRobotClient.java:202)
> at org.apache.droids.protocol.http.Http.isAllowed(Http.java:87)
> at org.apache.droids.crawler.CrawlingWorker.execute(CrawlingWorker.java:49)
> at org.apache.droids.crawler.CrawlingWorker.execute(CrawlingWorker.java:1)
> at org.apache.droids.impl.MultiThreadedTaskMaster$WorkerRunner.run(MultiThreadedTaskMaster.java:186)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
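The traces above would be consistent with substring being called with a begin index taken from a base url that another worker changed in the meantime, so that the index no longer fits the url being checked. A minimal sketch of one way to avoid that, assuming the base url can be passed in per call instead of being read from a shared field, is shown below. BasePathSketch and relativePath are hypothetical names and do not come from NoRobotClient.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the actual NoRobotClient code.
class BasePathSketch {

    // Derives the path of url relative to baseUrl using only its arguments,
    // so concurrent callers cannot interfere with each other.
    static String relativePath(String baseUrl, String url) {
        // Guard: without this check, a base url longer than url would make
        // substring throw StringIndexOutOfBoundsException.
        if (!url.startsWith(baseUrl)) {
            throw new IllegalArgumentException(url + " is not under " + baseUrl);
        }
        String rest = url.substring(baseUrl.length());
        return rest.startsWith("/") ? rest : "/" + rest;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two hosts of different length, checked from several threads at once.
        final String[][] jobs = {
            {"http://example.com/", "http://example.com/a/b"},
            {"http://a-much-longer-host.example.org/", "http://a-much-longer-host.example.org/x"}
        };
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 100; i++) {
            final String[] job = jobs[i % jobs.length];
            pool.execute(new Runnable() {
                public void run() {
                    relativePath(job[0], job[1]);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Because relativePath keeps no state between calls, it needs no locking. If the client has to keep a base url field, giving each worker its own client instance or synchronizing access to that field would be alternatives.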