[ https://issues.apache.org/jira/browse/NUTCH-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney closed NUTCH-749.
-------------------------------

    Resolution: Invalid

Please use the nutch-user mailing list to ask questions.

As for your problem, you need to add something like this to your nutch-site.xml:

<property>
  <name>http.robots.agents</name>
  <value>nutch-solr-integration,*</value>
</property>

Change nutch-solr-integration to your robot name, i.e. the value of your http.agent.name property, which has to be listed first.

> Fetching the url from crawldb
> -----------------------------
>
>                 Key: NUTCH-749
>                 URL: https://issues.apache.org/jira/browse/NUTCH-749
>             Project: Nutch
>          Issue Type: Bug
>         Environment: Nutch with solr integration
>            Reporter: salima abdulsalam
>
> Hi,
> I am new to using Nutch with Solr. I followed the link
> http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ for
> integration. I am getting an error while fetching the URL from the crawldb.
> I used the command below
> bin/nutch fetch $SEGMENT -noParsing
> and I set SEGMENT with
> export SEGMENT=crawl/segments/`ls -tr crawl/segments|tail -1`
> After running the command, I get this error:
> Fetcher: Your 'http.agent.name' value should be listed first in
> 'http.robots.agents' property.
> Fetcher: starting
> Fetcher: segment: crawl/segments/20090821062021
> Exception in thread "main" java.io.IOException: Illegal file pattern: Expecting set closure character or end of range, or } for glob 20090821062021 at 30
>         at org.apache.hadoop.fs.FileSystem$GlobFilter.error(FileSystem.java:1086)
>         at org.apache.hadoop.fs.FileSystem$GlobFilter.setRegex(FileSystem.java:1071)
>         at org.apache.hadoop.fs.FileSystem$GlobFilter.<init>(FileSystem.java:989)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:955)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:964)
>         at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:904)
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:868)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:159)
>         at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:39)
>         at org.apache.nutch.fetcher.Fetcher$InputFormat.getSplits(Fetcher.java:101)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:797)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:969)
>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1003)
> Can anyone help with this?
> Thanks,
> Salima
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
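
For reference, the nutch-site.xml change suggested in the closing comment might look roughly like this as a complete file. This is only a sketch: the enclosing <configuration> element and the explicit http.agent.name entry are assumptions based on the Fetcher warning quoted in the report, and nutch-solr-integration is the placeholder robot name from the comment, not a required value.

```xml
<?xml version="1.0"?>
<!-- Sketch of a nutch-site.xml; the robot name below is a placeholder. -->
<configuration>
  <!-- Assumed: the agent name the crawler identifies itself with. -->
  <property>
    <name>http.agent.name</name>
    <value>nutch-solr-integration</value>
  </property>
  <!-- From the closing comment: the agent name comes first, then the * wildcard. -->
  <property>
    <name>http.robots.agents</name>
    <value>nutch-solr-integration,*</value>
  </property>
</configuration>
```

The point of the pairing is that the first entry of http.robots.agents matches http.agent.name, which is what the Fetcher warning asks for.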