Re: Solr expert(s) needed
See http://issues.apache.org/jira/browse/NUTCH-442. I haven't used Nutch. Can the Nutch-generated index be reverse-engineered into a Solr schema? If so, you can just copy the Lucene index files out of Nutch and run them under Solr.
Re: Solr expert(s) needed
Thanks, Lance! I have no idea whether the Nutch-generated index could be converted to a Solr schema. I wonder what people are using NUTCH-442 for (http://issues.apache.org/jira/browse/NUTCH-442). So what crawler do you use to generate the index for Solr? Thanks a lot!

On Fri, Jan 9, 2009 at 8:04 PM, Lance Norskog goks...@gmail.com wrote:
> http://issues.apache.org/jira/browse/NUTCH-442 Haven't used Nutch. Can the Nutch-generated index be reverse-engineered into a Solr schema? In that case, you can just copy the Lucene index files away from Nutch and run them under Solr.

--
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信
Re: Solr expert(s) needed
I don't know about the Nutch-format-to-Solr-schema idea either. The NUTCH-442 system uses Solr for both indexing and searching, and uses Nutch only for crawling.

At my last job we had a custom scripting system that crawled the front page of over 5000 sites. Each site had its own configured script. Yes, it was complex. We also had custom crawlers for YouTube, MySpace, and some other sites that provided APIs, but in general it was all hand-coded.

I have used the RSS format of the DataImportHandler, and it works well but has problems with detecting errors etc. That is, it works well when it works but does not fail gracefully in a useful way.

Lance

2009/1/9 Tony Wang ivyt...@gmail.com
> Thanks Lance! I have no idea whether the Nutch-generated index could be converted to Solr schema. I wonder what people are using this NUTCH-442 for (http://issues.apache.org/jira/browse/NUTCH-442). So what crawler do you use to generate index for Solr? Thanks a lot!!
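To make the split described above concrete (a crawler gathers pages, Solr does the indexing and searching), here is a minimal sketch of the hand-off step: turning crawled records into a Solr XML `<add>` update message. This is not the NUTCH-442 patch itself. The function name `build_solr_add` and the field names `id`, `title`, and `url` are assumptions for illustration and would have to match whatever your own Solr schema defines.

```python
import xml.etree.ElementTree as ET


def build_solr_add(docs):
    """Build a Solr XML <add> message from a list of dicts.

    Each dict maps a field name to a value. The field names are
    hypothetical here and must exist in your Solr schema.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)  # ElementTree escapes &, <, > on output
    return ET.tostring(add, encoding="unicode")


# Example records, as a custom crawler might produce them (made-up data).
pages = [
    {
        "id": "http://example.com/video/1",
        "title": "Sample video",
        "url": "http://example.com/video/1",
    }
]
payload = build_solr_add(pages)
print(payload)
```

The resulting string would typically be sent with an HTTP POST to the Solr XML update handler, followed by a commit. The exact URL depends on how your Solr instance is deployed, so it is left out here.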
Solr expert(s) needed
I would like to build a search engine that indexes online videos from websites such as Metacafe, YouTube, etc. I want to use Solr, the best open-source option, as the indexing tool, with Nutch as the web crawler. However, I am having difficulty integrating these two open-source products, so I am seeking help, and I will compensate you for the time spent. If you are interested, please send me your hourly rate and the estimated hours needed to get this done. My email is: ivytony [at] gmail dot com

Thanks!
Tony