Re: Solr expert(s) needed

2009-01-09 Thread Lance Norskog
http://issues.apache.org/jira/browse/NUTCH-442

I haven't used Nutch. Can a Solr schema be reverse-engineered from the
Nutch-generated index? If so, you can just copy the Lucene index files out
of Nutch and run them under Solr.


Re: Solr expert(s) needed

2009-01-09 Thread Tony Wang
Thanks Lance! I have no idea whether the Nutch-generated index could be
converted to a Solr schema. I wonder what people are using NUTCH-442 for
(http://issues.apache.org/jira/browse/NUTCH-442).

So what crawler do you use to generate an index for Solr? Thanks a lot!!

-- 
Are you RCholic? www.RCholic.com
温 良 恭 俭 让 仁 义 礼 智 信


Re: Solr expert(s) needed

2009-01-09 Thread Lance Norskog
I don't know whether the Nutch format -> Solr schema idea would work either.
The NUTCH-442 system uses Solr for both indexing and searching, and uses
Nutch only for crawling.
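
To make that split concrete, here is a minimal sketch of the hand-off: the
crawler produces page records and Solr receives them as an XML <add>
message. It does not show how NUTCH-442 itself does this. The URL and the
field names (id, title) are assumptions and have to match your own
install and schema.xml.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a single-core Solr on the default example port.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(pages):
    """Build a Solr XML <add> message from crawled pages.

    `pages` is a list of dicts mapping field name -> string value.
    The field names are hypothetical and must exist in your schema.
    """
    add = ET.Element("add")
    for page in pages:
        doc = ET.SubElement(add, "doc")
        for name, value in page.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="utf-8")


def post_to_solr(body, url=SOLR_UPDATE_URL):
    """POST an update message to Solr and return the HTTP status code.

    A separate <commit/> message is still needed before the documents
    become searchable.
    """
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a running Solr, usage would look like
post_to_solr(build_add_message(pages)) followed by post_to_solr(b"<commit/>").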

At my last job we had a custom scripting system that crawled the front pages
of over 5000 sites. Each site had its own configured script. Yes, it was
complex. We also had custom crawlers for YouTube & MySpace and some other
sites that provided APIs, but in general it was all hand-coded.

I have used the RSS format of the DataImportHandler, and it works well but
has problems with detecting errors, etc. That is, it works well when it
works but does not fail gracefully in a useful way.
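
One way to work around the error-detection gap is to validate each feed
before Solr ever sees it. The sketch below is a standalone pre-flight
check, not part of Solr or of the import handler. The names FeedError and
check_rss are made up for illustration, and requiring a <link> on every
item is an assumption about what the index needs.

```python
import xml.etree.ElementTree as ET


class FeedError(Exception):
    """Raised when a feed is unusable, with a readable reason."""


def check_rss(xml_text):
    """Validate an RSS 2.0 document and return its item count.

    Raises FeedError with an explicit message instead of failing
    silently on malformed or empty feeds.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise FeedError(f"malformed XML: {exc}") from exc

    if root.tag != "rss":
        raise FeedError(f"expected <rss> root, got <{root.tag}>")

    items = root.findall("./channel/item")
    if not items:
        raise FeedError("feed has no <item> elements")

    # Collect every bad item so one report covers the whole feed.
    problems = []
    for index, item in enumerate(items):
        if not (item.findtext("link") or "").strip():
            problems.append(f"item {index} has no <link>")
    if problems:
        raise FeedError("; ".join(problems))

    return len(items)
```

Run it on each downloaded feed and only pass the feed on for import when
it returns without raising.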

Lance



Solr expert(s) needed

2009-01-07 Thread Tony Wang
I would like to build a search engine that indexes online videos from
websites such as Metacafe, YouTube, etc. I want to use Solr, which I
consider the best open-source option, as the indexing tool, with Nutch as
the web crawler. However, I am having difficulty integrating these two
open-source products, so I am seeking help and will compensate you for the
time spent. If you are interested, please send me your hourly rate and the
estimated hours needed to get this done.

My email is: ivytony [at] gmail dot com

Thanks!

Tony
