Hi Daniel,
A newbie to Nutch here myself.
Answers to some of your questions.
a) 0.7.2 for a single site or a small number of sites - yes, that's the
better way to go for now.
b) Re: the Analyzer - not sure, I have not used anything custom.
c) Re: the webapp - it is completely independent of the rest of Nutch, which
is the crawler + indexer. The app is just meant for testing your setup
easily, IMHO.
To use your own UI, take a look at the contents of search.jsp - you
want to use NutchBean and if you are familiar with Lucene, the Hits class.
From there you can take off on your own. I am pretty sure most
people using Nutch beyond the simple examples are using their own UI.
You can also look at the OpenSearch servlet which serves up results
in OpenSearch XML format. So you can completely decouple the index from
the UI.
You can initiate search queries from a PHP, Python, Ruby or whatever
other UI you like, as long as you know how to transform the XML with the
style sheet of your choice.
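To make that a bit more concrete, here is a rough Python sketch of a
client that builds a query URL for the OpenSearch servlet and pulls the
hits out of the XML it returns. I have not run this against a live
install, so treat these as assumptions: the servlet path
(http://localhost:8080/opensearch), the "query" and "hitsPerPage"
parameter names, the OpenSearch namespace URI, and the SAMPLE document,
which is a hand-written stand-in for a real response. The function names
are made up for illustration. Check all of it against what your own
servlet actually serves.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumption: the servlet emits RSS 2.0 with the OpenSearch 1.0 extension
# namespace. Verify against a real response from your installation.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearchrss/1.0/"


def build_search_url(base_url, query, hits_per_page=10):
    """Build the query URL. The parameter names are assumptions."""
    params = urllib.parse.urlencode(
        {"query": query, "hitsPerPage": hits_per_page}
    )
    return base_url + "?" + params


def parse_results(xml_text):
    """Return (total_results, hits) from an OpenSearch RSS document."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    total = channel.findtext("{%s}totalResults" % OPENSEARCH_NS)
    hits = [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
        }
        for item in channel.findall("item")
    ]
    return (int(total) if total is not None else None), hits


# Hand-written stand-in for a servlet response (not real Nutch output).
SAMPLE = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearchrss/1.0/">
  <channel>
    <title>Search results: intranet</title>
    <opensearch:totalResults>2</opensearch:totalResults>
    <item>
      <title>Intranet home</title>
      <link>http://www.example.com/</link>
      <description>Welcome to the intranet</description>
    </item>
    <item>
      <title>Staff directory</title>
      <link>http://www.example.com/staff</link>
      <description>Find a colleague</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(build_search_url("http://localhost:8080/opensearch", "intranet"))
    total, hits = parse_results(SAMPLE)
    for hit in hits:
        print(hit["title"], "-", hit["link"])
```

In a real client you would fetch the URL (for example with
urllib.request.urlopen) and hand the response body to parse_results()
instead of SAMPLE. The UI then only ever sees plain titles, links and
descriptions, which is the decoupling I meant above.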
Hope that helps -- sorry I don't have more info on the Analyzer question.
Nitin Borwankar
http://tagschema.com
Daniel Lopez wrote:
>Hi there,
>
>I just started playing with Nutch and I have not yet decided whether it
>would be appropriate, hence my questions. I already have experience
>with Lucene inside my own projects, so I think I could tweak it a bit. I
>browsed the documentation I could find, the Wiki and the mail archives and
>then I thought about checking with the people already using it to see if my
>impression is correct. So, here we go:
>
>.- I'm planning on using it just in a single node to crawl/search on our
>different web servers, to provide a search facility inside our own pages,
>not for the whole web, and I read that the 0.7.x branch might be more
>appropriate as the 0.8.x branch seemed to be more focused on multinode
>sites and that might cause performance problems. Is that still true?
>Should I stick to the 0.7.x branch?
>
>.- I would like to be able to crawl/index/search the documents using
>specific analyzers, due to documents being LATIN-1. I already applied an
>appropriate analyzer in my programs but I'm not sure if Nutch allows
>changing it easily, through some property, or if I have to get into the
>code and do it myself. I have no problem with that but the less I deviate
>from a standard Nutch installation, the better, I guess. The same goes for
>the Indexer and the searching possibilities. I would like to use something
>other than a Boolean query. Can those things be tweaked through properties?
>
>.- Lastly, the search interface is not exactly what I want and I'm also not
>too keen on plain JSPs with the scripting inside. I thought I might as well
>replicate the functionality using a framework we use, based on XML so we
>have the UI and the rest separated... Are there any plans to develop the
>search UI further, or should I simply look at the JSPs and replicate, more
>or less, their behaviour. In that case, any special tips for that?
>
>.- Does anyone using Nutch in a similar scenario have any special
>tips/advice?
>
>Thanks for any insight you can provide, I do have plenty of experience with
>Java on the server side and Open Source, but I'd rather not duplicate work
>if I can help it and I'd like to stick as close to the "standard" Nutch as
>possible.
>
>Cheers!
>D.
>
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general