[Nutch-dev] Joint Venture

2004-11-30 Thread shaun
As you may know, we specialize in assisting emerging companies in "Going Public". We are also able to assist with Private Placement preparation (for companies wanting to raise capital) and public company reporting. The President of our company is a very experienced securities and corporate law at

[Nutch-dev] Local Crawl / Stemming / Synonyms?

2004-11-30 Thread Yousef Ourabi
Hey, I have performed a local intranet crawl on my site. Now I know one of the pages contains the word cars, and so when I search for cars, I find it. However, when I search for car, I do not see it. How can I make the Nutch search more fuzzy? I know Lucene has fuzzy query matching so I assumin

[Nutch-dev] contributors: please document your plugin

2004-11-30 Thread Stefan Groschupf
Dear Plugin Contributors, after one year I count 13 Nutch plugins. That's great, since it means we got a bit more than one new plugin every month. :-) We have very easy-to-use plugins such as html-parser, but we also have complex plugins like clustering and ontology-based queries. I think it is tim

Re: [Nutch-dev] Experience with a big index

2004-11-30 Thread sg
> I would like to give my small contribution too. Great! So the only question is whether the boxes are still available and whether Doug gives an OK. Stefan

Re: [Nutch-dev] Experience with a big index

2004-11-30 Thread Antonio Gulli
[EMAIL PROTECTED] wrote: Antonio, Am I missing something? Mike mentioned that the index was highly customized for a named entity extraction task. The WebDB and the WebGraph contained in it are still a gold mine. As I remember, there was an offer by archive.org to use some boxes there. @Doug does t

Re: [Nutch-dev] Experience with a big index

2004-11-30 Thread sg
Antonio, Am I missing something? Mike mentioned that the index was highly customized for a named entity extraction task. As I remember, there was an offer by archive.org to use some boxes there. @Doug does this offer still exist? I would love to offer to set up Nutch on these boxes in case another pe

Re: [Nutch-dev] Experience with a big index

2004-11-30 Thread Andrzej Bialecki
Michael Cafarella wrote: Andrzej, I think you make an excellent point here: On Mon, 2004-11-29 at 07:48, Andrzej Bialecki wrote: That was also the general idea of my ramblings about modifying NDFS so More on that in the thread about a month ago, titled "NDFS, DistributedSearch - redundant dep

Re: [Nutch-dev] Experience with a big index

2004-11-30 Thread Antonio Gulli
Hi Mike, you are doing a great job and I am really impatient to read your paper as soon as it is published. A question: do you think that this big index can be made available to the research community? It is a gold mine. The largest dataset was made by Stanford in 2001 and it is outdated. It would