Hi Zhou,

It sounds like DMOZ is not as bad an option as you said. Why don't you use it as a starting point for searching images?
But please keep in mind that Nutch does not crawl images by default. I would suggest the following:

1. Start using Nutch with the default values and classical text search.
2. Grow your index and become comfortable with Nutch.
3. Look at the plug-ins for dealing with content (rtf, pdf, ...).
4. Build your own plug-in for dealing with images (to extract size, ...).

Kind regards,
Olaf

On Tue, 1 Mar 2005 20:03:18 +0800, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> I just want to finish an image search engine. My team has six graduate
> students, and my job is to collect the resources, such as images, text, etc.
>
> Do you have any suggestions about this?
>
> Thanks anyway!
>
> On Tue, 1 Mar 2005 08:05:06 +0100, Olaf Thiele <[EMAIL PROTECTED]> wrote:
> > Hi,
> > if you want to build an index with 100 million pages, I recommend
> > Thompson's rule for first-time telescope makers:
> > It is faster to make a four-inch mirror then a six-inch mirror than to
> > make a six-inch mirror (http://www.javaranch.com/granny.jsp).
> >
> > For more information on a big index, read the following thread:
> > http://sourceforge.net/mailarchive/message.php?msg_id=10163623
> >
> > And for the second question, if you are not using DMOZ data,
> > you will need to find your own. WHAT do you want to index?
> > There must be a reason for you to build a search engine.
> >
> > Kind regards,
> > Olaf
> >
> > On Tue, 1 Mar 2005 09:37:17 +0800, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> > > If I do not use the DMOZ data, how could I complete the search engine?
> > >
> > > On Mon, 28 Feb 2005 18:14:33 -0600, Ivaylo Georgiev <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I just ran the tutorial and read about hardware requirements for running
> > > > Nutch.
> > > >
> > > > I have some questions. What does "search nodes" mean?
> > > >
> > > > Assume I want to index 100 million pages and I have 5 machines to use as
> > > > search nodes. How must these search nodes be built, and what part of
> > > > Nutch must reside on these machines?
> > > >
> > > > Thank you,
> > > >
> > > > Ivo
> > >
> > > --
> > > ---Letter From your friend Blue at HUST CGCL---
> > >
> > > -------------------------------------------------------
> > > SF email is sponsored by - The IT Product Guide
> > > Read honest & candid reviews on hundreds of IT Products from real users.
> > > Discover which products truly live up to the hype. Start reading now.
> > > http://ads.osdn.com/?ad_ide95&alloc_id396&opclick
> > > _______________________________________________
> > > Nutch-general mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/nutch-general
> >
> > --
> > <SimpleHuman gender="male">
> > <Physical name="Olaf Thiele" />
> > <Virtual adress="http://www.olafthiele.de" />
> > </SimpleHuman>
>
> --
> ---Letter From your friend Blue at HUST CGCL---

--
<SimpleHuman gender="male">
<Physical name="Olaf Thiele" />
<Virtual adress="http://www.olafthiele.de" />
</SimpleHuman>

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_ide95&alloc_id396&op=click
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
