Hi Zhou,
it sounds like DMOZ is not as bad an option as you
said. Why don't you use it as a starting point for
searching images?

But please keep in mind that Nutch does not crawl
images by default. I would suggest you do the following:

1. start using Nutch with default values and classical text search
2. grow your index and become comfortable with Nutch
3. look at plug-ins for dealing with content (rtf, pdf, ...)
4. build your own plug-in for dealing with images (to extract size, ...)
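
For the image step, here is a minimal sketch of the size extraction itself.
It uses only the standard javax.imageio API. The class and method names are
made up for illustration, and hooking this into a Nutch plug-in is not shown,
so treat it as a starting point rather than the way Nutch does it:

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Nutch: reads width and height from raw image bytes.
final class ImageSizeSketch {

    // Returns {width, height}; throws IOException if the bytes cannot be decoded.
    static int[] readSize(byte[] content) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(content));
        if (image == null) {
            // ImageIO.read returns null when no registered reader understands the data
            throw new IOException("unsupported or corrupt image data");
        }
        return new int[] { image.getWidth(), image.getHeight() };
    }

    public static void main(String[] args) throws IOException {
        // Build a small PNG in memory so the example needs no external file
        BufferedImage sample = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(sample, "png", out);

        int[] size = readSize(out.toByteArray());
        System.out.println(size[0] + "x" + size[1]);
    }
}
```

This decodes the whole image just to get its size, which is fine for trying
things out but probably too slow for a large crawl. Reading only the image
header would be the next thing to look at.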

Kind regards,
Olaf



On Tue, 1 Mar 2005 20:03:18 +0800, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> I just want to finish an image search engine. My team has six graduates,
> and my job is to collect the resources, such as images, text, etc.
> 
> Do you have some suggestions about this ?
> 
> Thanks anyway!
> 
> 
> On Tue, 1 Mar 2005 08:05:06 +0100, Olaf Thiele <[EMAIL PROTECTED]> wrote:
> > Hi,
> > if you want to build an index with 100 million pages, I recommend
> > Thompson's rule for first-time telescope makers:
> > It is faster to make a four-inch mirror then a six-inch mirror than to
> > make a six-inch mirror (http://www.javaranch.com/granny.jsp).
> >
> > For more information on a big index, read the following thread:
> > http://sourceforge.net/mailarchive/message.php?msg_id=10163623
> >
> > And for the second question, if you are not using DMOZ data,
> > you will need to find your own. WHAT do you want to index?
> > There must be a reason for you to build a search engine.
> >
> > Kind regards,
> > Olaf
> >
> > On Tue, 1 Mar 2005 09:37:17 +0800, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> > > If I do not use the DMOZ data, how could I complete the search engine?
> > >
> > >
> > > On Mon, 28 Feb 2005 18:14:33 -0600, Ivaylo Georgiev <[EMAIL PROTECTED]> wrote:
> > > >
> > > >
> > > > I just ran the tutorial and read about hardware requirements for running
> > > > Nutch.
> > > >
> > > > I have some questions. What does "search nodes" mean?
> > > >
> > > > Assume I want to index 100 million pages and I have 5 machines to use as
> > > > search nodes. How must these search nodes be built, and what part of
> > > > Nutch must reside on these machines?
> > > >
> > > >
> > > >
> > > > Thank you,
> > > >
> > > > Ivo
> > >
> > > --
> > > ---Letter From your friend Blue at HUST CGCL---
> > >
> > > -------------------------------------------------------
> > > SF email is sponsored by - The IT Product Guide
> > > Read honest & candid reviews on hundreds of IT Products from real users.
> > > Discover which products truly live up to the hype. Start reading now.
> > > http://ads.osdn.com/?ad_ide95&alloc_id396&opclick
> > > _______________________________________________
> > > Nutch-general mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/nutch-general
> > >
> >
> > --
> >
> > <SimpleHuman gender="male">
> >   <Physical name="Olaf Thiele" />
> >   <Virtual adress="http://www.olafthiele.de" />
> > </SimpleHuman>
> >
> 
> --
> ---Letter From your friend Blue at HUST CGCL---
> 
> 


-- 

<SimpleHuman gender="male">
   <Physical name="Olaf Thiele" />
   <Virtual adress="http://www.olafthiele.de" />
</SimpleHuman>
