Thanks! That solved the hosts file problem! You are great!


On Apr 26, 2011, at 11:02 AM, [email protected] wrote:

It seems you should move "www.example.com example.com" from line 3 to line 1, uncomment line 3, and comment out the other lines.
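
To illustrate why commenting and uncommenting lines matters, here is a minimal Python sketch that lists which names a hosts file actually maps. The sample contents are hypothetical (they are not the pastebin file), `active_hosts` is a helper name made up for this example, and the "first entry wins" rule is an assumption about typical resolver behaviour.

```python
def active_hosts(hosts_text):
    """Map each hostname to its IP, ignoring comments (first entry wins)."""
    mapping = {}
    for raw_line in hosts_text.splitlines():
        # Everything after '#' is a comment, so a commented-out line is inactive.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption: the first mapping seen for a name is the one used.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical sample, not the real file from this thread.
SAMPLE = """\
127.0.0.1   localhost
192.0.2.10  www.example.com example.com
# 192.0.2.99  www.example.com
"""

for name, ip in active_hosts(SAMPLE).items():
    print(f"{name} -> {ip}")
```

Running the same function over the text of the real /etc/hosts shows at a glance which entries are still in effect after an edit.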

Alex.

-----Original Message-----
From: Alex <[email protected]>
To: user <[email protected]>
Sent: Tue, Apr 26, 2011 4:18 am
Subject: Re: Hosts File & Nutch 1.0+


Just in case someone has more ideas, here is what my hosts file looks
like:

http://pastebin.com/wyV7wnqn

Any help is highly appreciated!

Alex


On Apr 25, 2011, at 10:13 PM, Alex wrote:

Dear Mark:

Thank you so much for the help!

I tried it, but it still gives me the same error.

According to the developer, it is either the server environment not
being able to search itself or a hosts file issue.


Any other ideas?

Thank you so much for your time!

Alex



On Apr 19, 2011, at 6:01 PM, Mark Achee wrote:

With nslookup already showing the correct IP address, it doesn't seem
like a hostname/DNS issue.  But I assume this is what the developer is
talking about:

At the end of your /etc/hosts file add

127.0.0.1  www.example.org

but replace www.example.org with your domain.  If you know what the
server's other IP address(es) is/are, you could try those also instead
of 127.0.0.1.  If that doesn't fix it, it's probably not really a
hostname/DNS issue.
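
One way to check the effect of such an entry is to ask the system resolver directly, since nslookup queries DNS servers and typically does not consult /etc/hosts. The following is a minimal Python sketch using the standard socket module; `resolve_locally` is a name invented for this example, not part of Nutch or the application.

```python
import socket


def resolve_locally(hostname):
    """Return the sorted, de-duplicated addresses this machine resolves hostname to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The name could not be resolved at all.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_locally("www.example.org") on the server (with your own domain substituted) before and after editing /etc/hosts should show whether the new entry is being picked up.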



-Mark


On Tue, Apr 19, 2011 at 6:47 PM, Alex <[email protected]> wrote:

I edited that so that it does not disclose the location of my
rootUrlDir.  The path is accurate.

I am going to find out what command is given to Nutch, but basically
the application developer has confirmed that the issue is the hosts
file or something on the server that cannot search itself.

Alex

On Apr 19, 2011, at 5:22 PM, Mark Achee wrote:

From your logs:

INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/


Looks like you didn't set the seed urls directory.  If that's not
enough info for you to fix it, send the full command you're running.
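
A quick sanity check on the seed directory can be sketched in Python as below. It assumes the usual layout of plain-text files with one URL per line and treats lines starting with "#" as comments (an assumption); `read_seed_urls` is a hypothetical helper, not a Nutch API.

```python
from pathlib import Path


def read_seed_urls(seed_dir):
    """Collect non-blank, non-comment lines from every file in seed_dir."""
    root = Path(seed_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"seed directory not found: {seed_dir}")
    urls = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Assumption: lines starting with '#' are comments, not URLs.
            if line and not line.startswith("#"):
                urls.append(line)
    return urls
```

An empty result for the directory reported as rootUrlDir would point to a missing or empty seed list rather than a hostname problem.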

-Mark



On Thu, Apr 14, 2011 at 10:57 PM, Alex <[email protected]> wrote:

Hi,

I am new to Nutch.  I have an application that uses Nutch to search.
I have configured the application so that Nutch can run.  However,
after a lot of troubleshooting I have been pointed to the fact that
there is something wrong with my hosts file.  My hostname is different
from my domain name, and that "seems" to make Nutch stop at depth 1.
Does anyone have any idea what the correct configuration of the hosts
file is so that Nutch runs properly?

My domain name resolves fine.  Please help me!

Here are the logs of the indexing:

Stopping at depth=1 - no more URLs to fetch.

INFO sitesearch.CrawlerUtil: indexHost : Starting an Site Search index on host www.mydomain.com
INFO sitesearch.CrawlerUtil: site search crawl started in: /opt/dotcms/dotCMS/assets/search_index/www.mydomain.com/1-XXX_temp/crawl-index
INFO sitesearch.CrawlerUtil: rootUrlDir = /path/to/directory/search_index/www.mydomain.com/url_folder
INFO sitesearch.CrawlerUtil: threads = 10
INFO sitesearch.CrawlerUtil: depth = 20
INFO sitesearch.CrawlerUtil: indexer=lucene

INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to fetch.
INFO sitesearch.CrawlerUtil: site search crawl finished: /directorypath/search_index/www.mydomain.com/1xxx/crawl-index
INFO sitesearch.CrawlerUtil: indexHost : Finished Site Search index on host www.mydomain.com
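
The mismatch in the log above (depth set to 20, crawl stopping at depth 1) can be spotted automatically. Below is a minimal Python sketch that pulls both numbers out of log text; the regular expressions are based only on the lines shown here, and `crawl_depths` is a name made up for this example.

```python
import re

# Patterns derived from the log lines quoted in this message.
DEPTH_SETTING = re.compile(r"\bdepth = (\d+)")
STOPPED_AT = re.compile(r"Stopping at depth=(\d+) - no more URLs to fetch")


def crawl_depths(log_text):
    """Return (configured_depth, stopped_depth); either may be None if absent."""
    configured = DEPTH_SETTING.search(log_text)
    stopped = STOPPED_AT.search(log_text)
    return (
        int(configured.group(1)) if configured else None,
        int(stopped.group(1)) if stopped else None,
    )


LOG = """\
INFO sitesearch.CrawlerUtil: threads = 10
INFO sitesearch.CrawlerUtil: depth = 20
INFO sitesearch.CrawlerUtil: Stopping at depth=1 - no more URLs to fetch.
"""

configured, stopped = crawl_depths(LOG)
if configured is not None and stopped is not None and stopped < configured:
    print(f"crawl ended early: depth {stopped} of {configured}")
```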






