I have a question about the contents of the crawldb folder in Nutch 1.6.

After running the updatedb step, the crawldb folder contains the entries
listed below. Is this the result I should expect?
If not, how can I fix it?

If I run "generate" on the crawldb below, will it generate the full URL
list? My concern is that the updatedb process did not complete fully, because
we have both the "624730206" and "current" folders at the same time.
Does Nutch take care of this?

I appreciate your help.


hduser@hadoopdev1:~$ hadoop dfs -ls 160milyonurls/crawldb
Warning: $HADOOP_HOME is deprecated.

Found 3 items
drwxr-xr-x   - hduser supergroup          0 2013-07-05 23:55 /user/hduser/160milyonurls/crawldb/624730206
drwxr-xr-x   - hduser supergroup          0 2013-07-08 18:59 /user/hduser/160milyonurls/crawldb/current
drwxr-xr-x   - hduser supergroup          0 2013-07-03 14:39 /user/hduser/160milyonurls/crawldb/old



