Semyon Semyonov created NUTCH-2524:
--------------------------------------

             Summary: Crawl Script, if file exists in HDFS doesn't work.
                 Key: NUTCH-2524
                 URL: https://issues.apache.org/jira/browse/NUTCH-2524
             Project: Nutch
          Issue Type: Bug
          Components: bin
            Reporter: Semyon Semyonov

In the crawl script you can find something like:

    if [[ -d "$CRAWL_PATH"/hostdb ]]; then
      echo "Processing sitemaps based on hosts in HostDB"
      __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
    fi

The test

    if [[ -d "$CRAWL_PATH"/hostdb ]];

checks the local filesystem, so it works only in local mode and does not work when the crawl directory is on HDFS.
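
One possible way to make the check work in both modes is sketched below. It is an illustration only, not the fix adopted by the project: the helper name dir_exists and the MODE variable are hypothetical and do not appear in the report. The sketch assumes the hadoop command is on the PATH in distributed mode and relies on "hadoop fs -test -d", which exits with status 0 when the given path is a directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the Nutch sources.
# MODE is an assumed variable: "local" (default) or "distributed".
MODE="${MODE:-local}"

dir_exists() {
  # $1: path of the directory to test
  if [ "$MODE" = "distributed" ]; then
    # Ask Hadoop: exit status 0 means the path is a directory on the
    # configured default filesystem (HDFS on a typical cluster).
    hadoop fs -test -d "$1"
  else
    # Local mode: the ordinary shell test is enough.
    [ -d "$1" ]
  fi
}

# Possible use in place of the original guard (left commented out here):
# if dir_exists "$CRAWL_PATH"/hostdb; then
#   echo "Processing sitemaps based on hosts in HostDB"
# fi
```

With a helper like this, the sitemap step would be guarded by the exit status of dir_exists rather than by a test that only ever sees the local filesystem.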