We've been crawling with Nutch and deleting the crawldb between crawls. I believe I've finally got my recrawl script working, but I was disappointed to see that the modified time of every page in my crawldb is Jan 1, 1970 (the Unix epoch). Since I control both the crawler and the web server in our setup, is there a setting on either side that would let Nutch pick up the correct modified time for each page? I want to reduce the number of fetches as much as possible.
Thanks!

