I've been testing Tom's nice S3/EC2 patch on a couple of EC2/S3 machines. The Injector fails to inject URLs, because fs.rename() at line 145 of CrawlDb.java deletes the whole content and only renames the parent folder from xxxxx to current. Basically, crawl_dir/crawldb/current is an empty folder after the rename.
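To make the expected behaviour concrete, here is a minimal sketch on a local filesystem using java.nio.file. It is only an analogy: it does not use Hadoop's FileSystem or the S3 code path, and the class name RenameCheck, the temporary folder name "12345" and the file name "part-00000" are made up for illustration. It renames a non-empty folder to current and then lists what is under current afterwards.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RenameCheck {

    // Renames a non-empty folder to "current" and returns the names
    // found under "current" afterwards. All names are hypothetical.
    static List<String> renameAndList() throws IOException {
        // Stand-in for crawl_dir/crawldb, created in the local temp directory.
        Path crawlDb = Files.createTempDirectory("crawldb");

        // Stand-in for the temporary output folder, with one file in it.
        Path tmp = Files.createDirectory(crawlDb.resolve("12345"));
        Files.write(tmp.resolve("part-00000"), new byte[] {1, 2, 3});

        // Local equivalent of the rename step: tmp -> current.
        Path current = crawlDb.resolve("current");
        Files.move(tmp, current);

        // List whatever ended up under current.
        try (Stream<Path> children = Files.list(current)) {
            return children
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> names = renameAndList();
        if (names.isEmpty()) {
            System.out.println("current is empty after rename");
        } else {
            System.out.println("current contains: " + names);
        }
    }
}
```

On a local disk the file moves along with its parent folder, which is what I would expect from the rename here as well. On S3 with the patch, the same kind of listing on current comes back empty.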
I have not gone through the Hadoop fs.rename code yet; I thought maybe somebody has solved this problem before.
Thanks,
Mike