Hi,

(a) Is it possible to crawl the URL of a ZIP file using Nutch and index it in Solr? (Please see the example below.)

(b) Also, if the ZIP file at that URL contains PDF files, is it possible to use Nutch to crawl the ZIP file URL and also the PDF files inside the ZIP file?

E.g. https://www.abc123.xxx/sites/docs/testing.zip

When I unzip the file at the above URL, I get the following:

def.pdf
lmn.pdf
reg.pdf

Please advise.

Thanks!
AL