Hi,

(a) Is it possible to crawl the URL of a zip file with Nutch and index it
in Solr? (Please see the example below.)

(b) Also, if the zip file contains PDF files, is it possible to have Nutch
crawl the zip file URL and also index the PDF files inside the archive?


E.g.
https://www.abc123.xxx/sites/docs/testing.zip
When I unzip the file at the above URL, I get the following:


def.pdf
lmn.pdf
reg.pdf
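
To make the expected result concrete, here is a minimal Python sketch (not Nutch code) of what I would like the crawler to do with the archive: open it and find the PDF entries inside. The helper names list_pdf_entries and build_sample_zip are hypothetical, and the sample archive is built in memory with placeholder contents rather than downloaded from the URL above.

```python
import io
import zipfile


def list_pdf_entries(zip_bytes: bytes) -> list[str]:
    """Return the names of the PDF members of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(".pdf")
        ]


def build_sample_zip() -> bytes:
    """Build a stand-in for testing.zip (assumption: placeholder contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("def.pdf", "lmn.pdf", "reg.pdf"):
            archive.writestr(name, b"%PDF-1.4 placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    # Each name printed here is a document I would like to see indexed in Solr.
    for entry in list_pdf_entries(build_sample_zip()):
        print(entry)
```

In other words, I am hoping for one indexed document per PDF entry, not just a single document for the zip file as a whole.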


Please advise.

Thanks!

AL
