crawling html in asynchronous service?

2011-02-24 Thread sam lee
Hey, I am using a Scheduler to crawl HTML files. It runs every minute, and it needs to crawl /content/foo.html. If I use Apache Commons HttpClient to GET /content/foo.html, I need to set up authentication (Basic Auth?). However, since all the HTML pages that I want to crawl are served within Sling,
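
For reference, the Basic Auth setup mentioned in the question boils down to sending an Authorization header with the Base64-encoded "user:password" pair. The following is only a minimal sketch: it uses the JDK's own HttpURLConnection rather than Apache Commons HttpClient, and the class name, method names, and credentials are hypothetical placeholders, not anything taken from this thread.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; the names here are illustrative only.
final class BasicAuthSketch {

    // Builds the value of the Authorization header for HTTP Basic Auth:
    // "Basic " followed by Base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Prepares (but does not yet send) a GET request carrying the header.
    // The request goes out only when the caller reads the response.
    static HttpURLConnection openAuthenticated(URL url, String user, String password)
            throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Authorization", basicAuthHeader(user, password));
        return connection;
    }
}
```

A scheduled job could call something like openAuthenticated(new URL("http://localhost:8080/content/foo.html"), "admin", "admin") and read the body from getInputStream(); the host, port, and credentials in that call are assumptions for illustration.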

Re: crawling html in asynchronous service?

2011-02-24 Thread Bertrand Delacretaz
Hi,

On Thu, Feb 24, 2011 at 9:20 PM, sam lee skyn...@gmail.com wrote:
> ...since all html pages that I want to crawl are served within Sling, is there an API that resolves (or renders) paths like /content/foo.html, /content/bar.json ... etc?

You can use the SlingRequestProcessor to make