Jon -

Very cool use of VelocityResponseWriter!

Would you happen to have a sitemap.vm template to contribute? I realize the external base URL would need to be configurable, but that could trivially be passed in as a request parameter and used in the template.

        Erik

p.s. Is anyone else out there using VelocityResponseWriter? Sitemaps are a great use of it. I've also got a report of a big company in Brazil using it for e-mail generation of search results. I'm in the process of baking VrW into the main Solr example (it's basically there on trunk), and more examples are better.

On Mar 18, 2010, at 7:40 PM, Jon Baer wrote:

It's also possible to use the Velocity contrib response writer and page through the results w/ a template that emits the sitemap elements.

BTW generating a sitemap was a big reason for our switch from GSA to Solr because (for some reason) the map took way too long to generate (even simple requests).

If you page through w/ Solr (i.e. rows=100&wt=velocity&v.template=sitemap) it's fairly painless to build on cron.
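
A rough sketch of what the paging could look like from a cron-driven script. The base URL, the q=*:* query, and the function name are assumptions for illustration; only rows, wt=velocity, and v.template=sitemap come from the parameters above, and the total document count is assumed to be known already (e.g. from numFound on an earlier query).

```python
from urllib.parse import urlencode


def sitemap_page_urls(base_url, total_docs, rows=100, template="sitemap"):
    """Yield one Solr request URL per page of `rows` documents.

    base_url and the match-all query are assumptions, not part of
    the original message.
    """
    for start in range(0, total_docs, rows):
        params = {
            "q": "*:*",              # assumed: match every document
            "start": start,          # offset of the first doc on this page
            "rows": rows,
            "wt": "velocity",
            "v.template": template,  # name of the .vm template to render
        }
        yield base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical local Solr instance with 250 indexed documents.
    for url in sitemap_page_urls("http://localhost:8983/solr/select", 250):
        print(url)
```

Each URL would then be fetched in turn and the rendered fragments concatenated or written out as separate sitemap files.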

- Jon

On Mar 18, 2010, at 6:25 PM, Chris Hostetter wrote:


: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.

it's pretty easy to just paginate through all docs in solr, so you could do
that -- but I'd be really surprised if Nutch wasn't also logging all the
URLs it indexed, so you could just post-process that log to build the
sitemap as well.
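
Either way, once you have the list of URLs (from paging through Solr or from a Nutch log), turning it into a sitemap is a small step. A minimal sketch, assuming the URLs are already collected in a Python list; the function name is hypothetical, and the namespace is the one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' on serialization.
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs standing in for whatever Solr or the log yields.
    print(build_sitemap(["http://example.com/", "http://example.com/a?x=1&y=2"]))
```

Note that the protocol caps a single sitemap file at 50,000 URLs, so a large index would need several files plus a sitemap index.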



-Hoss


