Jon -
Very cool use of VelocityResponseWriter!
Would you happen to have a sitemap.vm template to contribute? I
realize the external base URL would need to be configurable, but that
could trivially be passed as a request parameter and used in the
template.
Erik
p.s. Anyone else using VelocityResponseWriter out there? Sitemaps are
a great use of it. I've also got a report of a big company in
Brazil using it for e-mail generation of search results. I'm in the
process of baking VrW into the main Solr example (it's basically there
on trunk), and more examples are better.
On Mar 18, 2010, at 7:40 PM, Jon Baer wrote:
It's also possible to use the Velocity contrib response writer and
page through the results w/ a template that emits the sitemap elements.
BTW, generating a sitemap was a big reason for our switch from
GSA to Solr: for some reason the map took way too long to generate
with GSA (even for simple requests).
If you page through w/ Solr (i.e.
rows=100&wt=velocity&v.template=sitemap) it's fairly painless to
build on cron.
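
A minimal Python sketch of the paging idea above, for illustration only. It builds one request URL per page, which a cron job could then fetch and concatenate. The base URL, the `q=*:*` match-all query, and the function name `sitemap_page_urls` are assumptions, not something from this thread; only `rows`, `wt=velocity`, and `v.template=sitemap` come from the example above.

```python
from urllib.parse import urlencode


def sitemap_page_urls(base_url, total_docs, rows=100, template="sitemap"):
    """Yield one Solr request URL per page of `rows` documents.

    `base_url` is a hypothetical select handler URL, e.g.
    "http://localhost:8983/solr/select" (an assumption).
    """
    for start in range(0, total_docs, rows):
        params = {
            "q": "*:*",              # assumed match-all query
            "start": start,          # offset of the first doc on this page
            "rows": rows,            # page size
            "wt": "velocity",        # use the Velocity response writer
            "v.template": template,  # template name, as in the example above
        }
        yield base_url + "?" + urlencode(params)


if __name__ == "__main__":
    for url in sitemap_page_urls("http://localhost:8983/solr/select", 250):
        print(url)
```

With 250 documents and rows=100 this yields three URLs, with start=0, 100, and 200.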
- Jon
On Mar 18, 2010, at 6:25 PM, Chris Hostetter wrote:
: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.
it's pretty easy to just paginate through all docs in solr, so you could
do that -- but I'd be really surprised if Nutch wasn't also logging all the
URLs it indexed, so you could just post-process that log to build the
sitemap as well.
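
Either way, the last step is turning a list of URLs into sitemap XML. A rough Python sketch, assuming the URLs have already been collected (from paged Solr results or from a crawl log); the function name `build_sitemap` is hypothetical, and only the required `<loc>` element is emitted:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["http://example.com/", "http://example.com/a?x=1&y=2"]))
```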
-Hoss