More info: what I want is a way to get the root URL (the toc file the
crawl started from) for any hit result. Is this already stored in the
index?



Mark_Fletcher wrote:
> 
> I build my index via an intranet crawl starting from a few high-level
> "toc" files. I need to preserve the URL of the toc file under which a
> hit was found, for example by appending it as a URL parameter at the
> end of the hit URL.
> 
> For example:
> http://my.server.com/hits/hit.html?toc=http%3A%2F%2Fmy.server.com%2Ftoc%2Ftoc1.html
> 
> Does something like this exist?
> 
> Thanks!
> 
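
For reference, the URL shape in the example above can be produced with a
few lines of plain Java. This is only a minimal sketch of the encoding
step: the class name TocParam and the method withToc are hypothetical and
not part of Nutch, and it does not address how to find out which toc file
a page was reached from during the crawl.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of Nutch.
final class TocParam {

    // Percent-encodes a value so it can be used inside a query string.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is supported on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Appends toc=<encoded tocUrl> to hitUrl.
    // Assumption: hitUrl carries no "#fragment" part.
    static String withToc(String hitUrl, String tocUrl) {
        String separator = hitUrl.contains("?") ? "&" : "?";
        return hitUrl + separator + "toc=" + encode(tocUrl);
    }

    public static void main(String[] args) {
        System.out.println(withToc(
            "http://my.server.com/hits/hit.html",
            "http://my.server.com/toc/toc1.html"));
    }
}
```

Running main prints the same URL as in the example above. If the hit URL
already has a query string, the parameter is appended with "&" instead
of "?".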

-- 
View this message in context: 
http://www.nabble.com/Is-there-a-plugin-that-allows-modification-of-the-hit-url-before-it%27s-added-to-the-index--tf4001713.html#a11368720
Sent from the Nutch - User mailing list archive at Nabble.com.

