Hi,

If we filter and normalize hyperlinks in the parse job, we wouldn't have to filter and normalize them again in all the other jobs (except perhaps the injector). That would spare a lot of CPU time when updating the crawldb and linkdb. It would also, I think, help the WebGraph job, since it operates on the segments' ParseData.
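For illustration, here's a rough sketch of what that single pass could look like, using Nutch's URLNormalizers and URLFilters. The class name and the place it would hook in (e.g. where the parse job writes outlinks into ParseData, around ParseOutputFormat) are just my assumptions, not a finished patch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.nutch.net.URLFilters;
    import org.apache.nutch.net.URLNormalizers;

    // Hypothetical helper: normalize and filter each outlink once, at parse
    // time, so downstream jobs can take the segment's outlinks as already clean.
    public class OutlinkCleaner {
      private final URLFilters filters;
      private final URLNormalizers normalizers;

      public OutlinkCleaner(Configuration conf) {
        filters = new URLFilters(conf);
        normalizers = new URLNormalizers(conf, URLNormalizers.SCOPE_OUTLINK);
      }

      /** Returns the normalized URL, or null if it is filtered out. */
      public String clean(String url) {
        try {
          url = normalizers.normalize(url, URLNormalizers.SCOPE_OUTLINK);
          return (url == null) ? null : filters.filter(url);
        } catch (Exception e) {
          return null; // treat malformed or rejected URLs as filtered
        }
      }
    }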
Thoughts? Thanks,