Hi Mike,

The rule is that a repository connector can only include metadata that is not likely to change as the result of the actual circumstances of the crawl. Otherwise, incremental crawling is a fiction. Unfortunately, cookies are exactly the kind of data that would change every time the document is fetched.
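
To make the point concrete, here is a minimal sketch of why per-fetch data defeats incremental crawling. It assumes a crawler decides whether a document changed by comparing a version string derived from the content plus whatever metadata the connector attaches. The function `version_string`, the metadata keys, and the cookie values are all hypothetical and chosen for illustration; this is not the ManifoldCF API.

```python
import hashlib


def version_string(content: bytes, metadata: dict) -> str:
    """Hypothetical version string: a digest over content plus metadata."""
    digest = hashlib.sha256(content)
    for key in sorted(metadata):
        digest.update(f"{key}={metadata[key]}".encode("utf-8"))
    return digest.hexdigest()


# The same, unchanged page fetched on two separate crawls.
page = b"<html>unchanged page</html>"
stable = {"content-type": "text/html"}

# Assumption: the server issues a fresh session cookie on every fetch.
first_fetch = dict(stable, **{"set-cookie": "session=abc123"})
second_fetch = dict(stable, **{"set-cookie": "session=xyz789"})

# With stable metadata only, the version matches and the document is skipped.
stable_same = version_string(page, stable) == version_string(page, stable)

# With cookies included, the version differs even though the page did not change.
cookie_same = version_string(page, first_fetch) == version_string(page, second_fetch)
```

Under these assumptions `stable_same` is true and `cookie_same` is false, so a crawler that folded cookies into the version would treat every document as modified on every crawl.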

Karl

On Sun, May 5, 2013 at 10:37 PM, Michael Kelleher <[email protected]> wrote:

> Hello,
>
> I was not sure which to post to, although I think this group is the most
> pertinent.
>
> For the web connector, is there a document that defines what "data" is
> sent to the output connector? I am wondering if in addition to the HTML
> payload, if I wrote a custom connector would I have access to the: Headers
> generated during the "current" page request, Cookies generated during the
> "current" page request?
>
> Thanks.
>
> --mike
