tc wrote.. apparently on 18-Apr-2005/15:59:43+2:00

>> Yes again. But it does indeed capture the nytimes.com cookie for me ..
>> I just tried it.
>
>hmm, weird. you were able to access pages which require login? it did
>store a couple cookies for me but i wasn't able to reuse them to
>access pages. i didn't spend much time on it but it might have to do
>with the fact that nytimes uses half a dozen different cookies. i'll
>look more into it...

Ah, you didn't state that you wanted to log in .. I don't have a nytimes
account so didn't try. I've also noticed that quite a few websites are
deliberately built to make it hard to scrape the data .. using javascript
etc. to generate values which need to be posted into forms.

>> And yes again. Turned out to be quite straight forward to do for both http
>> and
>> https.
>
>what was the functionnality you added? did you replace them or did you
>add them as another scheme? would you mind posting them? (I won't

I added to it. It was a few years ago now, but I recall that I created a
global cookie-jar object and added to it any cookies that were encountered
using the http scheme. When a site being accessed had cookies in the jar,
I sent them. Some sites used up to 6 cookies, and it seemed to handle those
okay, as well as handling redirections. It wasn't perfect, but it did the
job I needed it to do.

As for posting, I'll have to discuss with Carl. Posting the schemes,
modified or not, seems to be a breach of the SDK license.

-- 
Graham Chiu
http://www.compkarori.com/cerebrus
http://www.compkarori.com/rebolml
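
The cookie-jar idea described above can be sketched roughly as follows. This is only an illustration in Python, not the modified REBOL scheme code (which was not posted). The names `CookieJar`, `store`, `header_for` and `COOKIE_JAR` are hypothetical, and the sketch assumes cookies are matched on host name alone, ignoring path, expiry and the other attributes a real client would honour.

```python
class CookieJar:
    """Hypothetical in-memory jar: cookies are grouped by the host that set them."""

    def __init__(self):
        self._cookies = {}  # host -> {cookie name: cookie value}

    def store(self, host, set_cookie_headers):
        """Record the name=value pair from each Set-Cookie header value."""
        jar = self._cookies.setdefault(host.lower(), {})
        for header in set_cookie_headers:
            # Attributes such as Path or Expires follow the first ';'
            # and are ignored in this sketch.
            pair = header.split(";", 1)[0]
            name, sep, value = pair.partition("=")
            if sep and name.strip():
                jar[name.strip()] = value.strip()

    def header_for(self, host):
        """Return a Cookie header value for the host, or None if nothing is stored."""
        jar = self._cookies.get(host.lower())
        if not jar:
            return None
        return "; ".join(f"{name}={value}" for name, value in jar.items())


# One shared jar, mirroring the "global cookie-jar object" in the description.
COOKIE_JAR = CookieJar()
```

A client built on this would call `store` on every response, including redirect responses before following them, and `header_for` before every request, so a site that sets several cookies across a login sequence gets all of them sent back.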
