On 2005-02-25 16:13 Aengus wrote:
On Friday, February 25, 2005 3:36 PM [GMT], Hugh Morris <[EMAIL PROTECTED]> wrote:
Could someone tell me whether it is possible to get the search query from requests referred from images.google.com (and the other image search engines)?
The request strings are a bit untidy, like:
http://images.google.com/imgres?imgurl=http://www.washto2004.org/net/external/images/sunday/BNSF%2520PHOTOS/MOUNTAIN%2520PEAK.JPG&imgrefurl=http://www.washto2004.org/net/external/images/sunday/BNSF%2520PHOTOS/&h=1200&w=1600&sz=696&tbnid=A3gVV6ed4DkJ:&tbnh=112&tbnw=149&start=1&prev=/images%3Fq%3Dmountain%2Bpeak%26hl%3Den%26lr%3D%26sa%3DG
Your search string is in the last part of that referring URL - prev=/images%3Fq%3Dmountain%2Bpeak%26hl%3Den%26lr%3D%26sa%3DG
That's an escaped version of the original query string - prev=/images?q=mountain+peak&hl=en&lr=&sa=G
If I add

    SEARCHENGINE http://images.google.com/* prev

to my analog.cfg file, then I get this entry in my Search Query Report:

    1  /images?q=mountain peak&hl=en&lr=&sa=g

And I get these 2 entries in my Search Word Report:

    1  peak&hl=en&lr=&sa=g
    1  /images?q=mountain
I don't know if you could clean the results up any more than that without preprocessing.
Aengus
Thanks for that. It was interesting to see the report with that line added to the cfg file, but it won't be very practical for daily use.
<rhetorical>I wonder if it would be difficult to modify analog to deal with those encoded query strings?</rhetorical>
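For anyone who wants to try the preprocessing route Aengus mentioned, here is a minimal sketch in Python of how the nested query could be pulled out of such a referrer before the log reaches analog. This is not an analog feature; the function name image_search_query is made up for illustration, and it assumes the search terms always sit in a q parameter inside the escaped prev value, as in the example above.

```python
from urllib.parse import urlsplit, parse_qs

def image_search_query(referrer):
    """Return the search terms hidden in an image-search referrer, or None.

    Hypothetical helper: assumes the terms are in a 'q' parameter nested
    inside the URL-escaped 'prev' parameter of the referrer.
    """
    # First pass: split the referrer's own query string and unescape it.
    outer = parse_qs(urlsplit(referrer).query)
    prev = outer.get("prev")
    if not prev:
        return None
    # 'prev' is now an unescaped relative URL such as
    # /images?q=mountain+peak&hl=en&lr=&sa=G, so parse its query as well.
    inner = parse_qs(urlsplit(prev[0]).query)
    q = inner.get("q")
    return q[0] if q else None

example = ("http://images.google.com/imgres?h=1200&w=1600&start=1"
           "&prev=/images%3Fq%3Dmountain%2Bpeak%26hl%3Den%26lr%3D%26sa%3DG")
print(image_search_query(example))  # prints: mountain peak
```

A script along these lines could rewrite the referrer field of each log line into a plain q=... form before analog reads it, so that an ordinary SEARCHENGINE line would then report clean search words.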