On Wed, Jul 18, 2007 at 07:38:43PM +0200, Hendrik-Jan Heins wrote:
> Hello Thomas,
>
> I'm sorry, but I'm not that good at reading the trace lynx gives. I
> can sort of see what's happening, and it looks like lynx approaches
> the request from different paths. Apparently they don't work as in the

not exactly - it's parsing the URL for different things.  To make sense
of it, I guess you have to read the C code.

> end lynx resorts to some sort of apache root.
> If that is happening, why doesn't lynx "see" the page right at the
> correct location?

I already pointed out the most likely issue: the string that lynx shows
in the trace file has embedded newlines:

    GET /query_part.php?brandname=
    1st%20Wave
    HTTP/1.0\r

A quick check with wireshark for w3m shows this line:

    GET /query_part.php?brandname=1st+Wave HTTP/1.0\r

Checking the same thing for lynx shows just the first line:

    GET /query_part.php?brandname=

The other text shows up in the miscellaneous data display, but is not
recognized.

Aside from a syntax error in the page (which doesn't _seem_ to be
related), I don't see anything in the vicinity of the query which would
do that.  But it's something that can be debugged (I just haven't gotten
to it yet).

>
> Hendrik-Jan
>
>
> 2007/7/17, Thomas Dickey <[EMAIL PROTECTED]>:
>> On Tue, Jul 17, 2007 at 09:41:21AM +0200, Hendrik-Jan Heins wrote:
>> > Exactly.
>> >
>> > blackbox isn't explicitly defined, but it is the box hostname. However
>> > I see no reason why lynx switched virtual domains here!
>> > Moreover: lynx appears to be the only browser to do so. As I said, in
>> > a lot of other browsers it just works. The pages are also validated as
>> > correct html.
>> > How can I find out why this is happening?
>>
>> You can see _where_ lynx does it by examining the trace file -
>> run "lynx -trace http://linux-wless.passys.nl/", and look in
>> $HOME/Lynx.trace
>>
>> The lines with "HTParse" show what URL data lynx was using, and what
>> it thinks it's looking for. As I noted, it's possible that it gets
>> confused about whether it should use an absolute or relative path in
>> the GET.
>>
>> For reference, here's the last chunk of trace leading up to the GET:
>>
>> HTParse: aName:`http://linux-wless.passys.nl/query_part.php?brandname=
>> 1st Wave
>> '
>>    relatedName:`'
>>    want: path
>> HTParse: (ABS)
>> HTParse: encode:`query_part.php?brandname=
>> 1st Wave
>> '
>> HTParse: result:`query_part.php?brandname=
>> 1st%20Wave
>> '
>> HTParse: aName:`http://linux-wless.passys.nl/query_part.php?brandname=
>> 1st Wave
>> '
>>    relatedName:`'
>>    want: host
>> HTParse: result:`linux-wless.passys.nl'
>> LYCookie: Searching for 'linux-wless.passys.nl:80',
>> '/query_part.php?brandname=
>> 1st%20Wave
>> '.
>> Checking cookie 0x3c10f7d0 linux-wless[]=Linux+wireless+website
>> linux-wless.passys.nl linux-wless.passys.nl 1
>> /query_part.php?brandname=
>> 1st%20Wave
>> / 0
>> HTTP: Sending Cookie2: $Version ="1"
>> HTTP: Sending Cookie: linux-wless[]=Linux+wireless+website
>> Composing Authorization for
>> linux-wless.passys.nl:80/query_part.php?brandname=
>> 1st%20Wave
>>
>> HTAASetup_lookup: No template matched `query_part.php?brandname=
>> 1st%20Wave
>> ' (so probably not protected)
>> HTTP: Not sending authorization (yet).
>> Writing:
>> GET /query_part.php?brandname=
>> 1st%20Wave
>> HTTP/1.0\r
>> Host: linux-wless.passys.nl\r
>> Accept: text/html, text/plain, text/css, text/sgml, */*;q=0.01\r
>> Accept-Encoding: gzip, compress, bzip2\r
>> Accept-Language: en\r
>> User-Agent: Lynx/2.8.7dev.4 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7d\r
>> Referer: http://linux-wless.passys.nl/\r
>> Cookie2: $Version="1"\r
>> Cookie: linux-wless[]=Linux+wireless+website\r
>> \r
>> ----------------------------------
>> Sending HTTP request.
>> HTTP: WRITE delivered OK
>> HTTP request sent; waiting for response.
>> HTTP: Trying to read 1535
>>
>>
>> >
>> > Hendrik-Jan
>> >
>> > 2007/7/16, Stefan Caunter <[EMAIL PROTECTED]>:
>> > >Server sends 400 Bad Request if you look at ']' Head request from lynx, but
>> > >sends 'Not Found' page from a different virtual host. I didn't notice
>> > >anything in the trace about this though aside from the "blackbox" domain.
>> > >
>> > >Stefan Caunter
>> > >
>> > >
>> > >
>> > >On 7/16/07, Hendrik-Jan Heins <[EMAIL PROTECTED]> wrote:
>> > >> I did some more testing:
>> > >> If I ask for:
>> > >>
>> > >http://linux-wless.passys.nl/query_part.php?brandname=1st+Wave
>> > >> or:
>> > >>
>> > >http://linux-wless.passys.nl:80/query_part.php?brandname=1st+Wave
>> > >>
>> > >> Lynx just works. So it seems to be something about the way lynx parses
>> > >> the code from the requesting page.
>> > >>
>> > >>
>> > >
>> >
>> >
>> > _______________________________________________
>> > Lynx-dev mailing list
>> > [email protected]
>> > http://lists.nongnu.org/mailman/listinfo/lynx-dev
>>
>> --
>> Thomas E. Dickey
>> http://invisible-island.net
>> ftp://invisible-island.net
>
>
>
> _______________________________________________
> Lynx-dev mailing list
> [email protected]
> http://lists.nongnu.org/mailman/listinfo/lynx-dev

-- 
Thomas E. Dickey <[EMAIL PROTECTED]>
http://invisible-island.net
ftp://invisible-island.net

_______________________________________________
Lynx-dev mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/lynx-dev
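
For illustration only: the embedded-newline problem described in this thread (a request line that is cut off after "brandname=" because the path still contains line breaks) can be reproduced and worked around in a few lines of Python. This is a rough sketch, not lynx's code. The helper names clean_path and request_line are made up for the example, and the choice to drop CR/LF/tab before percent-encoding is an assumption about what a client could reasonably do with such an href.

```python
from urllib.parse import quote


def clean_path(raw_path: str) -> str:
    """Hypothetical helper: remove line breaks and tabs, trim spaces,
    then percent-encode whatever is left (a space becomes %20)."""
    without_breaks = "".join(ch for ch in raw_path if ch not in "\r\n\t")
    trimmed = without_breaks.strip(" ")
    # Keep URL delimiters and existing escapes as they are.
    return quote(trimmed, safe="/?=&%+")


def request_line(raw_path: str) -> str:
    """Hypothetical helper: build an HTTP/1.0 request line on a single line."""
    return "GET /" + clean_path(raw_path).lstrip("/") + " HTTP/1.0\r\n"


# The path as it appears in the trace: newline after '=', and another at the end.
raw = "query_part.php?brandname=\n1st Wave\n"

# Encoding only the space, as the trace shows, leaves the line breaks in place,
# so the request line ends after "brandname=".
broken = "GET /" + raw.replace(" ", "%20") + " HTTP/1.0\r\n"
print(repr(broken.splitlines()[0]))

# With the line breaks removed first, the whole request fits on one line.
print(repr(request_line(raw)))
```

The first print shows only the part up to "brandname=", which matches what wireshark reported for lynx. The second shows the full path on one line with the space sent as %20; whether the server treats that the same as the "+" form w3m sends is not something this sketch checks.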
