ID:               45533
 User updated by:  signe at cothlamadh dot net
 Reported By:      signe at cothlamadh dot net
 Status:           Open
 Bug Type:         cURL related
 Operating System: FreeBSD 7.0
 PHP Version:      5.2.6
 New Comment:

Of course, right after I posted the reproduction, something was changed
on the server that triggered the issue, and the problem no longer
reproduces against it.  This was the original output from a request to
that server:

telnet www.crn.com 80
Trying 66.77.24.10...
Connected to crn.com.
Escape character is '^]'.
GET /rss/cisco/index.xml HTTP/1.1
Host: www.crn.com

HTTP/1.1 302 Found
Date: Wed, 16 Jul 2008 21:52:30 GMT
Server: Apache
Location: http://feeds.pheedo.com/rss/cisco
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
Vary: Accept-Encoding, User-Agent

119
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A
HREF="http://feeds.pheedo.com/rss/cisco">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.29 Server at www.crn.com Port 80</ADDRESS>
</BODY></HTML>

0

Connection closed by foreign host.
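
For anyone not used to reading chunked transfer encoding: the "119"
line in the output above is a chunk size in hexadecimal, and the "0"
line marks the final, empty chunk.  A minimal sketch of reading those
size lines follows; the helper name chunkSize() is made up for
illustration, and it does nothing with chunk extensions beyond
stripping them.

```php
<?php
// Hypothetical helper: parse one chunk-size line of a chunked HTTP body.
function chunkSize($line)
{
    // Anything after ';' is a chunk extension; keep only the size field.
    $parts = explode(';', $line);
    return hexdec(trim($parts[0]));
}

echo chunkSize("119"), "\n"; // 281
echo chunkSize("0"), "\n";   // 0 (last chunk)
```

So the 302 response announces a 281-byte body, which is the
error-document content that ends up in front of the real document.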


Previous Comments:
------------------------------------------------------------------------

[2008-07-16 21:30:14] signe at cothlamadh dot net

Description:
------------
When retrieving a URL that answers with a 302 redirect and also sends
viewable error-document content in the body of that response, the
error document is prepended to the real content retrieved after
following the redirect.

The issue is compounded when CURLOPT_HEADER is enabled, because the
error-document content is not counted in any of the curl_getinfo() data.

Reproduce code:
---------------
http://www.cothlamadh.net/~signe/.outgoing/curl_location.phps

Tested with curl 7.18.0 on FreeBSD 7 and 7.16.4-2ubuntu1 on Ubuntu
Gutsy.

Expected result:
----------------
Non-header data from redirects should not be included in the returned
content.

Actual result:
--------------
Without headers enabled, the content returned looks like this:


"""
RedirectErrorDocumentContent
ActualDocument
"""

There is no whitespace between the two documents.

With headers enabled, it is much worse:

"""
RedirectHeader

RedirectErrorDocumentContent
ActualDocumentHeader

ActualDocument
"""

There is whitespace between each set of headers and its respective
content, but not between the first content and the second batch of
headers.

To make matters worse, curl_getinfo($cUrl, CURLINFO_HEADER_SIZE)
returns the combined length of both header sections, as expected, and
curl_getinfo($cUrl, CURLINFO_CONTENT_LENGTH_DOWNLOAD) returns the length
of ActualDocument, also as expected.  As a result,
RedirectErrorDocumentContent sits in the middle of the returned data
without being counted by either value, which makes it impossible to
cleanly split the result into header and content sections.
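
To illustrate the splitting problem without a live server, here is a
minimal sketch that builds the returned buffer from placeholder strings.
All variable names and sample values are made up for illustration;
$headerSize stands in for what curl_getinfo($cUrl, CURLINFO_HEADER_SIZE)
reports, i.e. the combined length of both header blocks.

```php
<?php
// Placeholder pieces of the buffer (not real server output).
$redirectHeader = "HTTP/1.1 302 Found\r\nLocation: http://example.com/\r\n\r\n";
$redirectBody   = "RedirectErrorDocumentContent";
$actualHeader   = "HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n";
$actualBody     = "ActualDocument";

// The layout described in this report when CURLOPT_HEADER is enabled.
$response = $redirectHeader . $redirectBody . $actualHeader . $actualBody;

// Stand-in for CURLINFO_HEADER_SIZE: both header blocks, nothing else.
$headerSize = strlen($redirectHeader) + strlen($actualHeader);

// The usual way to split headers from content.
$headers = substr($response, 0, $headerSize);
$content = substr($response, $headerSize);

// true: the error document landed in the "header" part
var_dump(strpos($headers, $redirectBody) !== false);
// true: the "content" part is not just ActualDocument
var_dump($content !== $actualBody);
```

Because the uncounted error document shifts everything after it, the cut
lands inside the second header block and the tail of those headers leaks
into what should be the content.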


------------------------------------------------------------------------

