Fantastic response Mark, thank you.
For the record, the generic C# connections (HttpWebRequest and friends) are
apparently:
a) restricted to 2 simultaneous connections per host.
b) trying to resolve the proxy on every single request using the complex
proxy settings under Windows.
c) just generally performing badly.
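For what it's worth, (a) and (b) can reportedly be worked around without
dropping to raw sockets. A minimal sketch using the standard System.Net
knobs (the URL is just a placeholder):

```csharp
using System;
using System.Net;

class ConnectionTuning
{
    static void Main()
    {
        // (a) the default is 2 simultaneous connections per host;
        // raising DefaultConnectionLimit lifts that throttle
        ServicePointManager.DefaultConnectionLimit = 20;

        // (b) a null Proxy skips the per-request proxy resolution entirely
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/");
        request.Proxy = null;

        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```

Whether that closes the whole 23-to-11-second gap I can't say, but it is a
much smaller change than rewriting on top of Socket.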
I grabbed some code off the internet that makes a direct connection using
sockets, and run times dropped from approx 23 seconds to 11 seconds. The
code I grabbed was 'ordinary', constantly rebuilding strings, so it makes me
wonder how bad the code inside the Microsoft library is. I did think about
running curl on the command line and then parsing the downloaded files; wish
I had now :-)
The method in this case is "GET", and I am not currently sending any extra
headers. The URI is the complete path...
public void SendRequest(WebRequestFast request)
{
    ResponseUri = request.RequestUri;
    // Request line plus any extra headers; note request.Headers must end
    // with the blank line ("\r\n") that terminates the header block.
    request.Header = request.Method + " " + ResponseUri.PathAndQuery
                     + " HTTP/1.0\r\n" + request.Headers;
    Socket.Send(Encoding.ASCII.GetBytes(request.Header));
}
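For clarity, here is what that request ends up looking like on the wire, as
a standalone sketch with a made-up URI (the Host header isn't strictly
required by HTTP/1.0, but most servers want it):

```csharp
using System;

class RequestLineDemo
{
    static void Main()
    {
        var uri = new Uri("http://example.com/index.html?x=1");
        string extraHeaders = "Host: example.com\r\n";   // hypothetical extra headers
        // request line + header lines + blank line terminating the block
        string header = "GET " + uri.PathAndQuery + " HTTP/1.0\r\n"
                        + extraHeaders + "\r\n";
        // print with the CRLFs made visible
        Console.Write(header.Replace("\r\n", "\\r\\n\n"));
    }
}
```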
This code reads one byte at a time until it sees "\r\n\r\n", which marks the
end of the headers in HTTP. It then parses the header lines into a
collection of headers (a WebHeaderCollection).
public void ReceiveHeader()
{
    Header = "";
    Headers = new WebHeaderCollection();
    var text = new StringBuilder();
    byte[] bytes = new byte[1];   // read one byte at a time
    short count = 0;
    while (Socket.Receive(bytes, 0, 1, SocketFlags.None) > 0)
    {
        text.Append(Encoding.ASCII.GetString(bytes, 0, 1));
        if (bytes[0] == '\n' || bytes[0] == '\r')
            count++;
        else
            count = 0;
        // four consecutive CR/LF bytes may be "\r\n\r\n" - confirm it
        if (count > 3)
        {
            Header = text.ToString();
            if (Header.EndsWith("\r\n\r\n"))
                break;
        }
    }
    Header = text.ToString();
    var matches = new Regex("[^\r\n]+").Matches(Header.TrimEnd('\r', '\n'));
    // matches[0] is the status line; the rest are "Name: value" pairs
    for (int n = 1; n < matches.Count; n++)
    {
        var strItem = matches[n].Value.Split(new char[] { ':' }, 2);
        try
        {
            if (strItem.Length > 1)
                Headers[strItem[0].Trim()] = strItem[1].Trim();
            else if (strItem.Length > 0)
                Headers[strItem[0].Trim()] = null;
        }
        catch (Exception ex)
        {
            Logger.Error(ex, string.Format("Skipping header {0}", matches[n]));
        }
    }
    // check if the page has been redirected to another location
    if (matches.Count > 0 &&
        (matches[0].Value.IndexOf(" 302 ") != -1 ||
         matches[0].Value.IndexOf(" 301 ") != -1))
    {
        // the new location, if any, is sent in the "Location" header
        if (Headers["Location"] != null)
        {
            try
            {
                // absolute redirect
                ResponseUri = new Uri(Headers["Location"]);
            }
            catch
            {
                // relative redirect - resolve against the original URI
                ResponseUri = new Uri(ResponseUri, Headers["Location"]);
            }
        }
    }
    ContentType = Headers["Content-Type"];
    if (Headers["Content-Length"] != null)
        ContentLength = int.Parse(Headers["Content-Length"]);
    KeepAlive = (Headers["Connection"] != null &&
                 Headers["Connection"].ToLower() == "keep-alive") ||
                (Headers["Proxy-Connection"] != null &&
                 Headers["Proxy-Connection"].ToLower() == "keep-alive");
}
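The header-parsing part can be exercised without a socket; a standalone
sketch of the same regex/split logic against a canned response:

```csharp
using System;
using System.Text.RegularExpressions;

class HeaderParseDemo
{
    static void Main()
    {
        string header = "HTTP/1.0 200 OK\r\n" +
                        "Content-Type: text/html\r\n" +
                        "Content-Length: 42\r\n\r\n";
        // one match per non-empty line of the header block
        var matches = new Regex("[^\r\n]+").Matches(header.TrimEnd('\r', '\n'));
        // skip matches[0], the status line; split the rest on the first ':'
        for (int n = 1; n < matches.Count; n++)
        {
            var item = matches[n].Value.Split(new char[] { ':' }, 2);
            Console.WriteLine(item[0].Trim() + " = " + item[1].Trim());
        }
    }
}
```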
The next bit reads the body that follows the headers. It just grabs the
output in 10k chunks and joins them together: an HTML page, JSON or XML,
depending on what was requested.
byte[] recvBuffer = new byte[10 * 1024];
int nBytes, nTotalBytes = 0;
while ((nBytes = response.Socket.Receive(recvBuffer,
        0, recvBuffer.Length, SocketFlags.None)) > 0)
{
    // increment total received bytes
    nTotalBytes += nBytes;
    // append the received chunk to the output buffer
    output.Append(Encoding.ASCII.GetString(recvBuffer, 0, nBytes));
    // on a keep-alive connection the server won't close the socket, so
    // break once Content-Length bytes have arrived
    if (response.KeepAlive && response.ContentLength > 0 &&
        nTotalBytes >= response.ContentLength)
        break;
}
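The same accumulate-until-Content-Length loop can be tried standalone by
substituting a MemoryStream for the socket (the names here are made up):

```csharp
using System;
using System.IO;
using System.Text;

class BodyReadDemo
{
    static void Main()
    {
        int contentLength = 11;   // pretend Content-Length header value
        var source = new MemoryStream(Encoding.ASCII.GetBytes("hello world"));
        var output = new StringBuilder();
        byte[] recvBuffer = new byte[4];   // tiny buffer to force several reads
        int nBytes, nTotalBytes = 0;
        while ((nBytes = source.Read(recvBuffer, 0, recvBuffer.Length)) > 0)
        {
            nTotalBytes += nBytes;
            output.Append(Encoding.ASCII.GetString(recvBuffer, 0, nBytes));
            // stop once Content-Length bytes have arrived (keep-alive case)
            if (contentLength > 0 && nTotalBytes >= contentLength)
                break;
        }
        Console.WriteLine(output.ToString());
    }
}
```

One caveat with the original loop: Encoding.ASCII turns any byte above 0x7F
into '?', so for UTF-8 pages it would be safer to accumulate raw bytes and
decode once at the end.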
All up I find this code a little ugly, but it works reasonably well
(compared to the original).
Thanks
Ken
-----
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html