Zane wrote:
Jesse Phillips Wrote:

On Tue, 03 Nov 2009 20:05:17 -0500, Zane wrote:

If I am to receive
these in arbitrarily sized chunks for concatenation, I don't see a
sensible way of constructing a loop.  Example?

Zane
You can use the number of bytes read to keep track of how far you've gotten, and use array slicing to concatenate the chunks into the final array.
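A minimal sketch of that loop (assuming `socketStream` is an already-connected SocketStream and `total` is the expected byte count, e.g. taken from the Content-Length header; the names here are illustrative):

```d
// Sketch only: accumulate `total` bytes from an open SocketStream.
ubyte[] data = new ubyte[total];
uint got = 0;
while (got < total)
{
        // readBlock returns how many bytes it actually read, which may
        // be fewer than requested; slicing at `got` makes each chunk
        // land directly after the previous one.
        uint n = socketStream.readBlock(data[got .. $].ptr, data.length - got);
        if (n == 0)
                break;  // connection closed before we got everything
        got += n;
}
```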

Thanks Jesse,

Can you (or someone) confirm that this program should work?  I added a loop with array 
slicing, but it does not seem to work for me.  The final output of "num" is 
17593, and a file of that size is created, but it is not a valid gif image.  The code 
is below (note that this assumes google still has their 'big-bird' logo up :-P)

import std.stream;
import std.stdio;
import std.socket;
import std.socketstream;

import std.c.time;

int main()
{
        char[] line;
        ubyte[] data = new ubyte[17593];
        uint num = 0;

        TcpSocket socket = new TcpSocket(new InternetAddress("www.google.com", 80));

        socket.send("GET /logos/bigbird-hp.gif HTTP/1.0\r\n\r\n");

        SocketStream socketStream = new SocketStream(socket);
        
        while(!socketStream.eof)
        {
                line = socketStream.readLine();

                if (line=="")
                        break;

                writef("%s\n", line);
        }
        
        num = socketStream.readBlock(data.ptr, 17593);
        writef("\n\nNum: %d\n", num);

        while(num < 17593)
        {
                num += socketStream.readBlock(data[(num-1)..length].ptr, data.length-num);
                writef("\n\nNum: %d\n", num);
        }

        socketStream.close;
        socket.close;

        File file = new File("logo.gif", FileMode.Out);
        file.write(data);
        file.close;

        return 0;
}

Thanks for everyone's help so far!

There are a few issues with your implementation.

First, parse the headers properly; my trivial implementation is below. You want to parse them so you can find the correct end of the headers and read the content size from them.

readLine() looks to be designed for a text-based protocol. The biggest issue is with the end-of-line detection: "\r", "\n" and "\r\n" are all valid end-of-line sequences, and it doesn't seem to do the detection greedily. This leaves us with a trailing '\n' at the end of the headers.
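For illustration, a greedy line reader might look something like this (a sketch of the idea only, not how std.stream actually implements readLine()):

```d
import std.stream;

// Sketch: treat "\r", "\n" and "\r\n" each as a single line ending by
// also consuming the '\n' that may follow a '\r'.
char[] readLineGreedy(Stream s)
{
        char[] line;
        while (!s.eof)
        {
                char c = s.getc();
                if (c == '\n')
                        break;
                if (c == '\r')
                {
                        if (!s.eof)
                        {
                                char next = s.getc();
                                if (next != '\n')
                                        s.ungetc(next); // not part of the EOL; put it back
                        }
                        break;
                }
                line ~= c;
        }
        return line;
}
```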

The implementation of readBlock() doesn't wait to fill the buffer; it fills it if it can. This is pretty standard for a read on a socket, so wrap it in a loop and read chunks. You want to do it this way anyway. The implementation below double-buffers, which does result in an extra copy. Logically this may seem like a pointless copy, but in a real application it is useful for many reasons.

Below is a working version (though it still has issues of its own).

#!/usr/bin/gdmd -run

import std.stream;
import std.stdio;
import std.socket;
import std.socketstream;

import std.string;              // for header parsing
import std.conv;                // for toInt

import std.c.time;

int main()
{
        char[] line;
        ubyte[] data;
        uint num = 0;

        TcpSocket socket = new TcpSocket(new InternetAddress("www.google.com", 80));

        socket.send("GET /logos/bigbird-hp.gif HTTP/1.0\r\n\r\n");

        SocketStream socketStream = new SocketStream(socket);
        
        string[] response;      // Holds the lines in the response
        while(!socketStream.eof)
        {
                line = socketStream.readLine();

                if (line=="")
                        break;

                // Append this line to array of response lines
                response ~= line;
        }

        // Due to how readLine() works, we might end up with a
        // trailing '\n', so get rid of it if we do.
        ubyte ncr;
        socketStream.read(ncr);
        if (ncr != '\n')
                data ~= ncr;


        // D's builtin associative arrays (safe & easy hashtables!)
        string[char[]] headers; 
        
        // Parse the HTTP response.  NOTE: This is a REALLY bad HTTP
        // parser; a real parser would handle header parsing properly.
        // See RFC2616 for the proper rules.
        foreach (v; response) {
                // There is likely a better way to do this than
                // a join(split())
                string[] kv_pair = split(v, ": ");
                headers[tolower(kv_pair[0])] = join(kv_pair[1 .. $], ": ");
        }

        foreach (k, v; headers)
                writefln("[%s] [%s]", k, v);

        uint size;
        if (isNumeric(headers["content-length"])) {
                size = toInt(headers["content-length"]);
        } else {
                writefln("Unable to parse content length of '%s' to a number.",
                        headers["content-length"]);
                return 0;
        }
        // This fully buffers the data; if you are fetching large files you
        // should process them in chunks rather than in one big buffer.  Also,
        // this does not handle chunked encoding; see RFC2616 for details.
        while (data.length < size && !socketStream.eof) {
                ubyte[4096] buffer;
                num = socketStream.readBlock(buffer.ptr, 4096); // read 4k at a time
                writefln("Read %s bytes [%s/%s] (%s%%)",
                        num, data.length, size,
                        (cast(float)data.length/cast(float)size)*100);

                // Process the buffer, in this case just copy it to the data
                // buffer.  This double buffering process may seem bad, but
                // has the advantage of allowing you to thread around data,
                // process the buffer in chunks, etc.
                data ~= buffer[0..num];
        }

        socketStream.close;
        socket.close;

        // It might be worthwhile to chunk this as well in some cases.
        File file = new File("logo.gif", FileMode.Out);
        file.write(data);
        file.close;

        return 0;
}
