"Craig A. Berry" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> At 11:25 AM -0700 11/7/03, Michael Downey wrote:
> >The limitation I believe you are talking about is the transmit and
> >receive buffer sizes for the socket. These generally can't be bigger
> >than 64K on most systems. But a TCP/IP socket is by definition a STREAM
> >oriented device. So if you write an 800 Meg buffer to a socket it means
> >that the I/O layer likely has to do many low level writes to get all
> >that data pushed out. On VMS they don't treat the socket that way and
> >only do 1 SYS$QIO call for the buffer and don't check to make sure the
> >buffer isn't bigger than 64K. Really they should have checked the size
> >and done a number of SYS$QIOs to get the full buffer pushed out. TCP/IP
> >generally won't create a single packet with the whole buffer inside it.
> >UDP does this but it is not stream oriented. Really at the lowest layer
> >the TCP/IP driver would have to query the transport protocol to find
> >the biggest packet size that the transport will take. It then has to
> >break up the message given to it into a number of packets of the
> >queried size and then wait for acknowledgements for those packets. So
> >if a program does a write of 800 Megs the TCP/IP driver should take the
> >first 64K into its buffer and write that out. Once it gets all the
> >acknowledgements back it will then take the next 64K into its buffer
> >and write that out as well. Until the full 800 Meg buffer is written
> >out the calling program is blocked. An example of where this works on
> >VMS is in file I/O. You can open a file and write an 800 Meg buffer to
> >the file using write and it will block you until all the data is
> >written to the file. That is because the file layer is properly
> >handling the 64K limit that VMS has with SYS$QIO. I don't know why they
> >did not do the same thing when they wrote the TCP/IP layer for VMS.
> >
> >If you have a different understanding of how that works please let me
> >know as I currently use this information in an abstraction layer that
> >we use for both UNIX and VMS.
>
> Thanks for the explanation. I had a feeling I was oversimplifying
> somewhat. I believe writes to files also had a similar limitation at
> some point in the past but the CRTL took upon itself the task of
> breaking up large writes. It would have to do the same thing for
> sockets, and I agree it probably should. Regardless of if and when
> that happens, I think it's reasonable to expect a Perl program that
> is inserting a large number of rows in a database to make some
> reasonable assumption about the maximum number of rows it can insert
> in one go.
> --
> ________________________________________
> Craig A. Berry
> mailto:[EMAIL PROTECTED]
>
> "... getting out of a sonnet is much more
> difficult than getting in."
> Brad Leithauser

Mark Berryman replied after this, but I have not included his reply here
because my mail viewer blocked his attachment. His argument was that it
should be up to the programmer to code around the limitations of the OS.
In some cases that is inevitable, but in this case I would say we do not
want to force this limit on the calling program. What does it solve if we
make the programmer do something like:

    $num_remaining = sizeof(buffer);
    for ($i = 0; $i < BUFFER_SIZE; $i += MAX_SIZE_TCPIP_CAN_TAKE)
    {
        if ($num_remaining < MAX_SIZE_TCPIP_CAN_TAKE) {
            # last piece: send whatever is left
            write (buffer[$i], $num_remaining, ...);
        }
        else
        {
            # full-sized piece
            write (buffer[$i], MAX_SIZE_TCPIP_CAN_TAKE, ...);
        }
        $num_remaining -= MAX_SIZE_TCPIP_CAN_TAKE;
    }

We still do the same number of I/O calls, and we are now forcing the
programmer to know about this limit and to write platform-specific code
that will need to be changed whenever he wants to use the script on a
different platform. Why shouldn't it be the responsibility of write(), or
of the Perl compatibility layer, to know how to deal with this? Sockets
are stream-oriented things, not record-oriented things. I can't stress
that enough. If we try to treat the socket in a record-oriented fashion
then we lose a huge amount of the flexibility that the socket layer is
supposed to provide. Maybe I confused the TCP/IP layer with the socket
layer in my last post. Yes, the TCP/IP layer does not need to be
responsible for dealing with this, but the socket layer absolutely does.
Otherwise we run into portability problems like the one we have right now.
Really, to take a C++ type approach to this problem, we should be able to
do something like:
FILE_HANDLE >> SOCKET_HANDLE;
And that should send all the data in a given file over a socket. It should
be the responsibility of the run-time libraries to handle the optimization
of the reads and writes. It may not be as efficient as what we could have
done by writing it all ourselves, but it makes writing programs and scripts
a whole lot easier.
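
To make that concrete, here is a rough sketch in Perl of the kind of
helper a run-time library could provide. The names write_all and
copy_stream are made up for illustration and are not part of Perl or the
CRTL, and the per-call limit is passed in as a parameter because the real
value is platform specific. It only relies on the documented behaviour of
sysread and syswrite, both of which may transfer fewer bytes than asked.

```perl
use strict;
use warnings;

# Hypothetical helper (not a real Perl or CRTL routine): write all of $data
# to $out in pieces of at most $max bytes, continuing after partial writes.
sub write_all {
    my ($out, $data, $max) = @_;
    my $total  = length $data;
    my $offset = 0;
    while ($offset < $total) {
        my $want = $total - $offset;
        $want = $max if $want > $max;      # never exceed the per-call limit
        my $sent = syswrite($out, $data, $want, $offset);
        die "write failed: $!" unless defined $sent;
        $offset += $sent;                  # syswrite may send less than asked
    }
    return $offset;
}

# Hypothetical helper: stream everything readable from $in to $out, so the
# caller never sees the limit and never holds the whole file in memory.
sub copy_stream {
    my ($in, $out, $max) = @_;
    my $copied = 0;
    while (1) {
        my $got = sysread($in, my $chunk, $max);
        die "read failed: $!" unless defined $got;
        last if $got == 0;                 # end of file
        $copied += write_all($out, $chunk, $max);
    }
    return $copied;
}
```

With something like this underneath, the caller's side shrinks to one
line such as copy_stream($file_handle, $socket_handle, 65_535), where
65_535 is only my guess at a safe value for the 64K limit discussed above.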

I don't intend these responses to make me seem like a zealous programmer
frothing at the mouth when something doesn't work, but the biggest
headaches are the cases where one thing is supported on one platform and
not on another. It's understood that if you're programming in C or FORTRAN
you will have to deal with these issues, as you may need to access the
system calls, but when you're writing Perl you should not. The only thing
that you should have to deal with as a Perl programmer is optimizing the
script so it will run better, and the optimizations should be platform
independent. For example, say we want to load an 800 Meg file into an
array and sort the array. We will likely not be able to do that on any
server if we just load it all in and call sort(). We will have to change
our algorithm so we do it in parts, as we are limited by memory, not by
the API.
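
As an illustration of "doing it in parts", here is a minimal sketch of a
sort that holds only a limited number of lines in memory: it sorts each
part with sort(), parks the part in a temporary file, and then merges the
sorted parts. The name sort_in_parts and the $max_lines parameter are
made up for this example, and it assumes plain string ordering of text
lines. Nothing in it depends on the platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: sort the lines of $in_path into $out_path while
# holding at most $max_lines lines in memory at any one time.
sub sort_in_parts {
    my ($in_path, $out_path, $max_lines) = @_;
    open my $in, '<', $in_path or die "cannot read $in_path: $!";

    # Pass 1: sort each part in memory and park it in a temporary file.
    my @parts;
    until (eof $in) {
        my @lines;
        while (@lines < $max_lines) {
            my $line = <$in>;
            last unless defined $line;
            $line .= "\n" unless $line =~ /\n\z/;   # final line may lack one
            push @lines, $line;
        }
        my $tmp = tempfile();      # scalar context: anonymous temporary file
        print {$tmp} sort @lines;
        seek($tmp, 0, 0) or die "cannot rewind part: $!";
        push @parts, $tmp;
    }
    close $in;

    # Pass 2: merge by repeatedly emitting the smallest line at the head
    # of any part, then refilling that head from the same part.
    open my $out, '>', $out_path or die "cannot write $out_path: $!";
    my @heads = map { scalar readline($_) } @parts;
    while (1) {
        my $best;
        for my $i (0 .. $#heads) {
            next unless defined $heads[$i];
            if (!defined $best || $heads[$i] lt $heads[$best]) {
                $best = $i;
            }
        }
        last unless defined $best;             # every part is exhausted
        print {$out} $heads[$best];
        $heads[$best] = readline($parts[$best]);
    }
    close $out or die "cannot finish $out_path: $!";
    return scalar @parts;                      # number of parts used
}
```

The only knob is how many lines fit comfortably in memory on the server
in question, which is the kind of tuning I think a Perl programmer should
be expected to do.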

So should Carl have to be a bit more careful about how he passes the data
around? Sure, but he should only have to concern himself with how he is
going to get this script to work on the servers he wants to run it on. If
the file he tries to load is 2 Gigs in size then he had better not try to
load it all into an array and pass it to the socket, as that will likely
page the server to death. It might even be possible to have Perl handle
this issue, but I would think that's a bit too much to ask.

I hope that makes my point a bit clearer. I'm sorry if I misquoted or
misunderstood anyone's statements.

Michael Downey