On 03/07/2013 23:26, Jim Schueler wrote:
>
>> Second, if there's no Content-Length header then how does one know
>> how much data to read using $r->read?
>>
>> One answer is until $r->read returns zero bytes, of course. But, is
>> that guaranteed to always be the case, even for, say, pipelined
>> requests? My guess is yes because whatever is de-chunking the
>
> read() is blocking. So it never returns 0, even in a pipeline request
> (if no data is available, it simply waits). I don't wish to discuss the
> merits here, but there is no technical imperative for a content-length
> request in the request header.
>
> -Jim

Probably. If you, for some reason, were doing the chunking work yourself,
each chunk says how many bytes are in it (or in the next one perhaps; I
forget offhand), so you'd know what size read to do.

> On Wed, 3 Jul 2013, Bill Moseley wrote:
>
>> Hi Jim,
>> This is the Transfer-Encoding: chunked I was writing about:
>>
>> http://tools.ietf.org/html/rfc2616#section-3.6.1
>>
>> On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler <jschue...@eloquency.com>
>> wrote:
>> I played around with chunking recently in the context of media
>> streaming: The client is only requesting a "chunk" of data.
>> "Chunking" is how media players perform a "seek". It was originally
>> implemented for FTP transfers: e.g., to transfer a large file in (say
>> 10K) chunks. In the case that you describe below, if no
>> Content-Length is specified, that indicates "send the remainder".
>>
>> From what I know, a "chunk" request header is used this way to
>> specify the server response. It does not reflect anything about the
>> data included in the body of the request. So first, I would ask if
>> you're confused about this request information.
>>
>> Hypothetically, some browsers might try to upload large files in
>> small chunks and the "chunk" header might reflect a push transfer. I
>> don't know if "chunk" is ever used for this purpose. But it would
>> require the following characteristics:
>>
>> 1. The browser would need to originally inquire if the server is
>>    capable of this type of request.
>> 2. Each chunk of data will arrive in a separate and independent HTTP
>>    request. Not necessarily in the order they were sent.
>> 3. Two or more requests may be handled by separate processes
>>    simultaneously that can't be written into a single destination.
>> 4. Somehow the server needs to request a resend if a chunk is
>>    missing.
>>    Solving this problem requires an imaginative use of HTTP.
>>
>> Sounds messy. But might be appropriate for 100M+ sized uploads.
>> This *may* reflect your situation. Can you please confirm?
>>
>> For a single process, the incoming content-length is unnecessary.
>> Buffered I/O automatically knows when transmission is complete. The
>> read() argument is the buffer size, not the content length. Whether
>> you spool the buffer to disk or simply enlarge the buffer should be
>> determined by your hardware capabilities. This is standard IO
>> behavior that has nothing to do with HTTP chunk. Without a
>> "Content-Length" header, after looping your read() operation,
>> determine the length of the aggregate data and pass that to Catalyst.
>>
>> But if you're confident that the complete request spans several
>> smaller (chunked) HTTP requests, you'll need to address all the
>> problems I've described above, plus the problem of re-assembling the
>> whole thing for Catalyst. I don't know anything about Plack, maybe it
>> can perform all this required magic.
>>
>> Otherwise, if the whole purpose of the Plack temporary file is to
>> pass a file handle, you can pass a buffer as a file handle. Used to
>> be IO::String, but now that functionality is built into the core.
>>
>> By your last paragraph, I'm really lost. Since you're already passing
>> the request as a file handle, I'm guessing that Catalyst creates the
>> temporary file for the *response* body. Can you please clarify? Also,
>> what do you mean by "de-chunking"? Is that the same thing as
>> re-assembling?
>>
>> Wish I could give a better answer. Let me know if this helps.
>>
>> -Jim
>>
>> On Tue, 2 Jul 2013, Bill Moseley wrote:
>>
>> For requests that are chunked (Transfer-Encoding: chunked and no
>> Content-Length header) calling $r->read returns unchunked data from
>> the socket.
>>
>> That's indeed handy. Is that mod_perl doing that un-chunking or is it
>> Apache?
>>
>> But, it leads to some questions.
>>
>> First, if $r->read reads unchunked data then why is there a
>> Transfer-Encoding header saying that the content is chunked? Shouldn't
>> that header be removed? How does one know if the content is chunked or
>> not, otherwise?
>>
>> Second, if there's no Content-Length header then how does one know how
>> much data to read using $r->read?
>>
>> One answer is until $r->read returns zero bytes, of course. But, is
>> that guaranteed to always be the case, even for, say, pipelined
>> requests? My guess is yes because whatever is de-chunking the request
>> knows to stop after reading the last chunk, trailer and empty line.
>> Can anyone elaborate on how Apache/mod_perl is doing this?
>>
>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>
>> I'm using Catalyst and Catalyst needs a Content-Length. So, I have a
>> Plack Middleware component that creates a temporary file writing the
>> buffer from $r->read( my $buffer, 64 * 1024 ) until that returns zero
>> bytes. I pass this file handle onto Catalyst.
>>
>> Then, for some content-types, Catalyst (via HTTP::Body) writes the body
>> to another temp file. I don't know how Apache/mod_perl does its
>> de-chunking, but I can call $r->read with a huge buffer length and
>> Apache returns that. So, maybe Apache is buffering to disk, too.
>>
>> In other words, for each tiny chunked JSON POST or PUT I'm creating two
>> (or three?) temp files which doesn't seem ideal.
>>
>> --
>> Bill Moseley
>> mose...@hank.org
>>
>> --
>> Bill Moseley
>> mose...@hank.org
>>
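
For reference, here is a minimal sketch of the chunk framing from RFC 2616
section 3.6.1 that this thread is about. It is plain Python, not mod_perl,
and it does not claim to show how Apache does its de-chunking;
read_chunked_body and the sample wire bytes are made up for illustration.
Each chunk is preceded by its own size in hex. The decoder stops after the
zero-size chunk, any trailer and the blank line, so a pipelined request
behind the body stays unread and the body length is known once it returns.

```python
import io


def read_chunked_body(stream):
    """Decode one chunked message body from a binary stream.

    Stops after the last chunk, any trailer lines, and the blank line,
    leaving whatever follows (e.g. a pipelined request) unread.
    """
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the last chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("stream ended inside a chunk")
        body += data
        stream.readline()  # consume the CRLF that terminates the chunk data
    # Skip optional trailer headers up to and including the blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


# Hypothetical wire data: one chunked JSON body, then a pipelined request.
wire = io.BytesIO(
    b"7\r\n"
    b'{"a":1,\r\n'
    b"6\r\n"
    b'"b":2}\r\n'
    b"0\r\n"
    b"\r\n"
    b"GET /next HTTP/1.1\r\n"
)
body = read_chunked_body(wire)
content_length = len(body)  # length to pass on once the body is complete
rest = wire.read()          # the pipelined request is still in the stream
```

Here content_length comes out as 13, which is the value one could hand on
in place of a missing Content-Length, and rest still holds the next request.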