On Sun, Aug 26, 2001 at 04:48:48PM -0700, Ryan Bloom wrote:
>...
> > Greg Stein wrote:
> > > Untrue. Please explain why a PIPE bucket cannot be split at byte 100?
> > > Sure, it doesn't know its length, but it can easily read in 100 bytes,
> > > give you that, and leave itself as the second part of the split.
> OtherBill wrote:
> > I agree here, we can split an unknown length pipe bucket at known point.
> > I'd suggest that we really want a constant, APR_BUCKET_UNKNOWN_LEN (which
> > needs to map to MAX_SIZE_T or MAX_OFF_T, see below). But I'm not certain
> > that the apr_bucket_split_simple() should learn how to do this, sounds like
> > a job for apr_bucket_split_indeterminate() or something.
>
> No, it is not possible to split a bucket with an unknown length. I have
> explained this at great length, many times.

Two things: (1) I don't recall these "corrective explanations", and (2)
explaining multiple times, over and over, doesn't mean it becomes true.

That said... let's return to the problem.

> Bucket split is a non-destructive operation,

Who said? Bucket split means "split this data into two buckets, so that I
can manipulate the two pieces."

> but splitting a bucket without a length changes the make-up of the brigade.
> Instead of having two buckets of the same type, you can end up having
> multiple heap buckets, and one pipe bucket.

Who cares what the bucket types are? The *whole point* of the system is to
isolate the type of the data from the user. If bucket A happens to have a
different type from bucket B, then who cares? Nobody *should* care
whatsoever.

By definition, bucket->split creates two buckets from one. Why does it
matter that the two resulting buckets have different types from the
original? Are you saying that my GREG bucket cannot be split into JACOB and
STEIN buckets? That I *must* split them into GREG and GREG buckets? That if
I do otherwise, I will have to report to the FSF for Anti-Freedom Lashings?
;-)

> Yes, I agree that splitting a PIPE or SOCKET bucket is easy in the simple
> case.
> My problem is the non-trivial case. What do you do when there is an error, or
> when you can't read enough data from the bucket? How many times do you
> try to read?

These are not problems which prevent the definition of splitting an
indeterminate length bucket.
The split operation can return errors. If a problem occurs during the split,
then return an error. Any bucket can generate an error trying to split
itself. Pipes and sockets are not unique in that regard. I can say right now
that a "database record" bucket is going to exist in some third-party
module, some time. You can bet that will return errors from any bucket
function -- that DB connection could drop any time.

That said, what should a bucket do on a split with a short read? It could
return an error, it could return a "warning" with APR_INCOMPLETE, or it
could be totally undefined.

> It is best to leave these decisions to the filter that is doing the
> splitting.

The filter certainly cannot handle these kinds of problems. How could it
possibly know what is happening within the bucket? It shouldn't have to
know. It can try to split it, and it can receive success or an error.
Requiring it to know more than that breaks the abstraction of the buckets.

Let's return to some previous points...

> OtherBill wrote:
> > I agree here, we can split an unknown length pipe bucket at known point.
> > I'd suggest that we really want a constant, APR_BUCKET_UNKNOWN_LEN (which
> > needs to map to MAX_SIZE_T or MAX_OFF_T, see below). But I'm not certain
> > that the apr_bucket_split_simple() should learn how to do this, sounds like
> > a job for apr_bucket_split_indeterminate() or something.

The APR_BUCKET_UNKNOWN_LEN is an excellent idea. Much better than the
free-floating "-1" constants throughout the code. The symbol is much more
descriptive, and it is more resilient to changes in the underlying bucket
system.

Note that split_simple() will probably never be called in the case of an
unknown length. Simple buckets essentially have all their data already.
Pipes and sockets (those with unknown lengths) will generally have their own
split functions. And your idea of split_indeterminate actually works well
here: read N bytes (which creates a HEAP bucket or somesuch), then "split".
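
To make the read-then-split idea concrete, here is a rough sketch in plain
C. It does not use the real APR bucket structures: the toy_* types, the
TOY_* status codes, and TOY_BUCKET_UNKNOWN_LEN are all made up for
illustration, and the "pipe" is simulated by an in-memory string. Reporting
a short read with an INCOMPLETE-style status is just one of the options
mentioned above, picked arbitrarily.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; they only mimic the flavor of APR's. */
enum { TOY_SUCCESS = 0, TOY_INCOMPLETE = 1, TOY_EGENERAL = 2 };

/* Named sentinel for "length not known yet", instead of a bare -1. */
#define TOY_BUCKET_UNKNOWN_LEN ((size_t)-1)

typedef enum { TOY_HEAP, TOY_PIPE } toy_type;

typedef struct toy_bucket {
    toy_type type;
    size_t length;           /* TOY_BUCKET_UNKNOWN_LEN for a pipe */
    char *data;              /* heap buckets: owned copy of the bytes */
    const char *src;         /* pipe buckets: simulated stream */
    size_t src_left;         /* bytes still unread in the simulated stream */
    struct toy_bucket *next; /* next bucket in the toy brigade */
} toy_bucket;

/* Create a pipe bucket over an in-memory "stream" of n bytes. */
toy_bucket *toy_pipe_create(const char *src, size_t n)
{
    toy_bucket *b = calloc(1, sizeof *b);
    if (b == NULL)
        return NULL;
    b->type = TOY_PIPE;
    b->length = TOY_BUCKET_UNKNOWN_LEN;
    b->src = src;
    b->src_left = n;
    return b;
}

/* Split a pipe bucket at 'point': read up to 'point' bytes, turn 'e' into
 * a heap bucket holding them, and leave a fresh pipe bucket after it as
 * the second half. A short read yields TOY_INCOMPLETE and no second half;
 * any other failure returns TOY_EGENERAL and leaves 'e' untouched. */
int toy_split_indeterminate(toy_bucket *e, size_t point)
{
    assert(e != NULL);
    if (e->type != TOY_PIPE)
        return TOY_EGENERAL;

    size_t got = point < e->src_left ? point : e->src_left;
    char *buf = malloc(got ? got : 1);
    toy_bucket *rest = NULL;
    if (buf == NULL)
        return TOY_EGENERAL;

    if (got == point) {
        /* Full read: the rest of the stream stays behind as a pipe. */
        rest = toy_pipe_create(e->src + got, e->src_left - got);
        if (rest == NULL) {
            free(buf);
            return TOY_EGENERAL;
        }
        rest->next = e->next;
    }

    if (got > 0)
        memcpy(buf, e->src, got);
    e->type = TOY_HEAP;
    e->length = got;
    e->data = buf;
    e->src = NULL;
    e->src_left = 0;
    if (rest != NULL)
        e->next = rest;
    return got == point ? TOY_SUCCESS : TOY_INCOMPLETE;
}

/* Free a whole toy brigade, including any heap data it owns. */
void toy_brigade_free(toy_bucket *b)
{
    while (b != NULL) {
        toy_bucket *next = b->next;
        free(b->data);
        free(b);
        b = next;
    }
}
```

The caller only ever sees a status code and two buckets. It never needs to
know that the first half came back as a heap bucket while the second half is
still a pipe.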

This falls back to a previous point. Note that read() changes the makeup of
a brigade. If we were so concerned about brigade makeup and stability, then
I'd be *much* more surprised to find that a read() changed the brigade, than
finding that a split() did.

If anybody wants to complain about split() changing bucket types, then they
probably ought to start with complaining about how read() can change types
of buckets.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
