Re: Tomcat 10.1.x: Using CoyoteInputStream to read a Chunked Transfer Encoding (CTE) stream, manually, skipping ChunkedInputFilter

Daniel Andres Pelaez Lopez Tue, 27 Jun 2023 12:41:01 -0700

On Tue, 27 Jun 2023 at 13:48, Christopher Schultz (<[email protected]>) wrote:


> Daniel,
>
> On 6/27/23 12:56, Daniel Andres Pelaez Lopez wrote:
> > Christopher,
> >
> > On Tue, 27 Jun 2023 at 9:33, Christopher Schultz (<[email protected]>) wrote:
> >
> >> Daniel,
> >>
> >> On 6/26/23 16:15, Daniel Andres Pelaez Lopez wrote:
> >>> On Mon, 26 Jun 2023 at 14:53, Mark Thomas (<[email protected]>) wrote:
> >>>
> >>>> On 26/06/2023 20:34, Christopher Schultz wrote:
> >>>>> Daniel,
> >>>>>
> >>>>> On 6/26/23 12:47, Daniel Andres Pelaez Lopez wrote:
> >>>>>> Hi Tomcat community,
> >>>>>>
> >>>>>> I have a requirement where we want to manually decode a Chunked
> >>>>>> Transfer Encoding (CTE) stream using CoyoteInputStream to have
> >>>>>> access to the chunk size. This means I want to use the
> >>>>>> CoyoteInputStream.read method and get the whole CTE bytes. Put
> >>>>>> another way: we want to decode the CTE by hand, skipping the
> >>>>>> Tomcat defaults.
> >>>>>
> >>>>> Dumb question: why?
> >>>>
> >>>> Not a dumb question at all. It is the key question. I'm curious as to
> >>>> what the answer is.
> >>>>
> >>>> Mark
> >>>>
> >>>>
> >>> Not a dumb question at all. Let me expand on the use case: we are
> >>> working on an HTTP origin (Tomcat) for video streaming using HLS and
> >>> DASH. Our video packager generates video segments of X size, and each
> >>> segment is also divided into fragments (CMAF). The segment size is
> >>> fixed, but the fragment size is variable. Our packager transfers the
> >>> segment while it generates it, a fragment at a time (a chunk), in a
> >>> CTE, to the HTTP origin (Tomcat). Now, video players want to download
> >>> the segment, but per the HLS spec, we are required to transfer the
> >>> segment to the video player as we received it, a fragment at a time.
> >>> To be able to send a fragment at a time, we need to know its size,
> >>> which is implicit inside the CTE (each chunk declares the chunk size).
> >>>
> >>> Our current implementation sends the segment using CTE to the video
> >>> players, but we cannot guarantee we are sending a fragment by chunk.
> >>>
> >>> This is why having access to each chunk and its size will help us.
> >>
> >> Thanks for the details. I think I've got it, but I want to clarify a
> >> little bit.
> >>
> >> Is your video-chunk-generator producing anything HTTP-related? It almost
> >> sounds like Tomcat is a reverse-proxy and your video-generator is the
> >> origin. Maybe you are just generating byte[] from the video-generator?
> >>
> >> Or maybe your video-generator is UPLOADING the chunks to the HTTP
> >> server? It's not entirely clear to me, and the details matter.
> >>
> >
> > Thanks for staying in the conversation.
> >
> > The packager (video-chunk-generator) sends an HTTP PUT with
> > Transfer-Encoding: chunked header, the content is a video segment, where
> > each chunk is a fragment, so, yes, the video-chunk-generator uploads the
> > segment in chunks to the Tomcat server (origin)
> >
> > Sorry for the confusion regarding the word "origin", that is a video
> > streaming term that doesn't matter for the question.
>
> Yes, that's important information to have: in HTTPD, the "origin" is the
> web server which actually has the desired resource. Contrast that with a
> reverse proxy, etc.
>
> >> It sounds like you are trying to optimize things such that video-chunk
> >> size ends up being equal to the HTTP-chunk size. Is that the real goal?
> >>
> >
> > The video-chunk-generator does it for us: it sends each video fragment as
> > an HTTP chunk. What we want to optimize is not the transfer from the
> > video-chunk-generator to the server, but from the server to its clients.
> > Clients will do an HTTP GET against the server to grab the segment, and
> > that GET is what we want to optimize so that we keep the fragment-by-chunk
> > strategy, using Transfer-Encoding: chunked. This is why we want to access
> > each chunk size when the video-chunk-generator does the PUT and save that
> > info in the server: we can use it when clients do a GET, to ensure we
> > transfer the segment the same way we received it.
>
> Is there no way to observe the video-chunk-size by looking at the raw
> bytes of the video file itself? Take the MP3 audio format, with which
> I'm more familiar. MP3 frame lengths can be computed based upon some
> information at the start of each frame including the version number, bit
> rate, sample rate, etc. So by reading a few bytes into the file, you
> know how big each chunk would need to be. Then you can push the bytes
> and go to the next chunk, etc.
>
> If you can do that with your files, there is no reason to record the
> chunk-sizes that you got at the time of upload unless you just want the
> download to be as absolutely screaming-fast as possible and you don't
> want to perform any mathematical operations at all during the download
> (though you will presumably have to read a file from storage, which has
> a much higher cost than a little bit of math IMHO).
>

You are right, the CMAF format of the segment might carry the fragment size
information, but as you state, we would need to parse the segment as it is
being uploaded to figure out the fragment size. That's an option on the
table, but being fast is also important here, as we are creating low
latency streams (under 3 seconds glass to glass). It seems easier to just
read the chunk size from the CTE, as this DASH server example shows:
https://gitlab.com/fflabs/dash_server/-/blob/master/dash_server.py#L62
That's a DASH server in Python with pretty low-level network access.
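
To make "read the chunk size from the CTE" concrete, here is a minimal
sketch in Java of what that parsing could look like. It assumes we somehow
get the raw, still-encoded request body as an InputStream, which is exactly
what Tomcat does not hand us by default because ChunkedInputFilter decodes
it first. The class and method names (ChunkSizeReader, readLine, chunkSizes)
are made up for illustration; trailers after the last chunk are ignored.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tomcat: walks a raw chunked body and
// records the declared size of each chunk.
final class ChunkSizeReader {

    /** Reads one CRLF-terminated line (without the CRLF) from the raw stream. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                if (in.read() != '\n') {
                    throw new IOException("CR not followed by LF");
                }
                return sb.toString();
            }
            sb.append((char) b);
        }
        throw new IOException("Unexpected end of stream");
    }

    /** Returns the size of every chunk in a raw chunked body, in order. */
    static List<Integer> chunkSizes(InputStream raw) throws IOException {
        List<Integer> sizes = new ArrayList<>();
        while (true) {
            String line = readLine(raw);
            int semi = line.indexOf(';'); // drop any chunk extensions
            String hex = (semi >= 0 ? line.substring(0, semi) : line).trim();
            int size = Integer.parseInt(hex, 16); // chunk size is hexadecimal
            if (size == 0) {
                break; // last-chunk marker; trailers are not parsed here
            }
            sizes.add(size);
            byte[] data = raw.readNBytes(size); // the chunk payload itself
            if (data.length != size) {
                throw new IOException("Truncated chunk");
            }
            if (!readLine(raw).isEmpty()) {
                throw new IOException("Missing CRLF after chunk data");
            }
        }
        return sizes;
    }
}
```

In a real upload handler the payload would be stored rather than discarded,
and the list of sizes saved next to it for use on the GET side.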


>
> Let's assume you CAN determine chunk-size from your source file. You can
> get Tomcat to chunk your file the same way just like this:
>
> public void doGet(HttpServletRequest request, HttpServletResponse
> response) throws IOException {
>
>    response.setHeader("Transfer-Encoding", "chunked");
>    response.setBufferSize(MAXIMUM_VIDEO_FRAME_SIZE); // This is important
>
>    InputStream video = ...; // You figure this out
>    OutputStream out = response.getOutputStream();
>
>    boolean eof = false;
>
>    byte[] buffer = new byte[1024]; // Or something appropriate
>
>    while(!eof) {
>      int c = video.read(buffer);
>
>      if(-1 == c) {
>        eof = true;
>      } else {
>        // getChunkSize is a placeholder: compute the video-chunk length
>        // from the header bytes just read
>        int chunkSize = getChunkSize(buffer);
>
>        chunkSize -= c; // We have already read c bytes from video
>
>        out.write(buffer, 0, c);
>
>        // Copy the rest of this video-chunk
>        for(int i=0; i<chunkSize && !eof; ++i) { // TODO: Optimize this copy operation
>          int b = video.read();
>          if(-1 == b) {
>            eof = true;
>          } else {
>            out.write(b);
>          }
>        }
>
>        out.flush(); // This triggers Tomcat to generate a chunked
>                     // response
>      }
>    }
> }
>
> There are lots of ways the above code can fail, etc. and so it needs to
> be much more robust, but I just wanted you to get the general idea.
>
> There are two very important things in the code:
>
> 1. The line which sets the output buffer size. If you use the default
> buffer size, Tomcat may (okay, WILL) "chunk" the response in the middle
> of your video-chunk if a video-chunk can get bigger than the current
> buffer size. So you need to make sure that doesn't happen.
>
> Or, maybe it's okay if that happens, but you want to minimize the number
> of times that happens or you waste bytes, cycles, etc.
>

This is great info, I didn't know that. As we would like to transfer full
fragments, we might need to increase the buffer size above the max; I have
seen 20 kb fragments.


>
> 2. You must call ServletOutputStream.flush, which is how Tomcat knows to
> actually chunk the response.
>

Yes, we are doing that today.


>
> >> In that case, you want to force the chunk size to something specific,
> >> rather than just trying to see what the chunk size is.
> >>
> >> How you do that depends on whether your video-generator is sending data
> >> in the *request entity* in e.g. PUT or POST or if you are fetching the
> >> data in a *response entity*.
> >>
> >> I *think* you want to inspect chunk-size of an upload-to-Tomcat, but I
> >> want to be sure. Might this be easier to do on the client to force a
> >> certain chunk-size?
> >
> > You are right, we want to inspect the chunk-size of an upload to Tomcat.
> > We have no control over the video-chunk-generator, so, the only way to
> > know the fragment/chunk size they are sending is by inspecting the CTE.
>
> The only way to know the chunk size THEY are sending is to inspect and
> record it. But you don't really care what they send; instead you care
> what chunk-size to use for your Tomcat responses. They *should* be the
> same thing, but I wanted to re-frame (hah!) the problem to be more
> accurate, because I think you are trying to solve problem X (how to
> observe inbound chunk size) when you really want to solve problem Y
> (optimize outbound chunk size).


> >> Finally... for video, perhaps a Websocket connection would be better
> >> since there is less protocol-overhead once the ws connection is
> >> established?
> >
> > True, but the video-chunk-generator only offers two ways of transfer:
> > HTTP PUT or writing to disk. The second option was discarded as we will
> > need to listen to file system events and do some magic there, which we
> > don't need to do for the HTTP PUT, as the protocol/Tomcat guarantee when
> > the transfer starts and ends.
>
> Sounds good to me. Plus, if you use HTTP then you can de-couple the
> services easily at any time.
>
> >>>>>> The current flow from the point of view of CoyoteInputStream is:
> >>>>>> CoyoteInputStream.read -> Request.read -> ChunkedInputFilter.read.
> >>>>>>
> >>>>>> ChunkedInputFilter handles the CTE decoding and the read method only
> >>>>>> returns the chunks, with no other information, like chunk size.
> >>>>>>
> >>>>>> I found that the method Request.setInputBuffer might allow setting a
> >>>>>> different InputBuffer implementation, for instance, the
> >>>>>> IdentityInputFilter, which I understand returns all the stream
> >>>>>> bytes, with no decoding. However, I am not sure if this is the right
> >>>>>> way or which consequences it might have.
> >>>>>>
> >>>>>> I would like to know if there are other ways to override the CTE
> >>>>>> behavior, any help would be appreciated.
> >>>>>
> >>>>> A problem I can see is that you are working with a blocking streaming
> >>>>> interface e.g. read(byte[]) and you also want to get the chunk size.
> >>>>> When? The chunk-size can change for every chunk, so if you call
> >>>>> getChunkSize() before the read() and after the read(), they may be
> >>>>> different if the read() returns data from multiple chunks. It may
> >>>>> have changed multiple times between when read() was called and when
> >>>>> it completed.
> >>>>>
> >>>>> If you want to always size the byte[] to read full-chunks at once ...
> >>>>> I guess I would again ask "why?"
> >>>>>
> >>>>> Would it be sufficient for ChunkedInputFilter to maybe send an
> >>>>> event-notification each time a chunk boundary was crossed? For
> >>>>> example:
> >>>>>
> >>>>> public interface ChunkListener {
> >>>>>      public void chunkStarted(ChunkedInputFilter source, long offset,
> >>>>>              long length);
> >>>>>      public void chunkFinished(ChunkedInputFilter source, long offset,
> >>>>>              long length);
> >>>>> }
> >>>>>
> >>>>> Then, every time the Filter begins or ends a chunk it could notify
> >>>>> your code and you can do whatever you want with that information.
> >>>>>
> >>>>> You might be able to subclass the (somewhat confusingly-named)
> >>>>> ChunkedInputFilter and bolt-on your own logic like what I have above.
> >>>>>
> >>>
> >>>
> >>> Yes, a listener like that looks great. Any more clues on how to inject
> >>> my own ChunkedInputFilter implementation in the Tomcat configuration?
> >>> It seems quite hard to do well. Also, the listener must be registered
> >>> per HTTP request.
> >>
> >> I think doing so would require some internal support for messing-around
> >> with the chain of objects that handle the requests. I don't think you
> >> can do this "on your own". One option would be for us to add the ability
> >> to register a "ChunkListener" with the ChunkedInputFilter but honestly
> >> this is a pretty odd use-case and having that code running on every
> >> server worldwide seems like a waste. The other option would be to allow
> >> you to specify your own ChunkedInputFilter class at some point during
> >> server initialization, which seems like a much better option.
> >>
> >
> > I totally agree Tomcat shouldn't add anything specific regarding this
> > uncommon use case, I am happy having a workaround. Specifying my own
> > ChunkedInputFilter seems the way to go. I have access to the Request
> > object (which Spring Boot can inject), so using Request.setInputBuffer
> > should be enough? I am a little concerned about playing with Tomcat
> > defaults, but there are not many options on my plate.
>
> One more frame-challenge (a bit of an intentional joke, there) for you:
> why bother "optimizing" the HTTP chunk-size? Most networking components
> and software work with buffers of fixed sizes and end up naturally
> filling and emptying those buffers on a schedule that is pretty regular.
> By introducing an artificial "chunk size" which likely doesn't match any
> of those, you are definitely making things more complicated... but is it
> actually *improving* anything?
>
> If you have a 1MB video (small, I know) and it's video-chunked into
> segments of weird sizes like 1243, 6873, 2341, 7654, and 8790 bytes,
> does it matter to the client/recipient if they get HTTP-chunks of those
> exact same sizes or if they get HTTP-chunks which are all, say, 4096
> bytes in size (except the final chunk, which will be short)?
>
> Most media-players download several frames in advance of actually
> starting playing and continue to buffer throughout the playback.
> Additionally, any decent player will not just do something naive like this:
>
> HTTP GET /movies/guardians_of_the_galaxy.h264
>
> And download the entire file. Instead, the player will most likely make
> a range-request like this:
>
> HTTP GET /movies/guardians_of_the_galaxy.h264
> Range: bytes=0-1023
>
> Then the server sends the first 1k of data and the client decides what
> to do, next. The client makes many of these requests as playback
> continues. This allows the user to pause, scrub-around the timeline,
> rewind, etc. without ever downloading the entire file each time.
>
> I'm making a lot of assumptions about your usage of this service, but I
> think you may be trying to solve a problem that doesn't need to be
> solved... at least not the way you think it needs to be solved.
>

Streaming video is hard, and harder still at low glass-to-glass latency, so
optimizations in how the video is transferred seem important. For
instance, the HLS spec describes how those fragments/byte ranges should be
returned
https://datatracker.ietf.org/doc/html/draft-pantos-hls-rfc8216bis#section-6.2.6
(partial segments = fragments):

   When processing requests for a URI or a byte range of a URI that
   includes one or more Partial Segments that are not yet completely
   available to be sent - such as requests made in response to an EXT-X-
   PRELOAD-HINT tag - the server MUST refrain from transmitting any
   bytes belonging to a Partial Segment until all bytes of that Partial
   Segment can be transmitted at the full speed of the link to the
   client.  If the requested range includes more than one Partial
   Segment then the server MUST enforce this delivery guarantee for each
   Partial Segment in turn.  This enables the client to perform accurate
   Adaptive Bit Rate (ABR) measurements

Our understanding of that statement is that we must have the whole
chunk/fragment/partial segment ready before transmitting it through the
network, as a chunk.
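
As a rough sketch of that reading of the spec (not a definitive
implementation), the GET side could replay a stored segment using the
fragment sizes recorded during the PUT. Each fragment is read completely
before any of its bytes are written, and then flushed so that it goes out
as one unit. FragmentReplayer and replay are hypothetical names; in a
servlet, `out` would be the response's ServletOutputStream, with the
response buffer size set at least as large as the biggest fragment, as
Christopher pointed out above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical helper: replays a stored segment one whole fragment at a time.
final class FragmentReplayer {

    /**
     * Copies `stored` to `out` fragment by fragment. A fragment is only
     * written once all of its bytes have been read, and is flushed right away.
     */
    static void replay(InputStream stored, List<Integer> fragmentSizes, OutputStream out)
            throws IOException {
        for (int size : fragmentSizes) {
            // readNBytes blocks until `size` bytes are read or the stream ends
            byte[] fragment = stored.readNBytes(size);
            if (fragment.length != size) {
                throw new IOException("Stored segment ended inside a fragment");
            }
            out.write(fragment);
            // On a servlet response stream, flush() is the point where the
            // buffered bytes are sent on to the client
            out.flush();
        }
    }
}
```

Reading the whole fragment into memory first is a deliberate choice here:
with fragments in the tens of kilobytes the cost is small, and it keeps a
slow source from stalling the client in the middle of a fragment.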

Regarding using org.apache.coyote.Request.setInputBuffer as a workaround,
it seems we don't have access to org.apache.coyote.Request directly. We
have access to org.apache.catalina.connector.RequestFacade, which doesn't
offer any way to reach the underlying
org.apache.catalina.connector.Request, and therefore
org.apache.coyote.Request. Is there any way to get access to
org.apache.coyote.Request?


> -chris
>

-- 
Daniel Andrés Pelaez López
Master’s Degree in IT Architectures, Universidad de los Andes.
Software Construction Specialist, Universidad de los Andes.
Bachelor's Degree in Computer Sciences, Universidad del Quindio.
e. [email protected]
