Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-26 Thread Joel Cunningham
Attached is a git format-patch file



0001-HTTP-improve-performance-by-reducing-forward-seeks.patch
Description: Binary data

> On Jan 26, 2017, at 6:52 PM, Michael Niedermayer  wrote:
> 
> On Thu, Jan 26, 2017 at 10:44:33AM -0600, Joel Cunningham wrote:
>> Michael,
>> 
>> Thanks for the review and testing!  New patch included, see comments below
>> 
>>> 
>>> this segfaults with many fuzzed files
>>> backtrace looks like this:
>>> 
>>> #0  0x7fffb368 in ?? ()
>>> #1  0x005f9280 in avio_seek (s=0x7fffb278, offset=31, whence=0) 
>>> at libavformat/aviobuf.c:259
>>> 
>> 
>> I found I wasn’t setting s->short_seek_get to NULL in ffio_init_context() 
>> and was relying on the zeroing aspect of av_mallocz() in 
>> avio_alloc_context().  From a grep of the source, it looks like 
>> fifo_init_context() can be called from other functions without having zero’d 
>> the AVIOContext first.
>> 
>> Is the fuzz test exercising the AVIOContext in this manner?  I’d also like 
>> to reproduce this test if there are instructions
>> 
 
 diff --git a/libavformat/avio.c b/libavformat/avio.c
 index 3606eb0..ecf6801 100644
 --- a/libavformat/avio.c
 +++ b/libavformat/avio.c
 @@ -645,6 +645,13 @@ int ffurl_get_multi_file_handle(URLContext *h, int 
 **handles, int *numhandles)
return h->prot->url_get_multi_file_handle(h, handles, numhandles);
 }
 
 +int ffurl_get_short_seek(URLContext *h)
 +{
 +if (!h->prot->url_get_short_seek)
>>> 
 +return -1;
>>> 
>>> should be some AVERROR code
>> 
>> Fixed
>> 
 +return h->prot->url_get_short_seek(h);
 +}
 +
 int ffurl_shutdown(URLContext *h, int flags)
 {
if (!h->prot->url_shutdown)
 diff --git a/libavformat/avio.h b/libavformat/avio.h
 index b1ce1d1..0480981 100644
 --- a/libavformat/avio.h
 +++ b/libavformat/avio.h
 @@ -287,6 +287,12 @@ typedef struct AVIOContext {
int short_seek_threshold;
 
/**
 + * A callback that is used instead of short_seek_threshold.
 + * This is current internal only, do not use from outside.
 + */
 +int (*short_seek_get)(void *opaque);
 +
 +/**
 * ',' separated list of allowed protocols.
 */
const char *protocol_whitelist;
>>> 
>>> thats not ok to add this way, the docs say this:
>>> /**
>>> * Bytestream IO Context.
>>> * New fields can be added to the end with minor version bumps.
>>> * Removal, reordering and changes to existing fields require a major
>>> * version bump.
>>> * sizeof(AVIOContext) must not be used outside libav*.
>>> *
>>> * @note None of the function pointers in AVIOContext should be called
>>> *   directly, they should only be set by the client application
>>> *   when implementing custom I/O. Normally these are set to the
>>> *   function pointers specified in avio_alloc_context()
>>> */
>>> 
>> 
>> I moved short_seek_get to the end of AVIOContext.  I’m not sure if the minor 
>> version bump referenced in the comment requires any in-code changes.  Is 
>> this an acceptable magnitude of change for this feature?
>> 
>>> [...]
 --- a/libavformat/tcp.c
 +++ b/libavformat/tcp.c
 @@ -265,6 +265,26 @@ static int tcp_get_file_handle(URLContext *h)
return s->fd;
 }
 
 +static int tcp_get_window_size(URLContext *h)
 +{
 +TCPContext *s = h->priv_data;
 +int avail;
 +int avail_len = sizeof(avail);
 +
>>> 
 +#if HAVE_WINSOCK2_H
>>> 
>>> this formating is IIRC not allowed in C the # must be in the first
>>> column
>>> 
>>> 
 +/* SO_RCVBUF with winsock only reports the actual TCP window size when
 +auto-tuning has been disabled via setting SO_RCVBUF */
 +if (s->recv_buffer_size < 0) {
 +return AVERROR(ENOSYS);
 +}
 +#endif
 +
 +if (getsockopt(s->fd, SOL_SOCKET, SO_RCVBUF, , _len)) {
 +   return ff_neterrno();
 +}
>>> 
>>> the indention is inconsisntent
>> 
>> Both formatting issues have been corrected
>> 
>> From bac0f351afd14c56c56376d20557b7330bcea23e Mon Sep 17 00:00:00 2001
>> From: Joel Cunningham 
>> Date: Fri, 13 Jan 2017 10:52:25 -0600
>> Subject: [PATCH] HTTP: improve performance by reducing forward seeks
> 
> 
> please attach a proper git format-patch or use git send-email
> 
> applying this is not as smooth and easy as it should be
> 
> [...]
> -- 
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> If you fake or manipulate statistics in a paper in physics you will never
> get a job again.
> If you fake or manipulate statistics in a paper in medicin you will get
> a job for life at the pharma industry.
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-26 Thread Michael Niedermayer
On Thu, Jan 26, 2017 at 10:44:33AM -0600, Joel Cunningham wrote:
> Michael,
> 
> Thanks for the review and testing!  New patch included, see comments below
> 
> > 
> > this segfaults with many fuzzed files
> > backtrace looks like this:
> > 
> > #0  0x7fffb368 in ?? ()
> > #1  0x005f9280 in avio_seek (s=0x7fffb278, offset=31, whence=0) 
> > at libavformat/aviobuf.c:259
> > 
> 
> I found I wasn’t setting s->short_seek_get to NULL in ffio_init_context() and 
> was relying on the zeroing aspect of av_mallocz() in avio_alloc_context().  
> From a grep of the source, it looks like fifo_init_context() can be called 
> from other functions without having zero’d the AVIOContext first.
> 
> Is the fuzz test exercising the AVIOContext in this manner?  I’d also like to 
> reproduce this test if there are instructions
> 
> >> 
> >> diff --git a/libavformat/avio.c b/libavformat/avio.c
> >> index 3606eb0..ecf6801 100644
> >> --- a/libavformat/avio.c
> >> +++ b/libavformat/avio.c
> >> @@ -645,6 +645,13 @@ int ffurl_get_multi_file_handle(URLContext *h, int 
> >> **handles, int *numhandles)
> >> return h->prot->url_get_multi_file_handle(h, handles, numhandles);
> >> }
> >> 
> >> +int ffurl_get_short_seek(URLContext *h)
> >> +{
> >> +if (!h->prot->url_get_short_seek)
> > 
> >> +return -1;
> > 
> > should be some AVERROR code
> 
> Fixed
> 
> >> +return h->prot->url_get_short_seek(h);
> >> +}
> >> +
> >> int ffurl_shutdown(URLContext *h, int flags)
> >> {
> >> if (!h->prot->url_shutdown)
> >> diff --git a/libavformat/avio.h b/libavformat/avio.h
> >> index b1ce1d1..0480981 100644
> >> --- a/libavformat/avio.h
> >> +++ b/libavformat/avio.h
> >> @@ -287,6 +287,12 @@ typedef struct AVIOContext {
> >> int short_seek_threshold;
> >> 
> >> /**
> >> + * A callback that is used instead of short_seek_threshold.
> >> + * This is current internal only, do not use from outside.
> >> + */
> >> +int (*short_seek_get)(void *opaque);
> >> +
> >> +/**
> >>  * ',' separated list of allowed protocols.
> >>  */
> >> const char *protocol_whitelist;
> > 
> > thats not ok to add this way, the docs say this:
> > /**
> > * Bytestream IO Context.
> > * New fields can be added to the end with minor version bumps.
> > * Removal, reordering and changes to existing fields require a major
> > * version bump.
> > * sizeof(AVIOContext) must not be used outside libav*.
> > *
> > * @note None of the function pointers in AVIOContext should be called
> > *   directly, they should only be set by the client application
> > *   when implementing custom I/O. Normally these are set to the
> > *   function pointers specified in avio_alloc_context()
> > */
> > 
> 
> I moved short_seek_get to the end of AVIOContext.  I’m not sure if the minor 
> version bump referenced in the comment requires any in-code changes.  Is this 
> an acceptable magnitude of change for this feature?
> 
> > [...]
> >> --- a/libavformat/tcp.c
> >> +++ b/libavformat/tcp.c
> >> @@ -265,6 +265,26 @@ static int tcp_get_file_handle(URLContext *h)
> >> return s->fd;
> >> }
> >> 
> >> +static int tcp_get_window_size(URLContext *h)
> >> +{
> >> +TCPContext *s = h->priv_data;
> >> +int avail;
> >> +int avail_len = sizeof(avail);
> >> +
> > 
> >> +#if HAVE_WINSOCK2_H
> > 
> > this formating is IIRC not allowed in C the # must be in the first
> > column
> > 
> > 
> >> +/* SO_RCVBUF with winsock only reports the actual TCP window size when
> >> +auto-tuning has been disabled via setting SO_RCVBUF */
> >> +if (s->recv_buffer_size < 0) {
> >> +return AVERROR(ENOSYS);
> >> +}
> >> +#endif
> >> +
> >> +if (getsockopt(s->fd, SOL_SOCKET, SO_RCVBUF, , _len)) {
> >> +   return ff_neterrno();
> >> +}
> > 
> > the indention is inconsisntent
> 
> Both formatting issues have been corrected
> 
> From bac0f351afd14c56c56376d20557b7330bcea23e Mon Sep 17 00:00:00 2001
> From: Joel Cunningham 
> Date: Fri, 13 Jan 2017 10:52:25 -0600
> Subject: [PATCH] HTTP: improve performance by reducing forward seeks


please attach a proper git format-patch or use git send-email

applying this is not as smooth and easy as it should be

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicin you will get
a job for life at the pharma industry.


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-26 Thread Joel Cunningham
Michael,

Thanks for the review and testing!  New patch included, see comments below

> 
> this segfaults with many fuzzed files
> backtrace looks like this:
> 
> #0  0x7fffb368 in ?? ()
> #1  0x005f9280 in avio_seek (s=0x7fffb278, offset=31, whence=0) 
> at libavformat/aviobuf.c:259
> 

I found I wasn’t setting s->short_seek_get to NULL in ffio_init_context() and 
was relying on the zeroing aspect of av_mallocz() in avio_alloc_context().  
From a grep of the source, it looks like fifo_init_context() can be called from 
other functions without having zero’d the AVIOContext first.

Is the fuzz test exercising the AVIOContext in this manner?  I’d also like to 
reproduce this test if there are instructions

>> 
>> diff --git a/libavformat/avio.c b/libavformat/avio.c
>> index 3606eb0..ecf6801 100644
>> --- a/libavformat/avio.c
>> +++ b/libavformat/avio.c
>> @@ -645,6 +645,13 @@ int ffurl_get_multi_file_handle(URLContext *h, int 
>> **handles, int *numhandles)
>> return h->prot->url_get_multi_file_handle(h, handles, numhandles);
>> }
>> 
>> +int ffurl_get_short_seek(URLContext *h)
>> +{
>> +if (!h->prot->url_get_short_seek)
> 
>> +return -1;
> 
> should be some AVERROR code

Fixed

>> +return h->prot->url_get_short_seek(h);
>> +}
>> +
>> int ffurl_shutdown(URLContext *h, int flags)
>> {
>> if (!h->prot->url_shutdown)
>> diff --git a/libavformat/avio.h b/libavformat/avio.h
>> index b1ce1d1..0480981 100644
>> --- a/libavformat/avio.h
>> +++ b/libavformat/avio.h
>> @@ -287,6 +287,12 @@ typedef struct AVIOContext {
>> int short_seek_threshold;
>> 
>> /**
>> + * A callback that is used instead of short_seek_threshold.
>> + * This is current internal only, do not use from outside.
>> + */
>> +int (*short_seek_get)(void *opaque);
>> +
>> +/**
>>  * ',' separated list of allowed protocols.
>>  */
>> const char *protocol_whitelist;
> 
> thats not ok to add this way, the docs say this:
> /**
> * Bytestream IO Context.
> * New fields can be added to the end with minor version bumps.
> * Removal, reordering and changes to existing fields require a major
> * version bump.
> * sizeof(AVIOContext) must not be used outside libav*.
> *
> * @note None of the function pointers in AVIOContext should be called
> *   directly, they should only be set by the client application
> *   when implementing custom I/O. Normally these are set to the
> *   function pointers specified in avio_alloc_context()
> */
> 

I moved short_seek_get to the end of AVIOContext.  I’m not sure if the minor 
version bump referenced in the comment requires any in-code changes.  Is this 
an acceptable magnitude of change for this feature?

> [...]
>> --- a/libavformat/tcp.c
>> +++ b/libavformat/tcp.c
>> @@ -265,6 +265,26 @@ static int tcp_get_file_handle(URLContext *h)
>> return s->fd;
>> }
>> 
>> +static int tcp_get_window_size(URLContext *h)
>> +{
>> +TCPContext *s = h->priv_data;
>> +int avail;
>> +int avail_len = sizeof(avail);
>> +
> 
>> +#if HAVE_WINSOCK2_H
> 
> this formating is IIRC not allowed in C the # must be in the first
> column
> 
> 
>> +/* SO_RCVBUF with winsock only reports the actual TCP window size when
>> +auto-tuning has been disabled via setting SO_RCVBUF */
>> +if (s->recv_buffer_size < 0) {
>> +return AVERROR(ENOSYS);
>> +}
>> +#endif
>> +
>> +if (getsockopt(s->fd, SOL_SOCKET, SO_RCVBUF, , _len)) {
>> +   return ff_neterrno();
>> +}
> 
> the indention is inconsisntent

Both formatting issues have been corrected

From bac0f351afd14c56c56376d20557b7330bcea23e Mon Sep 17 00:00:00 2001
From: Joel Cunningham 
Date: Fri, 13 Jan 2017 10:52:25 -0600
Subject: [PATCH] HTTP: improve performance by reducing forward seeks

This commit optimizes HTTP performance by reducing forward seeks, instead
favoring a read-ahead and discard on the current connection (referred to
as a short seek) for seeks that are within a TCP window's worth of data.
This improves performance because with TCP flow control, a window's worth
of data will be in the local socket buffer already or in-flight from the
sender once congestion control on the sender is fully utilizing the window.

Note: this approach doesn't attempt to differentiate from a newly opened
connection which may not be fully utilizing the window due to congestion
control vs one that is. The receiver can't get at this information, so we
assume worst case; that full window is in use (we did advertise it after all)
and that data could be in-flight

The previous behavior of closing the connection, then opening a new
with a new HTTP range value results in a massive amounts of discarded
and re-sent data when large TCP windows are used.  This has been observed
on MacOS/iOS which starts with an initial window of 256KB and grows up to
1MB depending on the bandwidth-product delay.

When seeking within a window's worth of data and we close 

Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-25 Thread Michael Niedermayer
On Fri, Jan 13, 2017 at 11:36:41AM -0600, Joel Cunningham wrote:
> 
> > On Jan 12, 2017, at 10:59 AM, Joel Cunningham  
> > wrote:
> > 
> > Nicolas,
> > 
> > I’ve found existing “read-ahead logic” in avio_seek to do what I’ve 
> > implemented in http_stream_forward().  This is controlled by 
> > SHORT_SEEK_THRESHOLD, currently set to 4KB.  I proto-typed increasing this 
> > to the 256KB (matches the initial TCP window in my test setup) and saw the 
> > same number of reduction in HTTP GETs and the number of seeks!  Thanks for 
> > the heads up, this should reduce my patch size!
> > 
> > I could plumb this setting (s->short_seek_threshold) to a URL function that 
> > would get the desired value from HTTP/TCP.  Looking at how 
> > ffio_init_context() is implemented, it doesn’t appear to have access to the 
> > URLContext pointer.  Any guidance on how I can plumb the protocol layer 
> > with aviobuf without a public API change?
> > 
> > Thanks,
> > 
> > Joel
> 
> 
> I managed to figure the plumbing out in a way that doesn’t modify any public 
> APIs.  I added a new callback to struct AVIOContext that gets a short seek 
> threshold value from URLContext if available (otherwise fallback to 
> s->short_seek_threshold).  This allows using the read-ahead/discard logic in 
> avio_seek and eliminates my forwarding logic in the HTTP layer (thanks to 
> Nicolas for pointing me in this direction).  See new patch below
> 
> I also updated my commit message to include assumptions/considerations about 
> congestion control on the sender (thanks to Andy for calling out the 
> discussion on it).  Lastly, I have upload new captures/log in dropbox for 
> those that want to take a look at my testing output (see 
> ffplay_short_seek_output.log and mac-ffplay-short-seek-patch.pcapng)
> 
> https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0
> 
> Thanks for the feedback so far,
> 
> Joel
> 
> From 9bb2f184591c2d6e6a91d3760e63b013ca4c95e5 Mon Sep 17 00:00:00 2001
> From: Joel Cunningham 
> Date: Fri, 13 Jan 2017 10:52:25 -0600
> Subject: [PATCH] HTTP: improve performance by reducing forward seeks
> 
> This commit optimizes HTTP performance by reducing forward seeks, instead
> favoring a read-ahead and discard on the current connection (referred to
> as a short seek) for seeks that are within a TCP window's worth of data.
> This improves performance because with TCP flow control, a window's worth
> of data will be in the local socket buffer already or in-flight from the
> sender once congestion control on the sender is fully utilizing the window.
> 
> Note: this approach doesn't attempt to differentiate from a newly opened
> connection which may not be fully utilizing the window due to congestion
> control vs one that is. The receiver can't get at this information, so we
> assume worst case; that full window is in use (we did advertise it after all)
> and that data could be in-flight
> 
> The previous behavior of closing the connection, then opening a new
> with a new HTTP range value results in a massive amounts of discarded
> and re-sent data when large TCP windows are used.  This has been observed
> on MacOS/iOS which starts with an inital window of 256KB and grows up to
> 1MB depending on the bandwidth-product delay.
> 
> When seeking within a window's worth of data and we close the connection,
> then open a new one within the same window's worth of data, we discard
> from the current offset till the end of the window.  Then on the new
> connection the server ends up re-sending the previous data from new
> offset till the end of old window.
> 
> Example (assumes full window utilization):
> 
> TCP window size: 64KB
> Position: 32KB
> Forward seek position: 40KB
> 
>   *  (Next window)
> 32KB |--| 96KB |---| 160KB
> *
>   40KB |---| 104KB
> 
> Re-sent amount: 96KB - 40KB = 56KB
> 
> For a real world test example, I have MP4 file of ~25MB, which ffplay
> only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
> in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
> patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
> communication.
> 
> To support this feature, the short seek logic in avio_seek() has been
> extended to call a function to get the short seek threshold value.  This
> callback has been plumbed to the URLProtocol structure, which now has
> infrastructure in HTTP and TCP to get the underlying receiver window size
> via SO_RCVBUF.  If the underlying URL and protocol don't support returning
> a short seek threshold, the default s->short_seek_threshold is used
> 
> This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
> is slightly complicated by the fact that when TCP window auto-tuning is
> enabled, SO_RCVBUF doesn't report the real window size, but it does if
> SO_RCVBUF was manually set (disabling 

Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-25 Thread Joel Cunningham
Ping!  Anyone have a chance to review my latest patch which leverages the 
read-ahead/discard logic in avio_seek?

Thanks,

Joel

> On Jan 13, 2017, at 11:36 AM, Joel Cunningham  wrote:
> 
>> 
>> On Jan 12, 2017, at 10:59 AM, Joel Cunningham  wrote:
>> 
>> Nicolas,
>> 
>> I’ve found existing “read-ahead logic” in avio_seek to do what I’ve 
>> implemented in http_stream_forward().  This is controlled by 
>> SHORT_SEEK_THRESHOLD, currently set to 4KB.  I proto-typed increasing this 
>> to the 256KB (matches the initial TCP window in my test setup) and saw the 
>> same number of reduction in HTTP GETs and the number of seeks!  Thanks for 
>> the heads up, this should reduce my patch size!
>> 
>> I could plumb this setting (s->short_seek_threshold) to a URL function that 
>> would get the desired value from HTTP/TCP.  Looking at how 
>> ffio_init_context() is implemented, it doesn’t appear to have access to the 
>> URLContext pointer.  Any guidance on how I can plumb the protocol layer with 
>> aviobuf without a public API change?
>> 
>> Thanks,
>> 
>> Joel
> 
> 
> I managed to figure the plumbing out in a way that doesn’t modify any public 
> APIs.  I added a new callback to struct AVIOContext that gets a short seek 
> threshold value from URLContext if available (otherwise fallback to 
> s->short_seek_threshold).  This allows using the read-ahead/discard logic in 
> avio_seek and eliminates my forwarding logic in the HTTP layer (thanks to 
> Nicolas for pointing me in this direction).  See new patch below
> 
> I also updated my commit message to include assumptions/considerations about 
> congestion control on the sender (thanks to Andy for calling out the 
> discussion on it).  Lastly, I have upload new captures/log in dropbox for 
> those that want to take a look at my testing output (see 
> ffplay_short_seek_output.log and mac-ffplay-short-seek-patch.pcapng)
> 
> https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0
> 
> Thanks for the feedback so far,
> 
> Joel
> 
> From 9bb2f184591c2d6e6a91d3760e63b013ca4c95e5 Mon Sep 17 00:00:00 2001
> From: Joel Cunningham 
> Date: Fri, 13 Jan 2017 10:52:25 -0600
> Subject: [PATCH] HTTP: improve performance by reducing forward seeks
> 
> This commit optimizes HTTP performance by reducing forward seeks, instead
> favoring a read-ahead and discard on the current connection (referred to
> as a short seek) for seeks that are within a TCP window's worth of data.
> This improves performance because with TCP flow control, a window's worth
> of data will be in the local socket buffer already or in-flight from the
> sender once congestion control on the sender is fully utilizing the window.
> 
> Note: this approach doesn't attempt to differentiate from a newly opened
> connection which may not be fully utilizing the window due to congestion
> control vs one that is. The receiver can't get at this information, so we
> assume worst case; that full window is in use (we did advertise it after all)
> and that data could be in-flight
> 
> The previous behavior of closing the connection, then opening a new
> with a new HTTP range value results in a massive amounts of discarded
> and re-sent data when large TCP windows are used.  This has been observed
> on MacOS/iOS which starts with an inital window of 256KB and grows up to
> 1MB depending on the bandwidth-product delay.
> 
> When seeking within a window's worth of data and we close the connection,
> then open a new one within the same window's worth of data, we discard
> from the current offset till the end of the window.  Then on the new
> connection the server ends up re-sending the previous data from new
> offset till the end of old window.
> 
> Example (assumes full window utilization):
> 
> TCP window size: 64KB
> Position: 32KB
> Forward seek position: 40KB
> 
>  *  (Next window)
> 32KB |--| 96KB |---| 160KB
>*
>  40KB |---| 104KB
> 
> Re-sent amount: 96KB - 40KB = 56KB
> 
> For a real world test example, I have MP4 file of ~25MB, which ffplay
> only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
> in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
> patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
> communication.
> 
> To support this feature, the short seek logic in avio_seek() has been
> extended to call a function to get the short seek threshold value.  This
> callback has been plumbed to the URLProtocol structure, which now has
> infrastructure in HTTP and TCP to get the underlying receiver window size
> via SO_RCVBUF.  If the underlying URL and protocol don't support returning
> a short seek threshold, the default s->short_seek_threshold is used
> 
> This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
> is slightly complicated by the fact that when TCP window 

Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-13 Thread Joel Cunningham

> On Jan 12, 2017, at 10:59 AM, Joel Cunningham  wrote:
> 
> Nicolas,
> 
> I’ve found existing “read-ahead logic” in avio_seek to do what I’ve 
> implemented in http_stream_forward().  This is controlled by 
> SHORT_SEEK_THRESHOLD, currently set to 4KB.  I proto-typed increasing this to 
> the 256KB (matches the initial TCP window in my test setup) and saw the same 
> number of reduction in HTTP GETs and the number of seeks!  Thanks for the 
> heads up, this should reduce my patch size!
> 
> I could plumb this setting (s->short_seek_threshold) to a URL function that 
> would get the desired value from HTTP/TCP.  Looking at how 
> ffio_init_context() is implemented, it doesn’t appear to have access to the 
> URLContext pointer.  Any guidance on how I can plumb the protocol layer with 
> aviobuf without a public API change?
> 
> Thanks,
> 
> Joel


I managed to figure the plumbing out in a way that doesn’t modify any public 
APIs.  I added a new callback to struct AVIOContext that gets a short seek 
threshold value from URLContext if available (otherwise fallback to 
s->short_seek_threshold).  This allows using the read-ahead/discard logic in 
avio_seek and eliminates my forwarding logic in the HTTP layer (thanks to 
Nicolas for pointing me in this direction).  See new patch below

I also updated my commit message to include assumptions/considerations about 
congestion control on the sender (thanks to Andy for calling out the discussion 
on it).  Lastly, I have upload new captures/log in dropbox for those that want 
to take a look at my testing output (see ffplay_short_seek_output.log and 
mac-ffplay-short-seek-patch.pcapng)

https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0

Thanks for the feedback so far,

Joel

From 9bb2f184591c2d6e6a91d3760e63b013ca4c95e5 Mon Sep 17 00:00:00 2001
From: Joel Cunningham 
Date: Fri, 13 Jan 2017 10:52:25 -0600
Subject: [PATCH] HTTP: improve performance by reducing forward seeks

This commit optimizes HTTP performance by reducing forward seeks, instead
favoring a read-ahead and discard on the current connection (referred to
as a short seek) for seeks that are within a TCP window's worth of data.
This improves performance because with TCP flow control, a window's worth
of data will be in the local socket buffer already or in-flight from the
sender once congestion control on the sender is fully utilizing the window.

Note: this approach doesn't attempt to differentiate from a newly opened
connection which may not be fully utilizing the window due to congestion
control vs one that is. The receiver can't get at this information, so we
assume worst case; that full window is in use (we did advertise it after all)
and that data could be in-flight

The previous behavior of closing the connection, then opening a new
with a new HTTP range value results in a massive amounts of discarded
and re-sent data when large TCP windows are used.  This has been observed
on MacOS/iOS which starts with an inital window of 256KB and grows up to
1MB depending on the bandwidth-product delay.

When seeking within a window's worth of data and we close the connection,
then open a new one within the same window's worth of data, we discard
from the current offset till the end of the window.  Then on the new
connection the server ends up re-sending the previous data from new
offset till the end of old window.

Example (assumes full window utilization):

TCP window size: 64KB
Position: 32KB
Forward seek position: 40KB

  *  (Next window)
32KB |--| 96KB |---| 160KB
*
  40KB |---| 104KB

Re-sent amount: 96KB - 40KB = 56KB

For a real world test example, I have MP4 file of ~25MB, which ffplay
only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
patch, ffmpeg issues 4 HTTP GETs and 3 seeks for a total of ~22MB of TCP data
communication.

To support this feature, the short seek logic in avio_seek() has been
extended to call a function to get the short seek threshold value.  This
callback has been plumbed to the URLProtocol structure, which now has
infrastructure in HTTP and TCP to get the underlying receiver window size
via SO_RCVBUF.  If the underlying URL and protocol don't support returning
a short seek threshold, the default s->short_seek_threshold is used

This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
is slightly complicated by the fact that when TCP window auto-tuning is
enabled, SO_RCVBUF doesn't report the real window size, but it does if
SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
this optimization on Windows in the later case
---
 libavformat/avio.c|  7 +++
 libavformat/avio.h|  6 ++
 libavformat/aviobuf.c | 18 +-
 libavformat/http.c|  8 
 libavformat/tcp.c | 21 

Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Michael Niedermayer
On Thu, Jan 12, 2017 at 10:59:56AM -0600, Joel Cunningham wrote:
> Nicolas,
> 
> I’ve found existing “read-ahead logic” in avio_seek to do what I’ve 
> implemented in http_stream_forward().  This is controlled by 
> SHORT_SEEK_THRESHOLD, currently set to 4KB.  I proto-typed increasing this to 
> the 256KB (matches the initial TCP window in my test setup) and saw the same 
> number of reduction in HTTP GETs and the number of seeks!  Thanks for the 
> heads up, this should reduce my patch size!
> 
> I could plumb this setting (s->short_seek_threshold) to a URL function that 
> would get the desired value from HTTP/TCP.  Looking at how 
> ffio_init_context() is implemented, it doesn’t appear to have access to the 
> URLContext pointer.  Any guidance on how I can plumb the protocol layer with 
> aviobuf without a public API change?

i just quickly read this thread and didnt think deeply about it but
remembered  this old (somewhat ugly) patchset:

88066 O   0327  0:24 Michael Niederm (1.1K) [FFmpeg-devel] [PATCH 1/2] 
avformat/aviobuf: Use a 512kb IO buffer for http* instead of 32kb
88067 0327  0:24 Michael Niederm (0.7K) ├─>[FFmpeg-devel] [PATCH 2/2] 
avformat/aviobuf: increase SHORT_SEEK_THRESHOLD from 4096 to 8192

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Joel Cunningham

> On Jan 12, 2017, at 11:04 AM, Steinar H. Gunderson 
>  wrote:
> 
> On Thu, Jan 12, 2017 at 04:59:33PM +, Andy Furniss wrote:
>> I've seen plenty of "legal" shrinks looking at tcpdumps - usually the
>> app is throttling it's read speed.
> 
> You're not really allowed to shrink by more than you've received, though,
> are you? Typically the buffer going down is just that the app hasn't read all
> the data from the socket -- but that's a different case.
> 

You’re correct, decreasing the advertised window by more than you’ve received 
(relative to last advertised window value) is what the term “shrinking the 
window” refers to.  Check out RFC 793, section 3.7 Data Communication.

http://www.faqs.org/rfcs/rfc793.html

All that aside, I don’t think illegal window shrinking is that relevant to this 
patch because we are fetching the value of SO_RCVBUF each time before 
determining how to perform the seek (either read-ahead/discard or close/open 
new connection).

> The point of the “no-shrink” rule is that once you've advertised a window to
> the sender, the sender should be allowed to send that much data with no ack,
> without keeping a copy, and without worrying it might get lost. If you could
> shrink the window, that guarantee would disappear.
> 

Joel

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Andy Furniss

Steinar H. Gunderson wrote:

On Thu, Jan 12, 2017 at 04:59:33PM +, Andy Furniss wrote:

I've seen plenty of "legal" shrinks looking at tcpdumps - usually the
app is throttling it's read speed.


You're not really allowed to shrink by more than you've received, though,
are you? Typically the buffer going down is just that the app hasn't read all
the data from the socket -- but that's a different case.


Correct - I was probably being over pedantic about one statement in the
context of buffer sizes.


The point of the “no-shrink” rule is that once you've advertised a window to
the sender, the sender should be allowed to send that much data with no ack,


Yes.


without keeping a copy, and without worrying it might get lost. If you could
shrink the window, that guarantee would disappear.


Not so sure about this bit = when the ack comes it may well inform that
a re-transmit is needed.

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Steinar H. Gunderson
On Thu, Jan 12, 2017 at 04:59:33PM +, Andy Furniss wrote:
> I've seen plenty of "legal" shrinks looking at tcpdumps - usually the
> app is throttling it's read speed.

You're not really allowed to shrink by more than you've received, though,
are you? Typically the buffer going down is just that the app hasn't read all
the data from the socket -- but that's a different case.

The point of the “no-shrink” rule is that once you've advertised a window to
the sender, the sender should be allowed to send that much data with no ack,
without keeping a copy, and without worrying it might get lost. If you could
shrink the window, that guarantee would disappear.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Joel Cunningham
Nicolas,

I’ve found existing “read-ahead logic” in avio_seek to do what I’ve implemented 
in http_stream_forward().  This is controlled by SHORT_SEEK_THRESHOLD, 
currently set to 4KB.  I proto-typed increasing this to the 256KB (matches the 
initial TCP window in my test setup) and saw the same number of reduction in 
HTTP GETs and the number of seeks!  Thanks for the heads up, this should reduce 
my patch size!

I could plumb this setting (s->short_seek_threshold) to a URL function that 
would get the desired value from HTTP/TCP.  Looking at how ffio_init_context() 
is implemented, it doesn’t appear to have access to the URLContext pointer.  
Any guidance on how I can plumb the protocol layer with aviobuf without a 
public API change?

Thanks,

Joel

> On Jan 12, 2017, at 9:29 AM, Joel Cunningham  wrote:
> 
> 
>> On Jan 12, 2017, at 6:53 AM, Nicolas George  wrote:
>> 
>> Le duodi 22 nivôse, an CCXXV, Joel Cunningham a écrit :
>>> This commit optimizes HTTP forward seeks by advancing the stream on
>>> the current connection when the seek amount is within the current
>>> TCP window rather than closing the connection and opening a new one.
>>> This improves performance because with TCP flow control, a window's
>>> worth of data is always either in the local socket buffer already or
>>> in-flight from the sender.
>> 
>> Thanks for the patch. You may have not noticed, but there is already a
>> similar logic in the higher-level API, aviobuf. See the code for
>> avio_seek(). I suspect it would be a better place to implement the logic
>> of avoiding seeks on slow-ish protocols.
> 
> Thanks for the feedback, I’ll take a look into how the forwarding logic could 
> be implemented in avio_seek().  I do want to be careful that this logic 
> doesn’t execute for non-HTTP or non-stream based protocols which don’t have 
> flow control/automatic buffering of data.
> 
> Joel
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Andy Furniss

Joel Cunningham wrote:


According TCP RFCs, a receiver is not allowed to decrease the window,


Not sure that's true.

It's certainly possible to do it illegally, on a linux box tcp will log
a message something like heresy  shrank window.
That IME is where some middle box is "interfearing".

I've seen plenty of "legal" shrinks looking at tcpdumps - usually the
app is throttling it's read speed.

Just had a look between two Linux boxes on my lan and

wget --limit-rate=10k will do to see shrinking advertised window by
receiver using tcpdump.


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Joel Cunningham

> On Jan 12, 2017, at 10:08 AM, Andy Furniss  wrote:
> 
> Joel Cunningham wrote:
>> Hi,
>> 
>> I’ve been working on optimizing HTTP forward seek performance for ffmpeg and 
>> would like to contribute this patch into mainline ffmpeg.  Please see the 
>> below patch for an explanation of the issue and proposed fix.  I have 
>> provided evidence of the current performance issue and my sample MP4 so 
>> others can reproduce and observe the behavior.
>> 
>> Files are located in Dropbox here: 
>> https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0
>> 
>> GRMT0003.MP4 - test video file
>> mac-ffplay-baseline.pcapng - wireshark capture of ffplay (49abd) playing the 
>> above test file on MacOS 10.12.2 from a remote NGINX server
>> mac-ffplay-optimize-patch.pcapng - same ffplay setup but with patch applied
>> ffplay_output.log - console output of ffplay with patch (loglevel debug)
>> 
>> I’m happy to discuss this issue further if the below description doesn’t 
>> fully communicate the issue.
>> 
>> Thanks,
>> 
>> Joel
>> 
>> From 89a3ed8aab9168313b4f7e83c00857f9b715ba4e Mon Sep 17 00:00:00 2001
>> From: Joel Cunningham 
>> Date: Wed, 11 Jan 2017 13:55:02 -0600
>> Subject: [PATCH] HTTP: optimize forward seek performance
>> 
>> This commit optimizes HTTP forward seeks by advancing the stream on
>> the current connection when the seek amount is within the current
>> TCP window rather than closing the connection and opening a new one.
>> This improves performance because with TCP flow control, a window's
>> worth of data is always either in the local socket buffer already or
>> in-flight from the sender.
> 
> Not saying there is anything wrong with this patch but this statement
> doesn't seem quite right.
> 
> I may be wrong, but IIRC what's in flight + buffer will also depend on
> the state of the senders congestion control window.
> 

You’re correct that the sender's congestion control algorithm will gate how 
much is in-flight and while the sender is discovering the available network 
bandwidth, it won’t be sending a full window’s worth at a time.  I didn’t 
really bring this into the conversation (and kind of hand-waved it) because 
there isn’t any way that the receiver would know this information, thus we 
can’t use it to potentially reduce our “read-ahead/discard” size based on the 
receiver’s window size. 

If we wanted to be conservative about our read-ahead/discard size, taking into 
account a possible congestion control preventing a full window’s worth of data 
from being in-flight, we could potentially take the value from SO_RCVBUF and 
divide it by 1/2 or 1/4, but then we’d still have some possibility for 
discarded and re-sent data for larger seeks within the window once the window 
has fully opened up.  This kind of information isn’t available via socket APIs 
and typically isn’t needed by TCP applications (ffmpeg really has an unique use 
case by issuing HTTP range requests from current position till end, then 
aborting at some point during the HTTP response when needing to seek)

Joel
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Andy Furniss

Joel Cunningham wrote:

Hi,

I’ve been working on optimizing HTTP forward seek performance for ffmpeg and 
would like to contribute this patch into mainline ffmpeg.  Please see the below 
patch for an explanation of the issue and proposed fix.  I have provided 
evidence of the current performance issue and my sample MP4 so others can 
reproduce and observe the behavior.

Files are located in Dropbox here: 
https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0

GRMT0003.MP4 - test video file
mac-ffplay-baseline.pcapng - wireshark capture of ffplay (49abd) playing the 
above test file on MacOS 10.12.2 from a remote NGINX server
mac-ffplay-optimize-patch.pcapng - same ffplay setup but with patch applied
ffplay_output.log - console output of ffplay with patch (loglevel debug)

I’m happy to discuss this issue further if the below description doesn’t fully 
communicate the issue.

Thanks,

Joel

 From 89a3ed8aab9168313b4f7e83c00857f9b715ba4e Mon Sep 17 00:00:00 2001
From: Joel Cunningham 
Date: Wed, 11 Jan 2017 13:55:02 -0600
Subject: [PATCH] HTTP: optimize forward seek performance

This commit optimizes HTTP forward seeks by advancing the stream on
the current connection when the seek amount is within the current
TCP window rather than closing the connection and opening a new one.
This improves performance because with TCP flow control, a window's
worth of data is always either in the local socket buffer already or
in-flight from the sender.


Not saying there is anything wrong with this patch but this statement
doesn't seem quite right.

I may be wrong, but IIRC what's in flight + buffer will also depend on
the state of the senders congestion control window.

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Joel Cunningham

> On Jan 12, 2017, at 6:47 AM, Ronald S. Bultje  wrote:
> 
> Hi Joel,
> 
> On Wed, Jan 11, 2017 at 6:01 PM, Joel Cunningham 
> wrote:
> 
>> Hi,
>> 
>> I’ve been working on optimizing HTTP forward seek performance for ffmpeg
>> and would like to contribute this patch into mainline ffmpeg.  Please see
>> the below patch for an explanation of the issue and proposed fix.  I have
>> provided evidence of the current performance issue and my sample MP4 so
>> others can reproduce and observe the behavior.
>> 
>> Files are located in Dropbox here: https://www.dropbox.com/sh/
>> 4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0
>> 
>> GRMT0003.MP4 - test video file
>> mac-ffplay-baseline.pcapng - wireshark capture of ffplay (49abd) playing
>> the above test file on MacOS 10.12.2 from a remote NGINX server
>> mac-ffplay-optimize-patch.pcapng - same ffplay setup but with patch
>> applied
>> ffplay_output.log - console output of ffplay with patch (loglevel debug)
>> 
>> I’m happy to discuss this issue further if the below description doesn’t
>> fully communicate the issue.
>> 
>> Thanks,
>> 
>> Joel
>> 
>> From 89a3ed8aab9168313b4f7e83c00857f9b715ba4e Mon Sep 17 00:00:00 2001
>> From: Joel Cunningham 
>> Date: Wed, 11 Jan 2017 13:55:02 -0600
>> Subject: [PATCH] HTTP: optimize forward seek performance
>> 
>> This commit optimizes HTTP forward seeks by advancing the stream on
>> the current connection when the seek amount is within the current
>> TCP window rather than closing the connection and opening a new one.
>> This improves performance because with TCP flow control, a window's
>> worth of data is always either in the local socket buffer already or
>> in-flight from the sender.
>> 
>> The previous behavior of closing the connection, then opening a new
>> with a new HTTP range value results in a massive amounts of discarded
>> and re-sent data when large TCP windows are used.  This has been observed
>> on MacOS/iOS which starts with an inital window of 256KB and grows up to
>> 1MB depending on the bandwidth-product delay.
>> 
>> When seeking within a window's worth of data and we close the connection,
>> then open a new one within the same window's worth of data, we discard
>> from the current offset till the end of the window.  Then on the new
>> connection the server ends up re-sending the previous data from new
>> offset till the end of old window.
>> 
>> Example:
>> 
>> TCP window size: 64KB
>> Position: 32KB
>> Forward seek position: 40KB
>> 
>>  *  (Next window)
>> 32KB |--| 96KB |---| 160KB
>>*
>>  40KB |---| 104KB
>> 
>> Re-sent amount: 96KB - 40KB = 56KB
>> 
>> For a real world test example, I have MP4 file of ~25MB, which ffplay
>> only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
>> in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
>> patch, ffmpeg issues 4 HTTP GETs for a total of ~20MB of TCP data
>> communication.
>> 
>> To support this feature, a new URL function has been added to get the
>> stream buffer size from the TCP protocol.  The stream advancement logic
>> has been implemented in the HTTP layer since this the layer in charge of
>> the seek and creating/destroying the TCP connections.
>> 
>> This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
>> is slightly complicated by the fact that when TCP window auto-tuning is
>> enabled, SO_RCVBUF doesn't report the real window size, but it does if
>> SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
>> this optimization on Windows in the later case
>> ---
>> libavformat/avio.c |  7 ++
>> libavformat/http.c | 69 ++
>> 
>> libavformat/tcp.c  | 21 +
>> libavformat/url.h  |  8 +++
>> 4 files changed, 105 insertions(+)
> 
> 
> Very interesting. There's some minor stylistic nits (double brackets where
> there shouldn't be, I didn't check super-closely for that), but overall
> this is a pretty thoughtful patch in what it's trying to do.

My apologies for any stylistic issues, this is my first ffmpeg change and I’m 
still getting familiar with the style

> 
> As for Windows, I'm surprised that there wouldn't be any API to get the
> real current window size. I mean, I understand that they would decrease
> window size on-demand, but I would expect them to do that w/o throwing out
> already-downloaded data - so essentially I expect delayed auto-tuning. In
> that case, getting the current window size should be trivial and valid
> until the next read. Very strange. I suppose no workarounds (perhaps with
> windows-specific API) exist? Also, what is the practical implication? Does
> it work as before? Or better but not optimal? Or worse?

For Windows, when TCP receive window auto-tuning is enabled, SO_RCVBUF returns 
8192 regardless of what the window value is.  According 

Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Nicolas George
Le duodi 22 nivôse, an CCXXV, Joel Cunningham a écrit :
> This commit optimizes HTTP forward seeks by advancing the stream on
> the current connection when the seek amount is within the current
> TCP window rather than closing the connection and opening a new one.
> This improves performance because with TCP flow control, a window's
> worth of data is always either in the local socket buffer already or
> in-flight from the sender.

Thanks for the patch. You may have not noticed, but there is already a
similar logic in the higher-level API, aviobuf. See the code for
avio_seek(). I suspect it would be a better place to implement the logic
of avoiding seeks on slow-ish protocols.

Regards,

-- 
  Nicolas George


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-12 Thread Ronald S. Bultje
Hi Joel,

On Wed, Jan 11, 2017 at 6:01 PM, Joel Cunningham 
wrote:

> Hi,
>
> I’ve been working on optimizing HTTP forward seek performance for ffmpeg
> and would like to contribute this patch into mainline ffmpeg.  Please see
> the below patch for an explanation of the issue and proposed fix.  I have
> provided evidence of the current performance issue and my sample MP4 so
> others can reproduce and observe the behavior.
>
> Files are located in Dropbox here: https://www.dropbox.com/sh/
> 4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0
>
> GRMT0003.MP4 - test video file
> mac-ffplay-baseline.pcapng - wireshark capture of ffplay (49abd) playing
> the above test file on MacOS 10.12.2 from a remote NGINX server
> mac-ffplay-optimize-patch.pcapng - same ffplay setup but with patch
> applied
> ffplay_output.log - console output of ffplay with patch (loglevel debug)
>
> I’m happy to discuss this issue further if the below description doesn’t
> fully communicate the issue.
>
> Thanks,
>
> Joel
>
> From 89a3ed8aab9168313b4f7e83c00857f9b715ba4e Mon Sep 17 00:00:00 2001
> From: Joel Cunningham 
> Date: Wed, 11 Jan 2017 13:55:02 -0600
> Subject: [PATCH] HTTP: optimize forward seek performance
>
> This commit optimizes HTTP forward seeks by advancing the stream on
> the current connection when the seek amount is within the current
> TCP window rather than closing the connection and opening a new one.
> This improves performance because with TCP flow control, a window's
> worth of data is always either in the local socket buffer already or
> in-flight from the sender.
>
> The previous behavior of closing the connection, then opening a new
> with a new HTTP range value results in a massive amounts of discarded
> and re-sent data when large TCP windows are used.  This has been observed
> on MacOS/iOS which starts with an inital window of 256KB and grows up to
> 1MB depending on the bandwidth-product delay.
>
> When seeking within a window's worth of data and we close the connection,
> then open a new one within the same window's worth of data, we discard
> from the current offset till the end of the window.  Then on the new
> connection the server ends up re-sending the previous data from new
> offset till the end of old window.
>
> Example:
>
> TCP window size: 64KB
> Position: 32KB
> Forward seek position: 40KB
>
>   *  (Next window)
> 32KB |--| 96KB |---| 160KB
> *
>   40KB |---| 104KB
>
> Re-sent amount: 96KB - 40KB = 56KB
>
> For a real world test example, I have MP4 file of ~25MB, which ffplay
> only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
> in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
> patch, ffmpeg issues 4 HTTP GETs for a total of ~20MB of TCP data
> communication.
>
> To support this feature, a new URL function has been added to get the
> stream buffer size from the TCP protocol.  The stream advancement logic
> has been implemented in the HTTP layer since this the layer in charge of
> the seek and creating/destroying the TCP connections.
>
> This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
> is slightly complicated by the fact that when TCP window auto-tuning is
> enabled, SO_RCVBUF doesn't report the real window size, but it does if
> SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
> this optimization on Windows in the later case
> ---
>  libavformat/avio.c |  7 ++
>  libavformat/http.c | 69 ++
> 
>  libavformat/tcp.c  | 21 +
>  libavformat/url.h  |  8 +++
>  4 files changed, 105 insertions(+)


Very interesting. There's some minor stylistic nits (double brackets where
there shouldn't be, I didn't check super-closely for that), but overall
this is a pretty thoughtful patch in what it's trying to do.

As for Windows, I'm surprised that there wouldn't be any API to get the
real current window size. I mean, I understand that they would decrease
window size on-demand, but I would expect them to do that w/o throwing out
already-downloaded data - so essentially I expect delayed auto-tuning. In
that case, getting the current window size should be trivial and valid
until the next read. Very strange. I suppose no workarounds (perhaps with
windows-specific API) exist? Also, what is the practical implication? Does
it work as before? Or better but not optimal? Or worse?

I'm wondering if there's some way to make the interaction between tcp and
http less awkward. Similar problems exist between rtp/udp, where we want
deeper protocol integration and the generic API basically inhibits us.
Probably out of scope for this patch and not a terribly big deal because
the URL API is currently not exposed, but we eventually want to expose this
API (IMO) so at some point this will need some more thoughts.

If nobody else has comments, I'm 

[FFmpeg-devel] [PATCH] HTTP: optimize forward seek performance

2017-01-11 Thread Joel Cunningham
Hi,

I’ve been working on optimizing HTTP forward seek performance for ffmpeg and 
would like to contribute this patch into mainline ffmpeg.  Please see the below 
patch for an explanation of the issue and proposed fix.  I have provided 
evidence of the current performance issue and my sample MP4 so others can 
reproduce and observe the behavior.

Files are located in Dropbox here: 
https://www.dropbox.com/sh/4q4ru8isdv22joj/AABU3XyXmgLMiEFqucf1LdZ3a?dl=0

GRMT0003.MP4 - test video file
mac-ffplay-baseline.pcapng - wireshark capture of ffplay (49abd) playing the 
above test file on MacOS 10.12.2 from a remote NGINX server
mac-ffplay-optimize-patch.pcapng - same ffplay setup but with patch applied 
ffplay_output.log - console output of ffplay with patch (loglevel debug)

I’m happy to discuss this issue further if the below description doesn’t fully 
communicate the issue.

Thanks,

Joel

From 89a3ed8aab9168313b4f7e83c00857f9b715ba4e Mon Sep 17 00:00:00 2001
From: Joel Cunningham 
Date: Wed, 11 Jan 2017 13:55:02 -0600
Subject: [PATCH] HTTP: optimize forward seek performance

This commit optimizes HTTP forward seeks by advancing the stream on
the current connection when the seek amount is within the current
TCP window rather than closing the connection and opening a new one.
This improves performance because with TCP flow control, a window's
worth of data is always either in the local socket buffer already or
in-flight from the sender.

The previous behavior of closing the connection, then opening a new
with a new HTTP range value results in a massive amounts of discarded
and re-sent data when large TCP windows are used.  This has been observed
on MacOS/iOS which starts with an inital window of 256KB and grows up to
1MB depending on the bandwidth-product delay.

When seeking within a window's worth of data and we close the connection,
then open a new one within the same window's worth of data, we discard
from the current offset till the end of the window.  Then on the new
connection the server ends up re-sending the previous data from new
offset till the end of old window.

Example:

TCP window size: 64KB
Position: 32KB
Forward seek position: 40KB

  *  (Next window)
32KB |--| 96KB |---| 160KB
*
  40KB |---| 104KB

Re-sent amount: 96KB - 40KB = 56KB

For a real world test example, I have MP4 file of ~25MB, which ffplay
only reads ~16MB and performs 177 seeks. With current ffmpeg, this results
in 177 HTTP GETs and ~73MB worth of TCP data communication.  With this
patch, ffmpeg issues 4 HTTP GETs for a total of ~20MB of TCP data
communication.

To support this feature, a new URL function has been added to get the
stream buffer size from the TCP protocol.  The stream advancement logic
has been implemented in the HTTP layer since this the layer in charge of
the seek and creating/destroying the TCP connections.

This feature has been tested on Windows 7 and MacOS/iOS.  Windows support
is slightly complicated by the fact that when TCP window auto-tuning is
enabled, SO_RCVBUF doesn't report the real window size, but it does if
SO_RCVBUF was manually set (disabling auto-tuning). So we can only use
this optimization on Windows in the later case
---
 libavformat/avio.c |  7 ++
 libavformat/http.c | 69 ++
 libavformat/tcp.c  | 21 +
 libavformat/url.h  |  8 +++
 4 files changed, 105 insertions(+)

diff --git a/libavformat/avio.c b/libavformat/avio.c
index 3606eb0..34dcf09 100644
--- a/libavformat/avio.c
+++ b/libavformat/avio.c
@@ -645,6 +645,13 @@ int ffurl_get_multi_file_handle(URLContext *h, int 
**handles, int *numhandles)
 return h->prot->url_get_multi_file_handle(h, handles, numhandles);
 }
 
+int ffurl_get_stream_size(URLContext *h)
+{
+if (!h->prot->url_get_stream_size)
+return -1;
+return h->prot->url_get_stream_size(h);
+}
+
 int ffurl_shutdown(URLContext *h, int flags)
 {
 if (!h->prot->url_shutdown)
diff --git a/libavformat/http.c b/libavformat/http.c
index 944a6cf..0026168 100644
--- a/libavformat/http.c
+++ b/libavformat/http.c
@@ -1462,6 +1462,61 @@ static int http_close(URLContext *h)
 return ret;
 }
 
+#define DISCARD_BUFFER_SIZE (256 * 1024)
+static int http_stream_forward(HTTPContext *s, int amount)
+{
+int len;
+int ret;
+int discard = amount;
+int discard_buf_size;
+uint8_t * discard_buf;
+
+/* advance local buffer first */
+len = FFMIN(discard, s->buf_end - s->buf_ptr);
+if (len > 0) {
+av_log(s, AV_LOG_DEBUG, "advancing input buffer by %d bytes\n", len);
+s->buf_ptr += len;
+discard -= len;
+if (discard == 0) {
+return amount;
+}
+}
+
+/* if underlying protocol is a stream with flow control and we are seeking
+within the stream buffer size, it performs better to advance through the 
stream
+rather than