On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <[email protected]> wrote: > > On Wed, 3 Jul 2019 at 18:37, Joe Orton <[email protected]> wrote: > > > > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote: > > > They also make the existing old API unusable for many cases thus > > > making a switch to the new API mandatory, even though it doesn't have > > > to be so (see below). > > > > > > APR 1.x did not specify that the result of apr_dir_read() has to be > > > allocated > > > in dir->pool: > > > > > > > > > https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8 > > > > > > This function does not accept a pool argument, and I think that it's > > > natural > > > to assume limited lifetime in such case. This is what the current major > > > APR > > > users (httpd, svn) do, and other users should probably be ready for that > > > too, > > > since on Win32 the subsequent calls to apr_dir_read() already invalidate > > > the > > > previous apr_finfo_t. > > > > Are there many examples in the APR API where returned strings are not > > either pool-allocated or constant (process lifetime)? It seems unusual > > and very surprising to me. > > > > A hypothetical API consumer can have made either assumption: > > > > a) of constant memory (true with limits on Win32, breaks for Unix), or > > b) of pool-allocated string lifetime (works for Unix, breaks for Win32) > > > > neither is documented and both are broken by current implementations. It > > is impossible to satisfy both so anybody can argue that fixing either > > implementation to match the other is a regression. > > > > Doubling down on (a) over (b) seems strongly inferior to me, the API > > cannot support this without introducing runtime overhead for all callers > > (a new iterpool in the dir object), so I'd rather fix this properly with > > a new API. > > > > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at > > the same as introducing _pread() if desired - making the strdup only > > happen for _pread would only be a minor tweak, not a whole new subpool. > > > > There is example of apr_hash_t.ITERATOR that is statically allocated in > the hash object and returned to the caller. > > Also the native readdir() API behaves the same way [1]: > [[[ > The returned pointer, and pointers within the structure, might be invalidated > or the structure or the storage areas might be overwritten by a subsequent > call > to readdir() on the same directory stream. > ]]] > > From this perspective the current behavior of apr_dir_read on Unix looks > inconsistent and surprising for the API user because it is impossible to use > in a loop without unbounded memory usage. > > I strongly agree that the revved version of this function that accepts a POOL > argument will be good from both usage and implementation terms. I would be +1 > on adding such API. But I also think that unless we fix the original issue > by using the preallocated buffer/iterpool, the whole thing would still be > problematic for any existing API user. > > More detailed: if we take this opportunity to choose and document the API to > consistently return a temporary result, then as the caller I can either > process > the data during iteration or manually put it into a long-living pool if that > is > necessary. I think that all current callers work with the API this way. And I > can also switch to the new revved function to manually specify the pool and > better control the allocations. If I would like to support both APR 1.x and > 2.x > the compatibility is also straightforward to achieve. > > If we instead choose an option where the results may be allocated > in the dir object pool, then as the caller I cannot avoid unbounded memory > usage when supporting both APR 1.x and 2.x. And this approach also introduces > the unavoidable the unbounded memory usage for all Win32 users that > do not update their code. > > [1] > https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475 > Is there any feedback or thoughts on these proposed changes?
Thanks! -- Ivan Zhakov
