On Tue, Jun 11, 2019 at 9:14 AM William A Rowe Jr <wr...@rowe-clan.net>
wrote:

> On Tue, Jun 11, 2019 at 4:15 AM Branko Čibej <br...@apache.org> wrote:
>
>> On 07.06.2019 21:58, William A Rowe Jr wrote:
>> > I think the optimal way is to allocate a pair of apr thread-specific
>> > wchar buffers in each thread's pool on startup, and use those
>> > exclusively per-thread for wchar translations. We could be looking at
>> > 64k/thread exclusively for name translation, but it doesn't seem
>> > unreasonable.
>> >
>> > The alternative is to continue to use the stack; we surely don't want to
>> > lock on acquiring or allocating name translation buffers. /shrug
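
For illustration, the per-thread buffer idea might look roughly like the
untested sketch below. It assumes one apr_threadkey_t created at library
init; the struct and helper names (thread_wpath_bufs, get_wpath_bufs,
WPATH_CHARS) are invented for the example, not existing APR code, and the
32k figure is read here as wide characters per name:

    #include <wchar.h>
    #include "apr_pools.h"
    #include "apr_errno.h"
    #include "apr_thread_proc.h"

    #define WPATH_CHARS 32768   /* one name buffer, sized for \\?\ paths */

    typedef struct thread_wpath_bufs {
        wchar_t from[WPATH_CHARS];  /* e.g. source name for a rename */
        wchar_t to[WPATH_CHARS];    /* e.g. destination name         */
    } thread_wpath_bufs;

    static apr_threadkey_t *wpath_key;  /* created once at init time */

    static apr_status_t get_wpath_bufs(thread_wpath_bufs **bufs,
                                       apr_pool_t *pool)
    {
        void *mem = NULL;
        apr_status_t rv = apr_threadkey_private_get(&mem, wpath_key);
        if (rv != APR_SUCCESS)
            return rv;
        if (!mem) {
            /* First use on this thread: set the pair of buffers aside in
             * the thread's pool and reuse them for every translation. */
            mem = apr_palloc(pool, sizeof(thread_wpath_bufs));
            rv = apr_threadkey_private_set(mem, wpath_key);
            if (rv != APR_SUCCESS)
                return rv;
        }
        *bufs = mem;
        return APR_SUCCESS;
    }
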
>>
>> Since this is Windows, and there's no embedded Windows environment left
>> that I'm aware of that's still alive, we can continue with using the
>> stack ... but wouldn't it be so much better to alloca() the required
>> size instead of blindly burning through 64k every time? Obviously that
>> means counting the characters first.
>>
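
The count-first alternative being suggested would look roughly like this,
with MultiByteToWideChar standing in for APR's own utf-8 to ucs-2 routine;
purely an illustration, not proposed code:

    #include <windows.h>
    #include <malloc.h>     /* _alloca */

    static HANDLE open_read(const char *utf8_path)
    {
        /* Ask for the exact wide-character count (including the NUL)... */
        int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8_path, -1, NULL, 0);
        if (wlen <= 0)
            return INVALID_HANDLE_VALUE;

        /* ...then burn only that much stack instead of a fixed buffer. */
        wchar_t *wpath = _alloca(wlen * sizeof(wchar_t));
        MultiByteToWideChar(CP_UTF8, 0, utf8_path, -1, wpath, wlen);

        return CreateFileW(wpath, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    }
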
>
> I don't think there is any need to do so. A name buffer needs 32k (right
> now we limit that to 8k, but if we wanted to satisfy the OS-acceptable
> pathname length, we would blow that out to 4x the current arbitrary
> limit). For a rename, make that 2x buffers. But there is no benefit to
> wasting cycles determining the string length ahead of allocation, because
> the very next call can run right past that limit and hit the wall on
> available stack. Demanding that there always be a potential 32k runway of
> available stack doesn't seem excessive.
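
As a stand-alone illustration of that fixed stack-buffer pattern for the
rename case (again with MultiByteToWideChar standing in for our own
conversion, and 32k read as wide characters, roughly the longest \\?\ path
the OS accepts):

    #include <windows.h>

    static BOOL rename_utf8(const char *from_utf8, const char *to_utf8)
    {
        wchar_t wfrom[32768];   /* two name buffers, live only for the */
        wchar_t wto[32768];     /* duration of the one Win32 call      */

        if (!MultiByteToWideChar(CP_UTF8, 0, from_utf8, -1, wfrom, 32768) ||
            !MultiByteToWideChar(CP_UTF8, 0, to_utf8,   -1, wto,   32768))
            return FALSE;

        /* The only consumer of the big buffers is the Win32 call itself;
         * the stack contracts as soon as this function returns. */
        return MoveFileExW(wfrom, wto, MOVEFILE_REPLACE_EXISTING);
    }
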
>
> The point of the stack is that it contracts immediately on return. So we
> aren't burning through a 64k buffer - we are ensuring that we have been
> topped up to 64k remaining. The targets of these calls are all Win32 API
> invocations; we are never nesting them inside further big-buffer
> allocations of our own. What might happen in ntdll we have little control
> over.
>
> We either reserve about 2x buffers for file name transliteration in the
> heap per thread, or we use the thread stack. As long as we trust that our
> utf-8 to ucs-2 logic is rock solid and the allocations and limits are
> correctly coded, this continues to be a safe approach. The 32k/64k are
> immediately given back in the current stack-based approach, but would
> simply be boat anchors most of the time if they were set aside in the heap.
>
> I'm +/-0 on switching from stack to heap for this particular transformation
> but welcome good suggestions.
>

I failed to call this out, but I'm guessing we agree that if we preserve
wide character string results (without discarding them after a single API
call), using the string size or moving towards a counted byte string
representation makes a whole lot more sense for longer-term heap
allocations.
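
A counted, pool-allocated shape might look something like this; purely a
sketch, and the names (wpath_str, wpath_dup) are invented:

    #include <wchar.h>
    #include <string.h>
    #include "apr_pools.h"

    typedef struct wpath_str {
        apr_size_t nchars;   /* length in wide characters, excluding NUL */
        wchar_t   *data;     /* NUL-terminated, nchars + 1 entries       */
    } wpath_str;

    static wpath_str *wpath_dup(apr_pool_t *pool, const wchar_t *src,
                                apr_size_t nchars)
    {
        /* Allocate only what the name needs, and keep its length with it
         * so later users never have to re-walk the string. */
        wpath_str *s = apr_palloc(pool, sizeof(*s));
        s->nchars = nchars;
        s->data = apr_palloc(pool, (nchars + 1) * sizeof(wchar_t));
        memcpy(s->data, src, nchars * sizeof(wchar_t));
        s->data[nchars] = L'\0';
        return s;
    }
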
