Yes, good point. I should have been more specific. I believe this math might
overflow and then truncate:
template<typename T, unsigned inlineCapacity, typename OverflowHandler>
bool Vector<T, inlineCapacity, OverflowHandler>::tryExpandCapacity(unsigned
newMinCapacity)
{
return tryReserveCapacity(std::max(newMinCapacity, std::max(16u,
static_cast<unsigned>(capacity() + capacity() / 4 + 1))));
}
Geoff
> On Nov 19, 2014, at 2:04 PM, Chris Dumez <[email protected]> wrote:
>
> Hi,
>
> I believe the Vector class is already checking for overflows. I looked
> quickly and it seems that when trying to allocate / reallocate the Vector
> buffer, we will call CRASH() if:
> (newCapacity > std::numeric_limits<unsigned>::max() / sizeof(T))
>
> Kr,
> --
> Chris Dumez - Apple Inc.
> Cupertino, CA
>
>
>
>
>> On Nov 19, 2014, at 1:57 PM, Geoffrey Garen <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> I don’t haver a specific opinion on size_t vs unsigned, but I have a related
>> point:
>>
>> If Vector uses unsigned, then it must RELEASE_ASSERT that it does not
>> overflow unsigned when growing.
>>
>> Even if most normal content allocates < 4GB, exploit content surely will try
>> to allocate > 4GB.
>>
>> Chris, maybe you can make this change in trunk, since trunk uses unsigned?
>>
>> Geoff
>>
>>> On Nov 19, 2014, at 1:50 PM, Alexey Proskuryakov <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>
>>> That makes good sense for internal implementation, do you think that class
>>> API should also use a custom type, or should it use size_t?
>>>
>>> With Vector though, I don't know how we would differentiate code paths that
>>> need large allocations from ones that don't. Nearly anything that is
>>> exposed as a JS API or deals with external world can hit sizes over 4Gb.
>>> That's not out of reach in most scenarios, not even for resources loaded
>>> from network.
>>>
>>> - Alexey
>>>
>>>
>>> 19 нояб. 2014 г., в 13:19, Filip Pizlo <[email protected]
>>> <mailto:[email protected]>> написал(а):
>>>
>>>> ArrayBuffers are very special because they are part of ECMAScript.
>>>>
>>>> We use unsigned for the length, because once upon a time, that would have
>>>> been the right type; these days the right type would be a 53-bit integer.
>>>> So, size_t would be wrong. I believe that ArrayBuffers should be changed
>>>> to use a very special size type; probably it wouldn't even be a primitive
>>>> type but a class that carefully checked that you never overflowed 53 bits.
>>>>
>>>> -Filip
>>>>
>>>>
>>>>> On Nov 19, 2014, at 12:54 PM, Alexey Proskuryakov <[email protected]
>>>>> <mailto:[email protected]>> wrote:
>>>>>
>>>>>
>>>>> This is not exactly about Vector, but if one uses
>>>>> FileReader.prototype.readAsArrayBuffer() on a large file, I think that it
>>>>> overflows ArrayBuffer. WebKit actually crashes when uploading
>>>>> multi-gigabyte files to YouTube, Google Drive and other similar services,
>>>>> although I haven't checked whether it's because of ArrayBuffers, or
>>>>> because of a use of int/unsigned in another code path.
>>>>>
>>>>> I think that we should use size_t everywhere except for perhaps a few key
>>>>> places where memory impact is critical (and of course we need explicit
>>>>> checks when casting down to an unsigned). Or perhaps the rule can be even
>>>>> simpler, and unsigned may never be used for indices and sizes, period.
>>>>>
>>>>> - Alexey
>>>>>
>>>>>
>>>>> 19 нояб. 2014 г., в 12:32, Filip Pizlo <[email protected]
>>>>> <mailto:[email protected]>> написал(а):
>>>>>
>>>>>> Whatever we do, the clients of Vector should be consistent about what
>>>>>> type they use. I'm actually fine with Vector internally using unsigned
>>>>>> even if the API uses size_t, but right now we have lots of code that
>>>>>> uses a mix of size_t and unsigned when indexing into Vectors. That's
>>>>>> confusing.
>>>>>>
>>>>>> If I picked one type to use for all Vector indices, it would be unsigned
>>>>>> rather than size_t. Vector being limited to unsigned doesn't imply 4GB
>>>>>> unless you do Vector<char>. Usually we have Vectors of pointer-width
>>>>>> things, which means 32GB on 64-bit systems (UINT_MAX * sizeof(void*)).
>>>>>> Even in a world where we had more than 32GB of memory to use within a
>>>>>> single web process, I would hope that we wouldn't use it all on a single
>>>>>> Vector and that if we did, we would treat that one specially for a bunch
>>>>>> of other sensible reasons (among them being that allocating a contiguous
>>>>>> slab of virtual memory of that size is rather taxing). So, size_t would
>>>>>> buy us almost nothing since if we had a vector grow large enough to need
>>>>>> it, we would be having a bad time already.
>>>>>>
>>>>>> I wonder: are there cases that anyone remembers where we have tried to
>>>>>> use Vector to store more than UINT_MAX things? Another interesting
>>>>>> question is: What's the largest number of things that we store into any
>>>>>> Vector? We could use such a metric to project how big Vectors might get
>>>>>> in the future.
>>>>>>
>>>>>> -Filip
>>>>>>
>>>>>>
>>>>>>> On Nov 19, 2014, at 12:20 PM, Chris Dumez <[email protected]
>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I recently started updating the WTF::Vector API to use unsigned types
>>>>>>> instead of size_t [1][2], because:
>>>>>>> - WTF::Vector is already using unsigned type for size/capacity
>>>>>>> internally to save memory on 64-bit, causing a mismatch between the API
>>>>>>> and the internal representation [3]
>>>>>>> - Some reviewers have asked me to use unsigned for loop counters
>>>>>>> iterating over vectors (which looks unsafe as the Vector API, e.g.
>>>>>>> size(), returns a size_t).
>>>>>>> - I heard from Joseph that this type mismatch is forcing us (and other
>>>>>>> projects using WTF) to disable some build time warnings
>>>>>>> - The few people I talked to before making that change said we should
>>>>>>> do it
>>>>>>>
>>>>>>> However, Alexey recently raised concerns about this change. it doesn't
>>>>>>> "strike him as the right direction. 4Gb is not much, and we should have
>>>>>>> more of WebKit work with the right data types, not less.”.
>>>>>>> I did not initially realize that this change was controversial, but now
>>>>>>> that it seems it is, I thought I would raise the question on webkit-dev
>>>>>>> to see what people think about this.
>>>>>>>
>>>>>>> Kr,
>>>>>>> --
>>>>>>> Chris Dumez - Apple Inc.
>>>>>>> Cupertino, CA
>>>>>>>
>>>>>>>
>>>>>>> [1] http://trac.webkit.org/changeset/176275
>>>>>>> <http://trac.webkit.org/changeset/176275>
>>>>>>> [2] http://trac.webkit.org/changeset/176293
>>>>>>> <http://trac.webkit.org/changeset/176293>
>>>>>>> [3] http://trac.webkit.org/changeset/148891
>>>>>>> <http://trac.webkit.org/changeset/148891>
>>>>>>> _______________________________________________
>>>>>>> webkit-dev mailing list
>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>>>>>> <https://lists.webkit.org/mailman/listinfo/webkit-dev>
>>>>>>
>>>>>> _______________________________________________
>>>>>> webkit-dev mailing list
>>>>>> [email protected] <mailto:[email protected]>
>>>>>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>>>>> <https://lists.webkit.org/mailman/listinfo/webkit-dev>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> webkit-dev mailing list
>>> [email protected] <mailto:[email protected]>
>>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>> <https://lists.webkit.org/mailman/listinfo/webkit-dev>
>>
>> _______________________________________________
>> webkit-dev mailing list
>> [email protected] <mailto:[email protected]>
>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>
_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev