On Thu, 23 Feb 2023, Jakub Jelinek wrote: > On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote: > > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > > > uninitialized unsigned char. > > > > > > Given that we actually never reference the m_data array anywhere, > > > it is just to reserve space, I think even the alignas(T) there is > > > useless. The point is that m_auto has as data members: > > > vec_prefix m_vecpfx; > > > T m_vecdata[1]; > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > > > -fsanitize=bound-sstrict incompatible) being treated as > > > flexible array member flowing into the m_data storage after it. > > > > Doesn't the array otherwise eventually overlap with tail padding > > in m_auto? Or does an array of T never produce tail padding? > > The array can certainly overlap with tail padding in m_auto if any. > But whether m_data is aligned to alignof (T) or not doesn't change anything > on it. > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; > }, > so I think there is on most arches tail padding if T has smaller alignment > than int, so typically char/short or structs with the same size/alignments. > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int), > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto) > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T). > If alignof (T) >= alignof (int), then there won't be any tail padding > at the end of m_auto, there could be padding between m_vecpfx and > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and > so m_data will be again already properly aligned. > > So, I think your patch is fine without alignas(T), the rest is just that > there is more work to do incrementally, even for the case you want to > deal with (the point 1) in particular).
Looking at vec<T, A, vl_embed>::operator[] which just does template<typename T, typename A> inline const T & vec<T, A, vl_embed>::operator[] (unsigned ix) const { gcc_checking_assert (ix < m_vecpfx.m_num); return m_vecdata[ix]; } the whole thing looks fragile at best - we basically have struct auto_vec { struct vec<vl_embed> { ... T m_vecdata[1]; } m_auto; T m_data[N-1]; }; and access m_auto.m_vecdata[] as if it extends to m_data. That's not something supported by the middle-end - not by design at least. auto_vec *p; p->m_auto.m_vecdata[i] would never alias p->m_data[j], in practice we might not see this though. Also get_ref_base_and_extent will compute a maxsize/size of sizeof(T) for any m_auto.m_vecdata[i] access, but I think we nowhere actually replace 'i' by zero based on this knowledge, but we'd perform CSE with earlier m_auto.m_vecdata[0] stores, so that might be something one could provoke. Doing a self-test like static __attribute__((noipa)) void test_auto_alias (int i) { auto_vec<int, 8> v; v.quick_grow (2); v[0] = 1; v[1] = 2; int val = v[i]; ASSERT_EQ (val, 2); } shows _27 = &_25->m_vecdata[0]; *_27 = 1; ... _7 = &_12->m_vecdata[i.235_3]; val_13 = *_7; which is safe in middle-end rules though. So what "saves" us here is that we always return a reference and never a value. There's the ::iterate member function which fails to do this, the ::quick_push function does T *slot = &m_vecdata[m_vecpfx.m_num++]; *slot = obj; with static __attribute__((noipa)) void test_auto_alias (int i) { auto_vec<int, 8> v; v.quick_grow (2); v[0] = 1; v[1] = 2; int val; for (int ix = i; v.iterate (ix, &val); ix++) ; ASSERT_EQ (val, 2); } I get that optimzied to a FAIL. I have a "fix" for this. unordered_remove has a similar issue accesing the last element. There are a few functions using the [] access member which is at least sub-optimal due to repeated bounds checking but also safe. I suppose if auto_vec would be a union of vec<vl_embed> and a storage member with the vl_embed active that would work, but then that's likely not something C++11 supports. So I think to support auto_vec we'd need to make the m_vecdata[] member in vec<vl_embed> of templated size (defaulted to 1) and get rid of the m_data member in auto_vec instead. Or have another C++ way of increasing the size of auto_vec without actually adding any member? The vec<vl_embed> data accesses then would need to go through a wrapper obtaining a correctly typed pointer to m_vecdata[] since we'd like to have that as unsigned char[] to avoid the initialization. > > Yes, I'm not proposing to fix non-POD support. I want to make > > as-if-POD stuff like std::pair to work like it was intended. > > > > > Oh, and perhaps we should start marking such spots in GCC with > > > strict_flex_array attribute to make it clear where we rely on the > > > non-strict behavior. > > > > I think we never access the array directly as array, do we? > > Sure, the attribute should go to m_vecdata array, not to m_data. > And to op array in gimple_statement_with_ops, operands array in > operands, ops array in tree_omp_clause, val in tree_int_cst, > str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def, > elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we > use this stuff everywhere. Also often use GTY length similarly to the > proposed element_count... Here's a patch to fix vec.h to adhere to GCC middle-ends interpretation of array accesses - &array[i] is just address arithmetic and out-of-bounds accesses are OK. But that doesn't prevent other host compilers from miscompiling stage1 I guess, I'm not sure if there's any standard conforming way to compute the address of an element in m_data from the address of m_vecdata? Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? Thanks, Richard. >From b7d47dd1c166d216eba7160f11087312f8b5bbef Mon Sep 17 00:00:00 2001 From: Richard Biener <rguent...@suse.de> Date: Fri, 24 Feb 2023 09:54:09 +0100 Subject: [PATCH] Fix vec.h alias problems To: gcc-patches@gcc.gnu.org With auto_vec we have embedded storage that spans two members, the m_auto.m_vecdata[1] and the m_data[N-1] arrays and accesses to the data are all based on m_auto.m_vecdata[]. That's OK for GCC as long as the accesses are done indirectly through a pointer since the address computation based on m_auto.m_vecdata[] is considered OK by GCC (but not by the language standards). The following fixes the few places that failed to enfoce this indirection and adds one self-test that failed before. As I was here it also fixes ::contains to avoid repeated bounds checking and the same issue in :;lower_bound which also suffers from unnecessary copying around values. * vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to take a const reference to the object, access m_vecdata directly. (vec<T, A, vl_embed>::contains): Likewise. (vec<T, A, vl_embed>::iterate): Perform accesses to m_vecdata indirectly. (vec<T, A, vl_embed>::unordered_remove): Likewise. * vec.cc (test_auto_alias): New. (vec_cc_tests): Call it. --- gcc/vec.cc | 17 +++++++++++++++++ gcc/vec.h | 21 ++++++++++++++------- 2 files changed, 31 insertions(+), 7 deletions(-) diff --git a/gcc/vec.cc b/gcc/vec.cc index 511e6dff50d..afba66cb3d0 100644 --- a/gcc/vec.cc +++ b/gcc/vec.cc @@ -568,6 +568,22 @@ test_auto_delete_vec () ASSERT_EQ (dtor_count, 2); } +/* Verify accesses to m_vecdata are done indirectly. */ + +static void +test_auto_alias () +{ + volatile int i = 1; + auto_vec<int, 8> v; + v.quick_grow (2); + v[0] = 1; + v[1] = 2; + int val; + for (int ix = i; v.iterate (ix, &val); ix++) + ; + ASSERT_EQ (val, 2); +} + /* Run all of the selftests within this file. */ void @@ -587,6 +603,7 @@ vec_cc_tests () test_qsort (); test_reverse (); test_auto_delete_vec (); + test_auto_alias (); } } // namespace selftest diff --git a/gcc/vec.h b/gcc/vec.h index cee96254a31..b3ec977c23a 100644 --- a/gcc/vec.h +++ b/gcc/vec.h @@ -614,7 +614,7 @@ public: T *bsearch (const void *key, int (*compar)(const void *, const void *)); T *bsearch (const void *key, int (*compar)(const void *, const void *, void *), void *); - unsigned lower_bound (T, bool (*)(const T &, const T &)) const; + unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const; bool contains (const T &search) const; static size_t embedded_size (unsigned); void embedded_init (unsigned, unsigned = 0, unsigned = 0); @@ -929,7 +929,8 @@ vec<T, A, vl_embed>::iterate (unsigned ix, T *ptr) const { if (ix < m_vecpfx.m_num) { - *ptr = m_vecdata[ix]; + const T *slot = &m_vecdata[ix]; + *ptr = *slot; return true; } else @@ -1118,7 +1119,9 @@ inline void vec<T, A, vl_embed>::unordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; + const T *last_slot = &m_vecdata[--m_vecpfx.m_num]; + T *slot = &m_vecdata[ix]; + *slot = *last_slot; } @@ -1249,8 +1252,11 @@ vec<T, A, vl_embed>::contains (const T &search) const { unsigned int len = length (); for (unsigned int i = 0; i < len; i++) - if ((*this)[i] == search) - return true; + { + const T *slot = &m_vecdata[i]; + if (*slot == search) + return true; + } return false; } @@ -1262,7 +1268,8 @@ vec<T, A, vl_embed>::contains (const T &search) const template<typename T, typename A> unsigned -vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) +vec<T, A, vl_embed>::lower_bound (const T &obj, + bool (*lessthan)(const T &, const T &)) const { unsigned int len = length (); @@ -1273,7 +1280,7 @@ vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) half = len / 2; middle = first; middle += half; - T middle_elem = (*this)[middle]; + const T &middle_elem = this->m_vecdata[middle]; if (lessthan (middle_elem, obj)) { first = middle; -- 2.35.3