[PATCH] Improve PTA for PR91257, new bitmap_ior_into_and_free
The following patch addresses appearant quadraticness in PTAs SCC merging algorithm which does (simplified) for (node N in SCC) bitmap_ior_into (pred(leader), pred(N)); with the linked list implementation this leads to increasingly large amount of walking for pred(leader). The patch changes this to while (split /= 2) for (node N in lower-half-of-SCC) bitmap_ior_into_and_free (pred (N), pred (N in upper half of SCC)); so in each outer iteration reducing the number of bitmaps by a factor of two and thus hopefully (and in practice for the testcase) reducing the intermediate bitmap sizes we work on. This improves tree PTA : 8.38 ( 23%) 0.22 ( 33%) 8.62 ( 24%)7679 kB ( 1%) to tree PTA : 5.30 ( 16%) 0.22 ( 32%) 5.54 ( 16%) 28081 kB ( 4%) for the reduced testcase. I somehow didn't manage a 1:1 conversion so I need to re-check details here still I'd like to hear comments about the new bitmap API which is a source-destructive variant for bitmap_ior_into, instead of copying elements not in A moving them from B and in the end releasing the rest. Note using this API with keeping the merging as-is doesn't help PTA time. Richard. 2019-07-29 Richard Biener PR tree-optimization/91257 * bitmap.h (bitmap_ior_into_and_free): Declare. * bitmap.c (bitmap_list_unlink_element): Add defaulted param whether to add the unliked element to the freelist. (bitmap_list_insert_element_after): Add defaulted param for an already allocated element. (bitmap_ior_into_and_free): New function. * tree-ssa-structalias.c (condense_visit): Reduce the predecessor and edge bitmaps of the SCC members in a logarithmic fashion rather than all to one. Index: gcc/bitmap.c === --- gcc/bitmap.c(revision 273795) +++ gcc/bitmap.c(working copy) @@ -252,7 +252,8 @@ bitmap_list_link_element (bitmap head, b and return it to the freelist. */ static inline void -bitmap_list_unlink_element (bitmap head, bitmap_element *element) +bitmap_list_unlink_element (bitmap head, bitmap_element *element, + bool to_freelist = true) { bitmap_element *next = element->next; bitmap_element *prev = element->prev; @@ -279,18 +280,21 @@ bitmap_list_unlink_element (bitmap head, head->indx = 0; } - bitmap_elem_to_freelist (head, element); + if (to_freelist) +bitmap_elem_to_freelist (head, element); } -/* Insert a new uninitialized element into bitmap HEAD after element - ELT. If ELT is NULL, insert the element at the start. Return the - new element. */ +/* Insert a new uninitialized element (or NODE if not NULL) into bitmap + HEAD after element ELT. If ELT is NULL, insert the element at the start. + Return the new element. */ static bitmap_element * bitmap_list_insert_element_after (bitmap head, - bitmap_element *elt, unsigned int indx) + bitmap_element *elt, unsigned int indx, + bitmap_element *node = NULL) { - bitmap_element *node = bitmap_element_allocate (head); + if (!node) +node = bitmap_element_allocate (head); node->indx = indx; gcc_checking_assert (!head->tree_form); @@ -2009,6 +2013,56 @@ bitmap_ior_into (bitmap a, const_bitmap return changed; } +/* A |= B. Return true if A changes. Free B (re-using its storage + for the result). */ + +bool +bitmap_ior_into_and_free (bitmap a, bitmap *b_) +{ + bitmap b = *b_; + bitmap_element *a_elt = a->first; + bitmap_element *b_elt = b->first; + bitmap_element *a_prev = NULL; + bitmap_element **a_prev_pnext = &a->first; + bool changed = false; + + gcc_checking_assert (!a->tree_form && !b->tree_form); + gcc_assert (a->obstack == b->obstack); + if (a == b) +return false; + + while (b_elt) +{ + /* If A lags behind B, just advance it. */ + if (!a_elt || a_elt->indx == b_elt->indx) + { + changed = bitmap_elt_ior (a, a_elt, a_prev, a_elt, b_elt, changed); + b_elt = b_elt->next; + } + else if (a_elt->indx > b_elt->indx) + { + bitmap_element *b_elt_next = b_elt->next; + bitmap_list_unlink_element (b, b_elt, false); + bitmap_list_insert_element_after (a, a_prev, b_elt->indx, b_elt); + b_elt = b_elt_next; + } + + a_prev = *a_prev_pnext; + a_prev_pnext = &a_prev->next; + a_elt = *a_prev_pnext; +} + + gcc_checking_assert (!a->current == !a->first); + if (a->current) +a->indx = a->current->indx; + + if (b->obstack) +BITMAP_FREE (*b_); + else +bitmap_clear (b); + return changed; +} + /* DST = A ^ B */ void Index: gcc/bitmap.h === --- gcc/bitmap.h(revision 273795) +++ gcc/bit
Re: [PATCH] Improve PTA for PR91257, new bitmap_ior_into_and_free
On Mon, 29 Jul 2019, Richard Biener wrote: > > The following patch addresses appearant quadraticness in PTAs > SCC merging algorithm which does (simplified) > > for (node N in SCC) >bitmap_ior_into (pred(leader), pred(N)); > > with the linked list implementation this leads to increasingly large > amount of walking for pred(leader). > > The patch changes this to > > while (split /= 2) >for (node N in lower-half-of-SCC) > bitmap_ior_into_and_free (pred (N), pred (N in upper half of SCC)); > > so in each outer iteration reducing the number of bitmaps by a factor > of two and thus hopefully (and in practice for the testcase) reducing > the intermediate bitmap sizes we work on. > > This improves > > tree PTA : 8.38 ( 23%) 0.22 ( 33%) 8.62 ( > 24%)7679 kB ( 1%) > > to > > tree PTA : 5.30 ( 16%) 0.22 ( 32%) 5.54 ( > 16%) 28081 kB ( 4%) > > for the reduced testcase. > > I somehow didn't manage a 1:1 conversion so I need to re-check details > here still I'd like to hear comments about the new bitmap API > which is a source-destructive variant for bitmap_ior_into, instead of > copying elements not in A moving them from B and in the end releasing > the rest. Note using this API with keeping the merging as-is doesn't > help PTA time. This is what I applied. Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2019-07-30 Richard Biener PR tree-optimization/91257 * bitmap.h (bitmap_ior_into_and_free): Declare. * bitmap.c (bitmap_list_unlink_element): Add defaulted param whether to add the unliked element to the freelist. (bitmap_list_insert_element_after): Add defaulted param for an already allocated element. (bitmap_ior_into_and_free): New function. * tree-ssa-structalias.c (condense_visit): Reduce the ponts-to and edge bitmaps of the SCC members in a logarithmic fashion rather than all to one. Index: gcc/bitmap.c === --- gcc/bitmap.c(revision 273795) +++ gcc/bitmap.c(working copy) @@ -252,7 +252,8 @@ bitmap_list_link_element (bitmap head, b and return it to the freelist. */ static inline void -bitmap_list_unlink_element (bitmap head, bitmap_element *element) +bitmap_list_unlink_element (bitmap head, bitmap_element *element, + bool to_freelist = true) { bitmap_element *next = element->next; bitmap_element *prev = element->prev; @@ -279,18 +280,21 @@ bitmap_list_unlink_element (bitmap head, head->indx = 0; } - bitmap_elem_to_freelist (head, element); + if (to_freelist) +bitmap_elem_to_freelist (head, element); } -/* Insert a new uninitialized element into bitmap HEAD after element - ELT. If ELT is NULL, insert the element at the start. Return the - new element. */ +/* Insert a new uninitialized element (or NODE if not NULL) into bitmap + HEAD after element ELT. If ELT is NULL, insert the element at the start. + Return the new element. */ static bitmap_element * bitmap_list_insert_element_after (bitmap head, - bitmap_element *elt, unsigned int indx) + bitmap_element *elt, unsigned int indx, + bitmap_element *node = NULL) { - bitmap_element *node = bitmap_element_allocate (head); + if (!node) +node = bitmap_element_allocate (head); node->indx = indx; gcc_checking_assert (!head->tree_form); @@ -2009,6 +2013,56 @@ bitmap_ior_into (bitmap a, const_bitmap return changed; } +/* A |= B. Return true if A changes. Free B (re-using its storage + for the result). */ + +bool +bitmap_ior_into_and_free (bitmap a, bitmap *b_) +{ + bitmap b = *b_; + bitmap_element *a_elt = a->first; + bitmap_element *b_elt = b->first; + bitmap_element *a_prev = NULL; + bitmap_element **a_prev_pnext = &a->first; + bool changed = false; + + gcc_checking_assert (!a->tree_form && !b->tree_form); + gcc_assert (a->obstack == b->obstack); + if (a == b) +return false; + + while (b_elt) +{ + /* If A lags behind B, just advance it. */ + if (!a_elt || a_elt->indx == b_elt->indx) + { + changed = bitmap_elt_ior (a, a_elt, a_prev, a_elt, b_elt, changed); + b_elt = b_elt->next; + } + else if (a_elt->indx > b_elt->indx) + { + bitmap_element *b_elt_next = b_elt->next; + bitmap_list_unlink_element (b, b_elt, false); + bitmap_list_insert_element_after (a, a_prev, b_elt->indx, b_elt); + b_elt = b_elt_next; + } + + a_prev = *a_prev_pnext; + a_prev_pnext = &a_prev->next; + a_elt = *a_prev_pnext; +} + + gcc_checking_assert (!a->current == !a->first); + if (a->current) +a->indx = a->current->indx; + + if (b->obstack) +BITMAP_FREE (*b_); + else +bitmap_clear (b);