Take this code snippet from x264:

    h->mc.mc_luma( pix   , 64, m->p_fref, m->i_stride[0], omx, omy-1, bw, bh, &m->weight[0] );
    h->mc.mc_luma( pix+16, 64, m->p_fref, m->i_stride[0], omx, omy+1, bw, bh, &m->weight[0] );
    h->mc.mc_luma( pix+32, 64, m->p_fref, m->i_stride[0], omx-1, omy, bw, bh, &m->weight[0] );
    h->mc.mc_luma( pix+48, 64, m->p_fref, m->i_stride[0], omx+1, omy, bw, bh, &m->weight[0] );

After each call to mc_luma, gcc reloads h->mc.mc_luma, m->p_fref, and m->i_stride[0], even if restrict is used. It does this because it cannot prove at compile time that none of these values live in global data which mc_luma might modify. Because mc_luma is called through a function pointer, even link-time optimization may have trouble proving this sort of thing.

We could of course create local variables for all of these values, but when trying to optimize huge amounts of code, this quickly becomes ugly and messy.

A solution for this problem might be an intrinsic to tell gcc that a particular pointer never aliases global/static data. Whatever it points to could then be assumed to be unchanged across function calls, and so would not need to be reloaded.

Another, similar solution might be an intrinsic that says that a given function never modifies global state. This could be applied to a function pointer as well as to a function. It would move the annotation from the individual pointer to the individual function.

Alexander Strange suggested that some of this might be possible in 4.6 given the IPA-PTA optimization framework, so I'm curious whether these ideas are feasible or not.

--
Summary: Intrinsic possibility: does not alias global data
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: darkshikari at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43827
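
For reference, the local-variable workaround mentioned in the report can be sketched roughly as below. This is a reduced, self-contained illustration and not x264's actual code: the types encoder_t, me_t, and mc_luma_fn, the function refine_four, and the stand-in callee copy_mc_luma are all hypothetical, and the weight argument is dropped for brevity.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the x264 types. The real
 * definitions differ (for example, mc_luma also takes a weight). */
typedef void (*mc_luma_fn)(uint8_t *dst, int dst_stride,
                           uint8_t **src, int src_stride,
                           int mvx, int mvy, int w, int h);

typedef struct { struct { mc_luma_fn mc_luma; } mc; } encoder_t;
typedef struct { uint8_t **p_fref; int i_stride[3]; } me_t;

/* Hypothetical stand-in callee: copies a w x h block from plane 0 of
 * src at offset (mvx, mvy) into dst. */
void copy_mc_luma(uint8_t *dst, int dst_stride,
                  uint8_t **src, int src_stride,
                  int mvx, int mvy, int w, int h)
{
    const uint8_t *s = src[0] + mvy * src_stride + mvx;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * dst_stride + x] = s[y * src_stride + x];
}

/* Workaround: load each value once into a local. The callee cannot
 * modify locals whose address is never taken, so the compiler has no
 * reason to reload them between the indirect calls. */
void refine_four(encoder_t *h, me_t *m, uint8_t *pix,
                 int omx, int omy, int bw, int bh)
{
    const mc_luma_fn mc_luma = h->mc.mc_luma;
    uint8_t **const  fref    = m->p_fref;
    const int        stride  = m->i_stride[0];

    mc_luma(pix,      64, fref, stride, omx,     omy - 1, bw, bh);
    mc_luma(pix + 16, 64, fref, stride, omx,     omy + 1, bw, bh);
    mc_luma(pix + 32, 64, fref, stride, omx - 1, omy,     bw, bh);
    mc_luma(pix + 48, 64, fref, stride, omx + 1, omy,     bw, bh);
}
```

Note that this changes behaviour if the callee really does modify h->mc.mc_luma, m->p_fref, or m->i_stride[0]. That is the same promise the proposed intrinsic would make, stated by hand at every call site.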