Hi Jeff,
On Thu, Sep 17, 2020 at 05:12:17PM -0600, Jeff Law wrote:
> On 9/3/20 4:37 PM, Segher Boessenkool wrote:
> >> Apart from that, one P9 specific point is that the update form load isn't
> >> preferred, the reason is that the instruction can not retire until both
> >> parts complete, it can
On 9/3/20 4:37 PM, Segher Boessenkool wrote:
>> Apart from that, one P9 specific point is that the update form load isn't
>> preferred, the reason is that the instruction can not retire until both
>> parts complete, it can hold up subsequent instructions from retiring.
>> If the addi stalls (star
Hi Hans,
on 2020/9/6 10:47 AM, Hans-Peter Nilsson wrote:
> On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote:
>>> Great idea! With explicitly specified -funroll-loops, it's bootstrapped
>>> but the regression testing did show one failure (the only one):
>>>
>>> PASS->FAIL: gcc.dg/sms-4.c scan-
On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote:
> > Great idea! With explicitly specified -funroll-loops, it's bootstrapped
> > but the regression testing did show one failure (the only one):
> >
> > PASS->FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms "SMS succeeded" 1
> >
> > It exposes two
Hi Segher,
on 2020/9/4 10:16 PM, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Sep 04, 2020 at 04:47:37PM +0800, Kewen.Lin wrote:
Apart from that, one P9 specific point is that the update form load isn't
preferred, the reason is that the instruction can not retire until both
parts co
Hi!
On Fri, Sep 04, 2020 at 04:47:37PM +0800, Kewen.Lin wrote:
> >> Apart from that, one P9 specific point is that the update form load isn't
> >> preferred, the reason is that the instruction can not retire until both
> >> parts complete, it can hold up subsequent instructions from retiring.
> >
Hi Bin,
On Fri, Sep 04, 2020 at 04:27:32PM +0800, Bin.Cheng wrote:
> On Fri, Sep 4, 2020 at 6:37 AM Segher Boessenkool
> wrote:
> > It should have cost, certainly, but not address_cost I think. The total
> > cost of an ldu should be a tiny bit less than that of ld + that of addi;
> > the address
Hi Segher,
>> Good question! I agree that they can execute in parallel, but it depends
>> on how we interpret the addressing cost, if it's for required execution
>> resource, I think it's off, since comparing with ld, the ldu has two iops
>> and extra ALU requirement.
>
> OTOH, if you do not us
On Fri, Sep 4, 2020 at 6:37 AM Segher Boessenkool
wrote:
>
> On Thu, Sep 03, 2020 at 10:24:21AM +0800, Kewen.Lin wrote:
> > on 2020/9/2 6:25 PM, Segher Boessenkool wrote:
> > > On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote:
> > >> on 2020/9/1 3:41 AM, Segher Boessenkool wrote:
> > >>> On
On Thu, Sep 03, 2020 at 10:24:21AM +0800, Kewen.Lin wrote:
> on 2020/9/2 6:25 PM, Segher Boessenkool wrote:
> > On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote:
> >> on 2020/9/1 3:41 AM, Segher Boessenkool wrote:
> >>> On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote:
> 1) Cur
Hi Segher,
on 2020/9/2 6:25 PM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote:
>> on 2020/9/1 3:41 AM, Segher Boessenkool wrote:
>>> On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote:
1) Currently address_cost hook on rs6000 always returns
Hi!
On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote:
> on 2020/9/1 3:41 AM, Segher Boessenkool wrote:
> > On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote:
> >> 1) Currently address_cost hook on rs6000 always returns zero, but at least
> >> from Power7, pre_inc/pre_dec kind instru
Hi Bin,
I've updated the patch to punt ainc_use candidates as below:
> + /* Skip AINC candidate since it contains address update itself,
> +the replicated AINC computations when unrolling still have
> +updates, unlike reg_offset_p candidates ca
On Wed, Sep 2, 2020 at 11:50 AM Kewen.Lin wrote:
>
> Hi Bin,
>
> >> 2) This case makes me think we should exclude ainc candidates in function
> >> mark_reg_offset_candidates. The justification is that: ainc candidate
> >> handles step update itself and when we calculate the cost for it against
>
Hi Bin,
>> 2) This case makes me think we should exclude ainc candidates in function
>> mark_reg_offset_candidates. The justification is that: ainc candidate
>> handles step update itself and when we calculate the cost for it against
>> its ainc_use, the cost_step has been reduced. When unrolling
Hi Segher,
on 2020/9/1 3:41 AM, Segher Boessenkool wrote:
> Hi!
>
> Just a note:
>
> On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote:
>> 1) Currently address_cost hook on rs6000 always returns zero, but at least
>> from Power7, pre_inc/pre_dec kind instructions are cracked, it means we
>
On Tue, Aug 25, 2020 at 8:47 PM Kewen.Lin wrote:
>
> Hi Bin,
>
> >>
> >> For one particular case like:
> >>
> >> for (i = 0; i < SIZE; i++)
> >> y[i] = a * x[i] + z[i];
> >>
> >> we will mark reg_offset_p for IV candidates on x as below:
> >>- (unsigned long) (x_18(D)
Hi!
Just a note:
On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote:
> 1) Currently address_cost hook on rs6000 always returns zero, but at least
> from Power7, pre_inc/pre_dec kind instructions are cracked, it means we
> have to take the address update into account (scalar normal operation
Hi Bin,
>>
>> For one particular case like:
>>
>> for (i = 0; i < SIZE; i++)
>> y[i] = a * x[i] + z[i];
>>
>> we will mark reg_offset_p for IV candidates on x as below:
>>- (unsigned long) (x_18(D) + 8) // only mark this before.
>>- x_18(D) + 8
>>- (unsigne
19 matches