> Sure, reverting something that has a bug is reasonable.  The problem is 
> how this happened: this code clearly did not get tested, as it doesn't 
> even build and re-introduces a bunch of ICEs.  We're very late in stage 
> 4 and this is the second time the entire port has been broken in as many 
> weeks.  That's a really bad time to be breaking things.

Thanks Palmer to point this out. Let me share more details about the process.
The revert is not just a git command like git revert commit-id. Consider below 
scenario.

Day-1: commit-A: rename a attribute.
Day-2: commit-B: other patches reference this renamed attribute but no need to 
revert.

To follow the suggestion of Robin that keeping the revert commit pure, I 
followed below step(s)

1. Revert commit-A locally, and address the reference issue.
2. Run fully test to see if any surprise (take 80-90 mins for 64 cores. Sorry 
again only c/c++ parts, Fortran is missing here).
3. Revert commit-A to upstream (of course it will result in built failure).
4. File patch to address the reference attribute issue.

The other revert almost take similar step(s) but looks some mistake here.
Sorry for that each test result is comparing between code change and upstream, 
and it will be missed after
we moving on.

> Do you guys have a fix for the regressions that showed up over the 
> weekend?

Understand current we are in the sensitive stage, will take care of the 
failures ASAP.

Pan

-----Original Message-----
From: Palmer Dabbelt <pal...@dabbelt.com> 
Sent: Tuesday, April 23, 2024 8:43 AM
To: juzhe.zh...@rivai.ai
Cc: Patrick O'Neill <patr...@rivosinc.com>; Li, Pan2 <pan2...@intel.com>; Robin 
Dapp <rdapp....@gmail.com>; gcc-patches@gcc.gnu.org; Kito Cheng 
<kito.ch...@gmail.com>
Subject: Re: Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and 
xfail tests

On Mon, 22 Apr 2024 15:07:59 PDT (-0700), juzhe.zh...@rivai.ai wrote:
> Apologize that we didn't post our (me, kito and Li Pan) disscussions.

Some amount of off-list discussion is inevitable, but as far as I can 
tell here we ended up with some code committed that wasn't even posted 
to the lists and didn't even build.  I don't know exactly where the bar 
for public discussions is, but it's got to be somewhere higher than 
that.

> This is the story:
> We found that my previous patches which support highpart register overlap 
> with register filter for instructions like (vwadd.wv)
> cause ICE reported by:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714
> and this is obviously a regression (No ICE on GCC 13.2, but ICE on GCC 14)
> 
> We have tried several fixes to work around this ICE, however, we failed.
> And also I found my previous patches are quite wrong which is not the perfect 
> solution to support register group overlap
> for vwadd.wv. 
> So, finally we decide to revert those patches.

Sure, reverting something that has a bug is reasonable.  The problem is 
how this happened: this code clearly did not get tested, as it doesn't 
even build and re-introduces a bunch of ICEs.  We're very late in stage 
4 and this is the second time the entire port has been broken in as many 
weeks.  That's a really bad time to be breaking things.

I think we're really set the wrong precedent on what the bar is for 
review here.  We had a lot of breakages early on in the 14 development 
cycle and never really dug into that, I was hoping that getting CI set 
up would be a strong enough hint to stop the breakages.  Clearly that 
didn't work, though, so:

Please stop breaking the port.

It's exceedingly rare that any patch needs to be committed minutes after 
it's posted.  These port-wide breakages are really the only thing where 
it could be agreed that's the way to go, but as we can see here rushing 
is as likely to dig a bigger hole as it is to fix things.  I get the 
testsuite can be kind of hard to run, but if you don't want to run it 
locally you can just wait for the CI to do it for you.  That's not 
really asking for very much.

> Kito knows the details of this story, kito can share more details in GNU 
> patche meeting.

Ya, we can talk tomorrow morning.

Do you guys have a fix for the regressions that showed up over the 
weekend?

Either way I'd prefer to go with reverting all this and then taking 
Robin's more self-contained fix.  If you guys want to do a bigger change 
later that's fine, we're just really close to the release and it's not a 
good time to risk breaking things.  We've only had a few days of a 
functioning port over the last week or two, that's already put us behind 
on the distro prerelases/RCs so I'm kind of worried something else has 
slipped in.

> 
> Thanks.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Patrick O'Neill
> Date: 2024-04-23 01:20
> To: Li, Pan2; Robin Dapp; gcc-patches@gcc.gnu.org
> CC: juzhe.zh...@rivai.ai; kito.ch...@gmail.com
> Subject: Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail 
> tests
> Hi Pan,
> I'm not sure I'm following.  Did we miss something that should have been
> covered?  Like only an overlap on the srcs but not the dest?
> Are there testcases that fail?  If so we should definitely have one.
> Can you give some additional information on why these reverts are needed?
> +1 to the request for a failing testcase if there is one. Patrick If 
> something is broken then indeed we should revert it.
> Yes, we may need to support these in gcc-15.
> ... why not just revert everything and xfail all the tests in a
> follow up?  Your patch is essentially a revert but doesn't look like
> it.  I'd rather we let a revert be a revert and adjust the tests
> separately so it becomes clear.
> Sure, will revert b3b2799b872 and then file the patch for the xfail tests.
> Pan
> 

Reply via email to