Hi Anastasia, list,

On Thu, Apr 25, 2024 at 10:26 AM Anastasia Klimchuk <a...@chromium.org> wrote:
>
> Angel, great to see you here again! Also, you provided very useful 
> information.
>
> > Did you conside making it so that only SPI programmers / flash chips
> > skip the delay? The SPI bus' strict timing leaves no room for this
> > problem to occur, so it should be safe to skip this delay. And this
> > would keep non-SPI unaffected (which is most likely what needs this
> > extra delay).
>
> This sounds interesting.
> If we check for non-SPI chips, it only adds one more line to the patch
> (if I understand your idea correctly):
>
> if (flash->chip->bustype == BUS_NONSPI)

I haven't looked into the specifics at all, but that's the idea. It
would be good to test that the delay is still executed on non-SPI
flash chips. I think `BUS_NONSPI` actually encompasses multiple buses,
so a simple equality test might not work. I would try something like
this:

/*
 * Work around chips which need some time to calm down.
 * Not needed for SPI flash chips because of the bus' strict timing.
 */
if (flash->chip->bustype != BUS_SPI)
        programmer_delay(flashctx, 1000*1000);

[The following is a textual representation of my thought processes;
it's not particularly necessary but I felt it could be interesting]

My idea is to maintain backwards compatibility while still enabling
new features and improvements. To do this, my very general approach is
to look for "cut-off points", i.e. some criteria to differentiate
between "preserve existing behaviour to avoid regressions" and "add
new features and/or make existing stuff better", seeking to maximise
the extent of the improvements (e.g. improving SPI support in general
instead of artificially restricting the improvements to a few
chips/programmers) while also trying to minimise the chance of
regressions (which can be fatal in the case of flashrom: some LPC/FWH
flash chips are TSOP32, externally reflashing them sounds very
not-fun). Analytically solving this is most likely impossible, but for
specific cases it's possible to find a reasonably acceptable ("good
enough") balance with the usual problem-solving tools: reason (e.g.
check against specifications, test things, etc.), intuition (useful as
a heuristic to get "directions" to explore further), and communication
(e.g. this very mailing list thread).

In this case (removing the delay in the verify path), my thoughts were
(not necessarily in order):
- I feel this delay was introduced for a good reason (even though I
have no idea what the reason is), so removing it will probably break
something.
- But do we have to remove the delay for *everything*, or can we
conditionally skip it in some circumstances? This isn't a strict
yes/no scenario.
- Looks like the messages proposing this change pretty much only talk
about flashing SPI chips, maybe this delay relates to the bus somehow.
Plus, the delay was introduced a long time ago.
- The SPI bus has very strict timing requirements (emulating SPI flash
isn't easy). However, Non-SPI flash chips don't have the same
requirements (older laptop ECs used SPI flash chips, but exposed the
flash contents to the chipset over LPC).
- It's likely this delay isn't needed for SPI because of all of this,
so let's consider skipping the delay for SPI. It's what most (but not
all) people use these flashrom for these days.
- Even if some non-SPI chips don't need the delay, it's much harder to
test this. Plus, keeping the delay in there for non-SPI won't hurt.

> On Thu, Apr 25, 2024 at 7:35 AM Angel Pons <th3fan...@gmail.com> wrote:
> >
> > Hi Brian, list,
> >
> > Thanks for bringing this up on the mailing list.
> >
> > On Wed, Apr 24, 2024 at 9:16 PM Brian Norris <briannor...@chromium.org> 
> > wrote:
> > >
> > > Background: https://review.coreboot.org/c/flashrom/+/80807
> > >
> > > A long time ago, in the pre-git times [1], flashrom added a 1 second
> > > delay to all verification, and claimed that some chips "need some time
> > > to calm down." The commit message claims it "fixes a few reports where
> > > verify directly after erase had unpleasant side effects like
> > > corrupting flash or at least getting incorrect verify results." It
> > > provides no details of what systems, chips, or programmers were
> > > involved.
> >
> > Back then, SPI flash chips weren't as ubiquitous as they currently
> > are. This workaround is most likely for Parallel/LPC/FWH flash chips,
> > which can actually be quite weird.
> >
> > > This delay remains in the --verify path today, and IMO, it's a big
> > > waste of time. If there are truly chips that are, say, deasserting the
> > > BUSY line before they're truly finished programming ... well, we
> > > should track those chips down and add targeted quirk flags for them.
> > > We shouldn't penalize all flashrom users in all cases for all time.
> >
> > I agree that this delay shouldn't be unconditional.
> >
> > > Personally, I highly doubt that this delay is still relevant today --
> > > there may have been a bug in some programmer that has since been
> > > fixed; there may have been some malfunctioning system that is no
> > > longer in use; ... or it's possible this is still hiding a real buggy
> > > chip somewhere out there.
> >
> > If we don't know for sure whether this delay is no longer needed, I
> > wouldn't risk re-introducing issues in flashrom.
> >
> > > I propose: we still remove the delay. There's plenty of description in
> > > the above Gerrit link about why, and how we can handle regressions.
> > > (For one, it's relatively simple to split up a --verify operation into
> > > its constituent --write/sleep/--read operations, for debugging.)
> > >
> > > The request:
> > > 1. Tests: I've tested a few Chromebooks, but imagine this had to have
> > > been some more esoteric system. Extra testing is welcome.
> >
> > Do Chromebooks use non-SPI flash chips? They probably don't.
> >
> > > 2. Thoughts: does my proposal make sense? Am I missing something obvious?
> >
> > Did you conside making it so that only SPI programmers / flash chips
> > skip the delay? The SPI bus' strict timing leaves no room for this
> > problem to occur, so it should be safe to skip this delay. And this
> > would keep non-SPI unaffected (which is most likely what needs this
> > extra delay).
> >
> > > 3. Awareness: if nothing else, this email may serve to highlight in
> > > case we land the above patch and later hear back that there are some
> > > sketchy --verify reports from users.
> >
> > I agree that awareness is important, but bear in mind that external
> > programmers for non-SPI flash chips are increasingly rare, so
> > recovering from a brick can be quite convoluted (e.g. having to
> > desolder a TSOP32 flash chip to reflash it somehow).
> >
> > > Thanks,
> > > Brian
> > >
> > > [1] The svn->git commit in question:
> > >
> > >   commit 8ab49e72af8465d4527de2ec37b22cd44f7a1169
> > >   Date:   Wed Aug 19 13:55:34 2009 +0000
> > >   Disallow erase/write for known bad chips so people won't try without
> > > a clear understanding
> > > _______________________________________________
> > > flashrom mailing list -- flashrom@flashrom.org
> > > To unsubscribe send an email to flashrom-le...@flashrom.org
> >
> > Best regards,
> > Angel
> > _______________________________________________
> > flashrom mailing list -- flashrom@flashrom.org
> > To unsubscribe send an email to flashrom-le...@flashrom.org
>
> --
> Anastasia.

Best regards,
Angel
_______________________________________________
flashrom mailing list -- flashrom@flashrom.org
To unsubscribe send an email to flashrom-le...@flashrom.org

Reply via email to