Felix Fietkau wrote:
> > It is, because the error leads to either or both sides thinking about
> > DMA when they should not.
> 
> Either or both sides of what? Thinking about DMA? That sentence
> unfortunately makes no sense to me.

One side = driver
Other side = hardware


> Most of the time when the driver tries to stop DMA, the hardware doesn't
> respond in time, because either the MAC or the host interface state has
> locked up. It can also be caused by the MAC not fully waking up from a
> sleep state (hence the powersave related suggestion).

I should have added "or not thinking about DMA when they should"
above. This is of course just as likely.


> In the early stages of debugging, you will usually *never* get data
> points that only mean one thing. One data point leads to ideas for
> further tests

Yes it can, but the issue I have with the powersave suggestion is
that I haven't seen followup discussion after the test has been done.

So the takeaway becomes "you can maybe avoid the problem by disabling
powersave" instead of "if you first disable powersave and that makes
the issue go away then you can look at this-and-this and poke
there-and-there in order to make the driver do that-and-that which
will show what to do next."


> I believe acquiring *all* data at once is impossible (or at least
> completely impractical), and I believe that the structured approach of
> going up one level at a time is horribly inefficient and often grossly
> misleading for bugs that show low level symptoms but have a high level
> causes.

It is an investment into understanding both the hardware and the
code, which pays off quicker and quicker as the number of errors
found goes up. Of course it's a big effort when starting from zero
as the community always does. This is where the sharing of experience
by Atheros engineers is critical for any semblance of efficiency.


> Incidentally, the 'trying to turn knobs to see what happens' approach
> plus code review have been very efficient for me in dealing with that
> class of bugs, and I'll take that over a random guy's random theory of
> how debugging should and *must* be done any day.

As I wrote in the last email, it's not the only way, but it's the only
way to have a complete picture of what is going on. Obviously getting
there takes more effort than turning knobs, but the effort pays off.


> So one of the differences between us appears to be this: You advocate
> that there is one way that things must be done, and if the prerequisites
> for that approach way aren't met, then it's impossible.

No, I never said impossible. Please make more of an effort not to put
words in my mouth. I'm talking about what is a reliable and efficient
way to reach a driver that works perfectly.


> My opinion is that this is nothing but a lame excuse,

That's fine of course.


> proven by the fact that I've been able to easily deal with similar
> situations in other drivers, and that I know people that have found
> and fixed some weird bugs in ath9k without having had any access to
> documentation and without having spent weeks on what you call
> 'reverse engineering'.

Unfortunately all this says is that the ath9k code was really
*really* messy to begin with. :( This is already known, and it's
also known that you have done a lot of work to clean it up.

I don't think it would take weeks, but I think that loading ath9k
into one's head without help would take several days. That's not
efficient when others already have it loaded and can share information
and experience.

I'm not contesting that you've done good work. I'm saying that if
there are difficult-to-find errors then having a complete picture of
state and comparing the desired behavior with the observed behavior
is unbeatable. But it requires knowing what desired behavior is,
which requires information about the hardware.


> I will now refrain from any further discussion with you on debugging
> approaches, since you seem quite comfortable and content in staying
> within your view of limitations and impossibilities, whereas I
> prefer to get some real work done. :)

I think you may be misunderstanding what I write. But I agree on
getting work done; our discussion about preferred debugging technique
does not change anything either way.


//Peter
_______________________________________________
ath9k-devel mailing list
ath9k-devel@lists.ath9k.org
https://lists.ath9k.org/mailman/listinfo/ath9k-devel
