Hi Russell,
> Now because things have changed during the last merge window, I've got
> an even bigger problem sorting through that patch set and getting it
> back into a submittable state. I've just sent out v2 for it onto the
> net...@vger.kernel.org mailing list.
>
> The initial version (marked
Hi Mattis,
On Fri, Aug 29, 2014 at 7:57 AM, Mattis Lorentzon
wrote:
> Iain,
>
>> Interesting. We obviously have some differences in how we boot, my
>> changes to your config to get it to boot basically amount to reverting the
>> patch you attached and then enabling sata and mmc. So far I've been
Iain,
> Interesting. We obviously have some differences in how we boot, my
> changes to your config to get it to boot basically amount to reverting the
> patch you attached and then enabling sata and mmc. So far I've been unable
> to get your config to fail.
Our version of U-boot doesn't support
On 27/08/14 07:32, Mattis Lorentzon wrote:
> Hi Iain, Russell and Fabio,
>
>> The config is attached. Note that there's a lot of additional stuff enabled
>> as I'm aiming for a single general purpose kernel that covers i.MX6, AM3359,
>> Allwinner A10/A20 along with several versions of boards u
Hi Iain, Russell and Fabio,
> The config is attached. Note that there's a lot of additional stuff enabled as
> I'm aiming for a single general purpose kernel that covers i.MX6, AM3359,
> Allwinner A10/A20 along with several versions of boards using those
> particular SoCs.
>
> Same kernel binary o
On 21/08/14 10:39, Iain Paton wrote:
> On 19/08/14 07:03, Iain Paton wrote:
>> On 17/08/14 22:46, Fabio Estevam wrote:
>>> Iain,
>>>
>>> On Sun, Aug 17, 2014 at 6:34 PM, Iain Paton wrote:
On 15/08/14 06:42, Mattis Lorentzon wrote:
> We mostly run SSH with benchmarks using NFS, it can
On 25/08/14 11:18, Russell King - ARM Linux wrote:
> On Wed, Aug 13, 2014 at 01:39:27PM +, Mattis Lorentzon wrote:
>> All our tests seem to behave the same way on the Sabrelite as on our own
>> board. A working theory is that the switch (3Com Switch 4400) triggers the
>> degeneration of
On Wed, Aug 13, 2014 at 01:39:27PM +, Mattis Lorentzon wrote:
> All our tests seem to behave the same way on the Sabrelite as on our own
> board. A working theory is that the switch (3Com Switch 4400) triggers the
> degeneration of the network stack from which Linux does not seem to recov
On 22/08/14 01:01, Fabio Estevam wrote:
> On Thu, Aug 21, 2014 at 6:39 AM, Iain Paton wrote:
>
>> two and a half days of running this against both a sabre-lite and a
>> wandboard quad B1 and I still have no reason to think there's any
>> sort of a problem.
>>
>> Up to now, my testing has been don
On Thu, Aug 14, 2014 at 02:43:56PM +, Mattis Lorentzon wrote:
> Fabio and Russell,
>
> > A working theory is that the switch (3Com Switch 4400) triggers the
> > degeneration of the network stack from which Linux does not seem to
> > recover, even if we later bypass the switch and directly conn
Fabio,
> What is the silicon version of the mx6 in your sabrelite? What GCC version do
> you use?
The silicon version is PCIMX6Q6AVT10AA and the GCC version we use is
arm-none-eabi-gcc (Fedora 2013.11.24-2.fc19) 4.8.1.
Iain,
> Up to now, my testing has been done with my own config, I'll now
> r
On Thu, Aug 21, 2014 at 6:39 AM, Iain Paton wrote:
> two and a half days of running this against both a sabre-lite and a
> wandboard quad B1 and I still have no reason to think there's any
> sort of a problem.
>
> Up to now, my testing has been done with my own config, I'll now
> repeat the whole
On 19/08/14 07:03, Iain Paton wrote:
> On 17/08/14 22:46, Fabio Estevam wrote:
>> Iain,
>>
>> On Sun, Aug 17, 2014 at 6:34 PM, Iain Paton wrote:
>>> On 15/08/14 06:42, Mattis Lorentzon wrote:
>>>
We mostly run SSH with benchmarks using NFS, it can probably be
triggered by using only SSH
On 17/08/14 22:46, Fabio Estevam wrote:
> Iain,
>
> On Sun, Aug 17, 2014 at 6:34 PM, Iain Paton wrote:
>> On 15/08/14 06:42, Mattis Lorentzon wrote:
>>
>>> We mostly run SSH with benchmarks using NFS, it can probably be
>>> triggered by using only SSH with the following loop:
>>>
>>> # while : ;
Iain,
On Sun, Aug 17, 2014 at 6:34 PM, Iain Paton wrote:
> On 15/08/14 06:42, Mattis Lorentzon wrote:
>
>> We mostly run SSH with benchmarks using NFS, it can probably be
>> triggered by using only SSH with the following loop:
>>
>> # while : ; do ssh arm-card date; done
>
> Mattis,
>
> What sort
On 15/08/14 06:42, Mattis Lorentzon wrote:
> We mostly run SSH with benchmarks using NFS, it can probably be
> triggered by using only SSH with the following loop:
>
> # while : ; do ssh arm-card date; done
Mattis,
What sort of time does it take for you to see a problem?
I've been running the
Fabio,
> Do the stalls also happen on a pure 3.16 kernel?
Yes, we just tried this out overnight and we get the same stalls here.
We have seen similar problems on a Zynq-based board. It might be
worth noting that a common chip between all three boards is, for
example, the KSZ9021RN, while the FEC
On Thu, Aug 14, 2014 at 11:43 AM, Mattis Lorentzon
wrote:
> After a few more tests we have finally been able to trigger the exact same
> stalls on the Sabrelite board with a direct network connection (i.e. without
> the switch).
Do the stalls also happen on a pure 3.16 kernel?
How can we re
Fabio and Russell,
> A working theory is that the switch (3Com Switch 4400) triggers the
> degeneration of the network stack from which Linux does not seem to
> recover, even if we later bypass the switch and directly connect the board to
> the server machine.
After a few more tests we have final
Fabio and Russell,
> In order to try to narrow down whether this is a board issue, could you try to
> run the same kernel on a mx6q development board, such as mx6qsabresd,
> cubox-i, wandboard, etc?
Indeed, we have a Sabrelite development board and have run the same kernel
configuration (please f
On Mon, Aug 11, 2014 at 10:32 AM, Mattis Lorentzon
wrote:
> Russell and Fabio,
>
>> I'd be interested to hear whether removing the
>>
>> interrupts-extended = ...
>>
>> property from your board's DT file, thereby causing you to revert back to the
>> default I list above, also fixes the insta
Russell and Fabio,
> I'd be interested to hear whether removing the
>
> interrupts-extended = ...
>
> property from your board's DT file, thereby causing you to revert back to the
> default I list above, also fixes the instability you are seeing.
We have tried to remove the board specific
On Thu, Aug 07, 2014 at 01:12:48PM +0100, Russell King - ARM Linux wrote:
> On Thu, Aug 07, 2014 at 11:11:06AM +, Mattis Lorentzon wrote:
> > Russell,
> >
> > > Can you ascertain whether these stalls are a result of some failure of the
> > > receive side or the transmit side - you should be ab
Mattis,
On Thu, Aug 7, 2014 at 11:20 AM, Fabio Estevam wrote:
> On Thu, Aug 7, 2014 at 9:12 AM, Russell King - ARM Linux
> wrote:
>
>> Hmm, I'm slightly confused. On my iMX6Q, I have:
>>
>> 150: 581754 0 0 0 GIC 150 2188000.ethernet
>> 151: 0
On 8/7/2014 7:38 AM, Fabio Estevam wrote:
> On Thu, Aug 7, 2014 at 11:20 AM, Fabio Estevam wrote:
>
> ,but I am wondering if we should also do:
>
> --- a/arch/arm/boot/dts/imx6qdl-sabreauto.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-sabreauto.dtsi
> @@ -66,6 +66,7 @@
> pinctrl-0 = <&pinctrl_
On Thu, Aug 7, 2014 at 11:20 AM, Fabio Estevam wrote:
> On an imx6q sabreauto I also get:
>
> 151: 0 0 0 0 GIC 151 2188000.ethernet
> 166: 4577 0 0 0 gpio-mxc 6 2188000.ethernet
>
> and the GPIO1_6 interrupt come
On Thu, Aug 7, 2014 at 9:12 AM, Russell King - ARM Linux
wrote:
> Hmm, I'm slightly confused. On my iMX6Q, I have:
>
> 150: 581754 0 0 0 GIC 150 2188000.ethernet
> 151: 0 0 0 0 GIC 151 2188000.ethernet
Same h
On Thu, Aug 07, 2014 at 11:11:06AM +, Mattis Lorentzon wrote:
> Russell,
>
> > Can you ascertain whether these stalls are a result of some failure of the
> > receive side or the transmit side - you should be able to tell that if you
> > watch the packet counts via ifconfig on the stalled
Russell,
> Can you ascertain whether these stalls are a result of some failure of the
> receive side or the transmit side - you should be able to tell that if you
> watch the packet counts via ifconfig on the stalled card. Also, it would be
> useful to know whether the FEC interrupt was fir
On Wed, Aug 06, 2014 at 11:10:06AM +, Mattis Lorentzon wrote:
> Russell,
>
> > What is on the other end of the link?
>
> 16 ARM cards connected to a 3Com Switch 4400 connected to a Linux FC 20
> machine (Intel Corporation 82541PI Gigabit Ethernet Controller rev 05).
>
> There may be multiple
Russell,
> What is on the other end of the link?
16 ARM cards connected to a 3Com Switch 4400 connected to a Linux FC 20
machine (Intel Corporation 82541PI Gigabit Ethernet Controller rev 05).
There may be multiple problems. The backtrace has only been seen a few
times, on two different cards. M
On Tue, Aug 05, 2014 at 01:31:29PM +, Mattis Lorentzon wrote:
> We have applied your V2 patch set of 30 patches on top of v3.16-rc2 and are
> currently running some stability tests.
>
> During our first test round we triggered a timeout which caused the fec driver
> to become unresponsive for
Hi Fabio,
> Could this problem be the same one as reported at:
> http://www.spinics.net/lists/arm-kernel/msg347914.html ?
The problem you link to describes a permanent issue; our problem seems
to be sporadic, as most of our tests work fine (at least for a while).
> Which Ethernet PHY do you use?
On Tue, Aug 5, 2014 at 10:31 AM, Mattis Lorentzon
wrote:
> We have applied your V2 patch set of 30 patches on top of v3.16-rc2 and are
> currently running some stability tests.
>
> During our first test round we triggered a timeout which caused the fec driver
> to become unresponsive for several
Hi Russell!
> Now because things have changed during the last merge window, I've got an
> even bigger problem sorting through that patch set and getting it back into a
> submittable state. I've just sent out v2 for it onto the
> net...@vger.kernel.org mailing list.
>
> The initial version (marked
Hi Russell,
> -----Original Message-----
> > The initial version (marked RFC) attracted very little interest from
> > testers, or acks. I'd very much like to have some testing of it, so
> > if you want to try it out, I can provide you with a git URL, patches
> > or a combined patch.
>
> Sure! A
On 06/30/2014 07:30 AM, Fredrik Noring wrote:
>>
>> On Fri, Jun 27, 2014 at 04:16:57PM +, Fredrik Noring wrote:
>>> Please find below a trace that appeared once with 3.16-rc2. Perhaps it
>>> is of some interest?
>>
>> It's not that serious... I know that the FEC ethernet driver is horrendously
Hi Russell,
It seems to be a compiler issue, where (GCC) 4.8.2 does not produce a properly
working kernel. Happily, (Fedora 2013.11.24-2.fc19) 4.8.1 appears to do a lot
better. No crashes so far with v3.16-rc2!
All the best,
Fredrik
> -----Original Message-----
> Hi Fredrik,
>
> On Fri, Jun 27,
Hi Russell,
> -----Original Message-----
> It's not that serious... I know that the FEC ethernet driver is horrendously
> racy (I have had a patch set for about the last six months which fixes some of
> its problems) but as I've had a lot of patches to deal with, and it's been
> pushed to the back
Hi Russell,
> On Thu, Jun 26, 2014 at 04:14:24PM +0100, Russell King - ARM Linux wrote:
> > That's a similar workload to the one which is mentioned in the
> > previous report. I've just set a similar transfer going, but this
> > will be a 16GB file.
>
> I've run this transfer several times, but s
Hi Fredrik,
On Fri, Jun 27, 2014 at 04:16:57PM +, Fredrik Noring wrote:
> Please find below a trace that appeared once with 3.16-rc2. Perhaps it is of
> some interest?
It's not that serious... I know that the FEC ethernet driver is
horrendously racy (I have had a patch set for about the last
On Thu, Jun 26, 2014 at 04:14:24PM +0100, Russell King - ARM Linux wrote:
> On Thu, Jun 26, 2014 at 02:44:52PM +, Mattis Lorentzon wrote:
> > We have managed to trigger the Oops by just transferring a large file
> > over nfs
> > cat /mnt/foo > /dev/null
> > where foo is a file that is approxima
On Thu, Jun 26, 2014 at 02:44:52PM +, Mattis Lorentzon wrote:
> Thank you for your reply,
>
> > On Wed, Jun 25, 2014 at 01:55:05PM +, Mattis Lorentzon wrote:
> > > I have a similar issue with v3.16-rc2 as previously reported by Waldemar
> > > Brodkorb for v3.15-rc4.
> > > https://lkml.org/lk
Thank you for your reply,
> On Wed, Jun 25, 2014 at 01:55:05PM +, Mattis Lorentzon wrote:
> > I have a similar issue with v3.16-rc2 as previously reported by Waldemar
> > Brodkorb for v3.15-rc4.
> > https://lkml.org/lkml/2014/5/9/330
>
> This URL returns no useful information. I find that lkml
On Wed, Jun 25, 2014 at 01:55:05PM +, Mattis Lorentzon wrote:
> Hello kernel people,
You may wish to also copy linux-arm-ker...@lists.infradead.org, which is
where ARM kernel people are.
> I have a similar issue with v3.16-rc2 as previously reported by Waldemar
> Brodkorb for v3.15-rc4.
> ht