On 06/17/2015 05:12 PM, Dinh Nguyen wrote:
> On 06/17/2015 04:30 PM, Russell King - ARM Linux wrote:
>> On Wed, Jun 17, 2015 at 03:35:13PM -0500, Dinh Nguyen wrote:
>>> On Mon, Jun 1, 2015 at 6:50 AM, Geert Uytterhoeven <ge...@linux-m68k.org> wrote:
>>>> Hi Russell,
>>>>
>>>> On Mon, Jun 1, 2015 at 12:53 PM, Russell King - ARM Linux
>>>> <li...@arm.linux.org.uk> wrote:
>>>>> On Mon, Jun 01, 2015 at 12:41:01PM +0200, Geert Uytterhoeven wrote:
>>>>>> FWIW, I have the feeling this has a slight influence on boot
>>>>>> reliability on two of my boards:
>>>>>>   - r8a7740/armadillo, which is known to suffer from a cache-related
>>>>>>     bug in its bootloader, seems to have a higher chance of booting
>>>>>>     successfully on cold boot,
>>>>>>   - sh73a0/kzm9g, which has known cache issues with secondary CPU
>>>>>>     boot-up, seems to have a lower chance of booting successfully.
>>>>>>
>>>>>> No time to spend all week turning this into a statistically
>>>>>> significant test project... The reset button is my friend...
>>>>>
>>>>> Damn it, you sent this right after I merged and pushed out this change in
>>>>> my for-arm-soc branch, and was just about to send it to the arm-soc
>>>>> people.  What excellent timing you have. :)
>>>>
>>>> Don't worry, I didn't send that email to make you postpone this change.
>>>> Given the fuzziness of reproduction, the flakiness (esp. on Armadillo)
>>>> of the boot loader, and that these are old SoCs, please go ahead.
>>>>
>>>>> What happens on the kzm9g if you revert the mach-shmobile changes?
>>>>
>>>> Seems to make no difference.
>>>>
>>>>> For armadillo, do you use the decompressor?  That should be doing all the
>>>>> cache cleaning already, prior to the kernel being entered.
>>>>
>>>> I think so.
>>>>
>>>> Corruption pattern ranges from lock up, over "Error:
>>>> unrecognized/unsupported machine ID", to booting almost completely,
>>>> but lacking a few devices due to a corrupted DTB. Been like that as
>>>> long as I remember, i.e. since I got the board ca. 1 year ago. Boots
>>>> fine (100%) with kexec.
>>>>
>>>
>>> It seems like this patch is causing the SoCFPGA to not boot with SMP
>>> reliably. About 1 out of every 10 reboots, I'm seeing the boot failure
>>> below. The error seems to only happen when I do a cold or warm reboot,
>>> but never occurs during a power-up. If I revert this patch, or put
>>> back the call to v7_invalidate_l1 in socfpga_secondary_startup, then
>>> it's able to boot 100% of the time.
>>
>> It really sucks that you're only just testing this change now, because
>> I've frozen my tree, and removing it for the next merge window is going
>> to be an entirely non-trivial matter.  You were copied on the original
>> patch, which you failed to test... I can't say I have _much_ sympathy
>> for a bug report at this point in time.
>>
> 
> I apologize for not catching this error while testing this patch. I
> did test it when you first sent it out, but I probably didn't do a
> stress test. Sometimes the reboot fails on the 1st attempt, sometimes
> it fails on the 9th attempt.
> 
> I only caught this error when I was testing my recent changes to use
> CPU_METHOD_OF_DECLARE.
> 
> For me, I don't think you need to revert this patch or anything, but
> could a fix go in for a -rcX?
> 

Also, I am not seeing the error on the SoCFPGA Arria 10 platform at all.
The Arria 10 platform is running a different version of the bootloader
than the Cyclone5, although I also tested with the latest version of
U-Boot on the Cyclone5.

Dinh
