That theory could be true, and I certainly don't have any better ideas,
though I've never observed any hang on my local machine when recreating
this issue. It could be something specific to Kokoro.

I've fixed the gem5art error here:
https://gem5-review.googlesource.com/c/public/gem5/+/49044. We can see if
this fixes the timeout issue. If the timeout error persists, we can
consider increasing the timeout:
https://gem5-review.googlesource.com/c/public/gem5/+/48443


--
Dr. Bobby R. Bruce
Room 3050,
Kemper Hall, UC Davis
Davis,
CA, 95616

web: https://www.bobbybruce.net


On Thu, Jul 22, 2021 at 8:33 PM Gabe Black <gabe.bl...@gmail.com> wrote:

> Another possibility is that while the gem5-art error may not actually kill
> the run, it may, for instance, have failed trying to download something
> with a generous timeout, and waiting for that timeout pushed the rest of
> the run out enough to trip the timeout? Just a thought. I haven't checked
> exhaustively, but it feels like the timeout always goes along with the
> gem5-art error message.
>
> Gabe
>
> On Thu, Jul 22, 2021 at 5:27 PM Gabe Black <gabe.bl...@gmail.com> wrote:
>
>> Ok, thanks. I don't know if you saw the CL I put up recently where the
>> src/base/cprintftime.cc executable (the one built from that source) was
>> broken, which made kokoro fail. The breakage was real and worth fixing, but
>> I'm not sure why kokoro was trying to build it in the first place? Maybe
>> sometimes kokoro tries building things that we didn't really want it to.
>>
>> In my recent scons hacking, I ran into that accidentally when
>> build/X86/${BLAHBLAH} expanded into build/X86/ because that variable didn't
>> exist, so scons went off and started building EVERYTHING it knew about below
>> build/X86/. Hypothetically, that could explain the long build times and the
>> building of that random other binary? Maybe we have some sort of race
>> condition where a target expands to an empty string?
>>
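The failure mode described above can be modeled with a small sketch. This is a toy illustration, not SCons's actual substitution code: the helper names `expand_lenient` and `expand_strict` are hypothetical, and the only assumption taken from the thread is that an undefined `${BLAHBLAH}` disappears from the target path.

```python
import re

# Matches ${NAME} placeholders. Toy model only, not SCons's own parser.
_VAR = re.compile(r"\$\{(\w+)\}")


def expand_lenient(template: str, variables: dict) -> str:
    """Replace ${NAME}; unknown names silently become the empty string."""
    return _VAR.sub(lambda m: variables.get(m.group(1), ""), template)


def expand_strict(template: str, variables: dict) -> str:
    """Replace ${NAME}; unknown names raise KeyError instead of vanishing."""
    return _VAR.sub(lambda m: variables[m.group(1)], template)


# With the variable missing, the target collapses to the whole directory.
target = expand_lenient("build/X86/${BLAHBLAH}", {})
print(target)

# A strict expansion surfaces the mistake instead of widening the build.
try:
    expand_strict("build/X86/${BLAHBLAH}", {})
except KeyError as missing:
    print(f"undefined variable: {missing}")
```

If something like this were happening on Kokoro, a strict check of this kind on the expanded target would turn a silently enlarged build into an immediate error.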
>> Gabe
>>
>> On Thu, Jul 22, 2021 at 3:04 PM Bobby Bruce <bbr...@ucdavis.edu> wrote:
>>
>>> Ok, so I did look into this today and didn't find anything. On my
>>> desktop machine the difference in running the pre-submit tests from the
>>> stable branch and develop branch (including building the binaries) was only
>>> 10 minutes so we've really not done anything to increase the build/test
>>> times to a significant extent. My running theory is Kokoro was running
>>> slower (??? I have no idea what Kokoro is actually doing or running on
>>> behind the scenes so I don't know whether that makes sense, but I cannot
>>> think of any other explanation). I don't like the solution, but I've
>>> submitted a patch to increase the timeout to 7 hours which should stop this
>>> timeout event from happening:
>>> https://gem5-review.googlesource.com/c/public/gem5/+/48443
>>>
>>> I still haven't looked into the gem5 error yet but I'm pretty confident
>>> this shouldn't interfere with the presubmit validation.
>>>
>>>
>>> On Wed, Jul 21, 2021 at 5:37 PM Gabe Black <gabe.bl...@gmail.com> wrote:
>>>
>>>> Ok thanks, Bobby. Please let me know if you find anything, especially
>>>> if it looks like it's a bug in kokoro itself somehow.
>>>>
>>>> Gabe
>>>>
>>>> On Wed, Jul 21, 2021 at 3:52 PM Bobby Bruce <bbr...@ucdavis.edu> wrote:
>>>>
>>>>> There's definitely something funny going on with the gem5art tests
>>>>> there but I believe that error is happening without triggering a non-zero
>>>>> exit code. The gem5art test script is set to `set -e`, which means the
>>>>> script should exit immediately after a failure, yet it doesn't. The testing
>>>>> also continues onto the other tests. I'll look into this.
>>>>>
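One well-known way for a `set -e` script to keep running after a failure is for the failing command to sit in a context where the shell ignores errexit: on the left of `||`, in an `if` condition, or as a non-final stage of a pipeline. Whether any of these applies to the gem5art test script is not established in this thread. The sketch below runs a hypothetical stand-in script through `sh` from Python (it assumes a POSIX `sh` is on the PATH).

```python
import subprocess

# Hypothetical stand-in script. It is NOT the gem5art test script.
SCRIPT = """
set -e
false || echo "failure on the left of || is ignored"
if false; then echo "unreachable"; fi
false | true
echo "reached the end"
"""


def run_script(script: str) -> subprocess.CompletedProcess:
    """Run a POSIX shell script, capturing its output and exit code."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


# Three commands fail, yet errexit never fires and the script runs to the end.
result = run_script(SCRIPT)
print(result.returncode)
print(result.stdout, end="")

# For contrast: a plain failing command does stop the script.
strict = run_script('set -e\nfalse\necho "never printed"\n')
print(strict.returncode)
```

Checking a test script for these patterns is a cheap first step when an error message appears in the log but the exit code stays zero.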
>>>>> In the example you linked, the issue appears to be because it has
>>>>> reached the 6 hour timeout. We could increase the timeout to fix this, but
>>>>> I'd like to know why our build/test times have increased enough to push us
>>>>> over the 6 hour line. I'll see if I can figure it out as well.
>>>>>
>>>>>
>>>>> On Wed, Jul 21, 2021 at 2:51 PM Gabe Black via gem5-dev <
>>>>> gem5-dev@gem5.org> wrote:
>>>>>
>>>>>> I've seen many kokoro failures lately, including this one which seems
>>>>>> to be from a problem in gem5-art? Any idea what's going on?
>>>>>>
>>>>>>
>>>>>> https://source.cloud.google.com/results/invocations/caae5aad-91a6-4c6e-9fbe-20962f9c5519/targets/gem5%2Fgcp_ubuntu%2Fpresubmit/log
>>>>>> _______________________________________________
>>>>>> gem5-dev mailing list -- gem5-dev@gem5.org
>>>>>> To unsubscribe send an email to gem5-dev-le...@gem5.org
>>>>>
>>>>>