Thanks for looking into this! Does this happen across jobs in general, or can it be pinned down to a single configuration? We have never had hangs like this before, so it definitely seems related to a recent change.

-Marco

On Thu, Mar 29, 2018 at 7:26 PM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:

> Debugging this a bit with Chris. I haven't looked at it closely but it
> seems like there might be a genuine hang here between
> CuDNNConvolutionOp<float>::SelectAlgo and a customop lambda invoke. What
> do you guys think?
>
> Stack is here:
> https://gist.github.com/KellenSunderland/84aa9bb7270c0483eeccde6f08e91489
>
> -Kellen
>
> On Thu, Mar 29, 2018 at 6:24 PM, Chris Olivier <cjolivie...@gmail.com>
> wrote:
>
> > Seems like a lot of PR builds are hanging at (what appears to be) the end
> > of Python 3 GPU unit tests. Anyone have any idea why that might be?