[ https://issues.apache.org/jira/browse/TINKERPOP-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Florian Hockmann closed TINKERPOP-2341. --------------------------------------- Fix Version/s: 3.4.8 3.5.0 Resolution: Fixed > GremlinClientExtensions.SubmitAsync hangs as it tries to dispose connection > --------------------------------------------------------------------------- > > Key: TINKERPOP-2341 > URL: https://issues.apache.org/jira/browse/TINKERPOP-2341 > Project: TinkerPop > Issue Type: Bug > Components: dotnet > Affects Versions: 3.4.3 > Reporter: Wilson Li > Priority: Major > Fix For: 3.5.0, 3.4.8 > > > We have been seeing occasional hangs when logging to Cosmos DB graph using > the Gremlin.Net NuGet (version 3.4.3). > > Here's our code snippet: > var results = await > GremlinClientExtensions.SubmitAsync<dynamic>(this.Client, > gremlinExpression).ConfigureAwait(false); > > foreach (var result in results) > { > Tracer.TraceMessage()( > traceId, > "Successfully logged activity result (ElapsedMilliseconds=\{0}): > '\{1}'. Request charge is '\{2}'.", > sw.ElapsedMilliseconds, > JsonConvert.SerializeObject(result), > this.GetRequestCharge(results.StatusAttributes)); > } > > From the dumps, it seems the connection to the DB may have hit some > {color:#de350b}failure{color}: > 0:000> !PrintException /d 000001c61a0fd2c8 > Exception object: 000001c61a0fd2c8 > Exception type: System.Net.WebSockets.WebSocketException > Message: An internal WebSocket error occurred. Please see the innerException, > if present, for more details. > InnerException: System.IO.IOException, Use !PrintException 000001c61a0fa760 > to see more. > StackTrace (generated): > SP IP Function > 000000D5B446CA20 00007FFD9263D383 > System_ni!System.Net.WebSockets.WebSocketBase.ThrowIfConvertibleException(System.String, > System.Exception, System.Threading.CancellationToken, Boolean)+0x483 > 000000D5B446CAC0 00007FFD9203372B > System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x56b > 000000D5B446A360 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D5B446A390 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D5B446A3C0 00007FFD35D7B46C > Gremlin_Net!Gremlin.Net.Driver.WebSocketConnection+<ReceiveMessageAsync>d__9.MoveNext()+0x20c > 000000D5B4467CA0 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D5B4467CD0 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D5B4467D00 00007FFD35D7A96F > Gremlin_Net!Gremlin.Net.Driver.Connection+<ReceiveMessagesAsync>d__20.MoveNext()+0x16f > 000000D3EF8FE770 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D3EF8FE7A0 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D3EF8FE7D0 00007FFD35D8B1BC > Gremlin_Net!Gremlin.Net.Driver.ProxyConnection+<SubmitAsync>d__3`1[[System.__Canon, > mscorlib]].MoveNext()+0x15c > 000000D3EF8FC0D0 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D3EF8FC100 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D3EF8FC130 00007FFD35D8887C > Gremlin_Net!Gremlin.Net.Driver.GremlinClient+<SubmitAsync>d__6`1[[System.__Canon, > mscorlib]].MoveNext()+0x2ac > > StackTraceString: <none> > HResult: 80004005 > 0:000> !PrintException /d 000001c61a0fa760 > Exception object: 000001c61a0fa760 > Exception type: System.IO.IOException > Message: Unable to read data from the transport connection: An existing > connection was forcibly closed by the remote host. > InnerException: System.Net.Sockets.SocketException, Use !PrintException > 000001c61a0fa570 to see more. > StackTrace (generated): > SP IP Function > 000000D5B447A610 00007FFD928993C3 > System_ni!System.Net.Security._SslStream.EndRead(System.IAsyncResult)+0x7f3873 > 000000D5B447A660 00007FFD920A5B4C > System_ni!System.Net.TlsStream.EndRead(System.IAsyncResult)+0x10c > 000000D5B447A6B0 00007FFD94043F13 > mscorlib_ni!System.Threading.Tasks.TaskFactory`1+FromAsyncTrimPromise`1[[System.Int32, > mscorlib],[System.__Canon, mscorlib]].Complete(System.__Canon, > System.Func`3<System.__Canon,System.IAsyncResult,Int32>, System.IAsyncResult, > Boolean)+0x23 > 000000D5B4475DD0 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D5B4475E00 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D5B4475E30 00007FFD92035D25 > System_ni!System.Net.WebSockets.WebSocketConnectionStream+<ReadAsync>d__21.MoveNext()+0x385 > 000000D5B4471430 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D5B4471460 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D5B4471490 00007FFD92034541 > System_ni!System.Net.WebSockets.WebSocketBase+WebSocketOperation+<Process>d__19.MoveNext()+0x8e1 > 000000D5B446ED40 00007FFD93F7C3DF > mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f > 000000D5B446ED70 00007FFD935EDF2C > mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c > 000000D5B446EDA0 00007FFD9203339E > System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x1de > > StackTraceString: <none> > HResult: 80131620 > 0:000> !PrintException /d 000001c61a0fa570 > Exception object: 000001c61a0fa570 > {color:#de350b}Exception type: System.Net.Sockets.SocketException{color} > {color:#de350b}Message: An existing connection was forcibly closed by the > remote host{color} > InnerException: <none> > StackTrace (generated): > SP IP Function > 000000D5B447ED00 00007FFD928975F7 > System_ni!System.Net.Sockets.Socket.EndReceive(System.IAsyncResult)+0x7fe207 > 000000D5B447ED40 00007FFD9209924D > System_ni!System.Net.Sockets.NetworkStream.EndRead(System.IAsyncResult)+0x2d > > > But then the Gremlin.Net 's exception handling to {color:#ff8b00}dispose and > close{color} connections seemed to hit some deadlock: > 0:000> ~26e !clrstack > OS Thread Id: 0x31b8 (26) > Child SP IP Call Site > 000000d3ef8f8798 +00007ffda80b6594+ [GCFrame: 000000d3ef8f8798] > 000000d3ef8f8868 +00007ffda80b6594+ [HelperMethodFrame_1OBJ: > 000000d3ef8f8868] System.Threading.Monitor.ObjWait(Boolean, Int32, > System.Object) > 000000d3ef8f8980 +00007ffd935f6a0f+ > System.Threading.ManualResetEventSlim.Wait(Int32, > System.Threading.CancellationToken) > [f:\dd\ndp\clr\src\BCL\system\threading\ManualResetEventSlim.cs @ 669] > 000000d3ef8f8a10 +00007ffd935ef815+ > System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, > System.Threading.CancellationToken) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3320] > 000000d3ef8f8a80 +00007ffd93f60a88+ > System.Threading.Tasks.Task.InternalWait(Int32, > System.Threading.CancellationToken) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3259] > 000000d3ef8f8b50 +00007ffd935eef08+ System.Threading.Tasks.Task.Wait(Int32, > System.Threading.CancellationToken) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3167] > {color:#ff8b00}000000d3ef8f8b80 +00007ffd35d53f7f+ > Gremlin.Net.Process.Utils.WaitUnwrap(System.Threading.Tasks.Task){color} > {color:#ff8b00}000000d3ef8f8bc0 +00007ffd35db01dc+ > Gremlin.Net.Driver.ConnectionPool.ReturnConnectionIfOpen(Gremlin.Net.Driver.Connection){color} > 000000d3ef8f8bf0 +00007ffd35d88937+ > Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext() > 000000d3ef8fbfe8 +00007ffd94756f21+ [HelperMethodFrame: 000000d3ef8fbfe8] > 000000d3ef8fc0d0 +00007ffd93f7c3df+ > System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() > [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs > @ 133] > 000000d3ef8fc100 +00007ffd935edf2c+ > System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) > [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156] > 000000d3ef8fc130 +00007ffd35d8887c+ > Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext() > 000000d3ef8fc1e0 +00007ffd935a5ab3+ > System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, > System.Threading.ContextCallback, System.Object, Boolean) > [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954] > 000000d3ef8fc2b0 +00007ffd935a5944+ > System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, > System.Threading.ContextCallback, System.Object, Boolean) > [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902] > 000000d3ef8fc2e0 +00007ffd935f12bc+ > System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() > [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs > @ 1070] > 000000d3ef8fc330 +00007ffd935f80b2+ > System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, > Boolean, System.Threading.Tasks.Task ByRef) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\TaskContinuation.cs @ 810] > 000000d3ef8fc380 +00007ffd93573dc7+ > System.Threading.Tasks.Task.FinishContinuations() > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3617] > 000000d3ef8fc410 +00007ffd935ee890+ > System.Threading.Tasks.Task.Finish(Boolean) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2263] > 000000d3ef8fc470 +00007ffd935f0101+ > System.Threading.Tasks.Task`1[[System.__Canon, > mscorlib]].TrySetException(System.Object) > [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Future.cs @ 606] > 000000d3ef8fc4c0 +00007ffd935f06a8+ > System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, > mscorlib]].SetException(System.Exception) > 000000d3ef8fc500 +00007ffd35d8b27d+ > Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, > mscorlib]].MoveNext() > 000000d3ef8fe688 +00007ffd94756f21+ [HelperMethodFrame: 000000d3ef8fe688] > 000000d3ef8fe770 +00007ffd93f7c3df+ > System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() > [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs > @ 133] > 000000d3ef8fe7a0 +00007ffd935edf2c+ > System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) > [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156] > 000000d3ef8fe7d0 +00007ffd35d8b1bc+ > Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, > mscorlib]].MoveNext() > 000000d3ef8fe860 +00007ffd935a5ab3+ > System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, > System.Threading.ContextCallback, System.Object, Boolean) > [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954] > 000000d3ef8fe930 +00007ffd935a5944+ > System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, > System.Threading.ContextCallback, System.Object, Boolean) > [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902] > 000000d3ef8fe960 +00007ffd935f12bc+ > System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() > [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs > @ 1070] > 000000d3ef8fe9b0 +00007ffd935e4e22+ > System.Threading.ThreadPoolWorkQueue.Dispatch() > [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 820] > 000000d3ef8fee48 +00007ffd94646a93+ [DebuggerU2MCatchHandlerFrame: > 000000d3ef8fee48] > > Looks like Gremlin.Net handles a dead connection by attempting to forcing a > closure of the connection pool. This path calls Task.Wait() instead of using > await Task, so it is very possible that this is the cause of the hang. > > Might be related to this issue: > [https://issues.apache.org/jira/projects/TINKERPOP/issues/TINKERPOP-2268?filter=allopenissues&orderby=affectedVersion+DESC%2C+priority+DESC%2C+updated+DESC] > -- This message was sent by Atlassian Jira (v8.3.4#803005)