[ https://issues.apache.org/jira/browse/TINKERPOP-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309633#comment-17309633 ]
Kirk Marple commented on TINKERPOP-2541:
----------------------------------------

Another data point, which I saw last night:

I was running a test which sent several Gremlin queries to CosmosDb. Not every time, but occasionally, it would hang on the 4th or 5th call, at the same spot in my code. It did this consistently for 5-10 minutes, and then started working again: same code, same CosmosDb, everything.

It's possible my ISP was having some internet problems, but other apps could access the internet fine during this period.

I've seen failures both when creating the GremlinClient and on SubmitAsync, and they seem related to the WebSocket connections.

Do you have contacts at Microsoft you've worked with? We're a MSFT for Startups partner, and I can try to track some folks down there to help, if needed.

> .NET SDK (CosmosDb): The server returned status code '500' when status code '101' was expected.
> -----------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2541
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2541
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: dotnet
>   Affects Versions: 3.4.10
>            Reporter: Kirk Marple
>            Priority: Major
>
> Using 3.4.10, .NET SDK, I've been getting a lot of problems talking to
> CosmosDb lately.
> Some hangs on SubmitAsync (after several successful calls), and sometimes
> errors just creating the GremlinClient.
> Seems to have cropped up more over the last month, where I can reproduce it
> more often now. I've tried playing with pool size, etc. and haven't been able
> to get around it.
> I'm able to repro it just from my laptop locally when running integration
> tests against CosmosDb.
> Any thoughts on this? It's preventing us going into production with our
> application.
>
> {code:java}
> The server returned status code '500' when status code '101' was expected.
>    at System.Net.WebSockets.WebSocketHandle.<ConnectAsync>d__13.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
>    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
>    at Gremlin.Net.Driver.WebSocketConnection.<ConnectAsync>d__4.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
>    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
>    at Gremlin.Net.Driver.Connection.<ConnectAsync>d__15.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
>    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
>    at Gremlin.Net.Driver.ConnectionPool.<CreateNewConnectionAsync>d__18.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
>    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
>    at Gremlin.Net.Driver.ConnectionPool.<FillPoolAsync>d__17.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
>    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
>    at Gremlin.Net.Driver.ConnectionPool.<ReplaceDeadConnectionsAsync>d__15.MoveNext()
>    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
>    at Gremlin.Net.Process.Utils.WaitUnwrap(Task task)
>    at Gremlin.Net.Driver.ConnectionPool..ctor(IConnectionFactory connectionFactory, ConnectionPoolSettings settings)
>    at Gremlin.Net.Driver.GremlinClient..ctor(GremlinServer gremlinServer, GraphSONReader graphSONReader, GraphSONWriter graphSONWriter, String mimeType, ConnectionPoolSettings connectionPoolSettings, Action`1 webSocketConfiguration, String sessionId)
>    at Unstruk.Frameworks.Data.Graphs.GraphHelpers.<>c_DisplayClass4_0.<CreateGremlinClient>b_3() in D:\a\1\s\src\Unstruk.Frameworks.Data\GraphHelpers.cs:line 90
>    at Polly.Policy.<>c_DisplayClass126_0`1.<ExecuteAndCapture>b_0(Context ctx, CancellationToken ct)
>    at Polly.Retry.RetryEngine.Implementation[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Action`4 onRetry, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider)
>    at Polly.Retry.RetryPolicy.Implementation[TResult](Func`3 action, Context context, CancellationToken cancellationToken)
>    at Polly.Policy.Execute[TResult](Func`3 action, Context context, CancellationToken cancellationToken)
>    at Polly.Policy.ExecuteAndCapture[TResult](Func`3 action, Context context, CancellationToken cancellationToken)
> {code}
> Here's my helper method for creating the client, which I wrapped with Polly
> retry, and it still failed.
> {code:java}
> private const int DEFAULT_POOL_SIZE = 8;
> private const int DEFAULT_INPROCESS_PER_CONNECTION = 32;
>
> public static GremlinClient CreateGremlinClient(ILogger logger, GraphSettings settings, string collectionName)
> {
>     const int DEFAULT_RETRY_COUNT = 5;
>     const float RETRY_SEED = 1.5F;
>
>     if (String.IsNullOrEmpty(settings.Location))
>         throw new InvalidOperationException("Invalid CosmosDB graph server name.");
>     if (String.IsNullOrEmpty(settings.Key))
>         throw new InvalidOperationException("Invalid CosmosDB graph access key.");
>     if (String.IsNullOrEmpty(settings.DatabaseId))
>         throw new InvalidOperationException("Invalid CosmosDB graph database id.");
>     if (String.IsNullOrEmpty(collectionName))
>         throw new InvalidOperationException("Invalid CosmosDB graph collection name.");
>
>     var sw = Stopwatch.StartNew();
>
>     var uri = new Uri(settings.Location);
>     uri = ConvertGremlinEndpoint(uri);
>
>     var server = new GremlinServer(uri.Host, 443, true,
>         $"/dbs/{settings.DatabaseId}/colls/{collectionName}", settings.Key);
>
>     var connectionPoolSettings = new ConnectionPoolSettings
>     {
>         PoolSize = DEFAULT_POOL_SIZE,
>         MaxInProcessPerConnection = DEFAULT_INPROCESS_PER_CONNECTION
>     };
>
>     var webSocketConfiguration = new Action<ClientWebSocketOptions>(options =>
>     {
>         //options.KeepAliveInterval = TimeSpan.FromSeconds(10);
>     });
>
>     int attempt = 1;
>
>     var result = Policy
>         .Handle<ResponseException>()
>         .WaitAndRetry(DEFAULT_RETRY_COUNT,
>             retryAttempt => TimeSpan.FromSeconds(Math.Pow(RETRY_SEED, retryAttempt)),
>             onRetry: (e, delay) =>
>             {
> #if DEBUG_SERVICE_GRAPH
>                 logger.LogWarning($"Failed to create Gremlin client, attempt [{attempt}]. Trying again in [{delay}]. {e.Message}");
> #endif
>                 attempt++;
>             })
>         .ExecuteAndCapture(() =>
>         {
>             return new GremlinClient(server, new GraphSON2Reader(), new GraphSON2Writer(),
>                 GremlinClient.GraphSON2MimeType, connectionPoolSettings, webSocketConfiguration);
>         });
>
>     if (result.Outcome == OutcomeType.Failure)
>         throw new InvalidOperationException($"Failed to create Gremlin Client.", result.FinalException);
>
>     logger.LogDebug($"Created Gremlin client, location [{settings.Location}], database [{settings.DatabaseId}], collection [{collectionName}], took [{sw.Elapsed}].");
>
>     return result.Result;
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)