[ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85419&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85419
 ]

ASF GitHub Bot logged work on BEAM-3848:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Mar/18 22:36
            Start Date: 28/Mar/18 22:36
    Worklog Time Spent: 10m 
      Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##########
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##########
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
     // therefore, can not exist an empty shard.
     assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+    SolrIO.Write write = mock(SolrIO.Write.class);
+    when(write.getRetryConfiguration())
+        .thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+    SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+    AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+    // simulate success
+    when(solrClient.process(any(String.class), any(SolrRequest.class)))
+        .thenReturn(mock(SolrResponse.class));
+
+    List<SolrInputDocument> batch = SolrIOTestUtils.createDocuments(1);
+    writeFn.flushBatch(solrClient, batch);
+    verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+    SolrIO.Write write = mock(SolrIO.Write.class);
+    
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+    SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+    AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+    // simulate failure
+    when(solrClient.process(any(String.class), any(SolrRequest.class)))
+        .thenThrow(new SolrServerException("Fail"));
+
+    List<SolrInputDocument> batch = SolrIOTestUtils.createDocuments(1);
+    try {
+      writeFn.flushBatch(solrClient, batch);
+      fail("Error should have been surfaced when flushing batch");
+    } catch (IOException e) {
+      verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+    }
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+    SolrIO.Write write = mock(SolrIO.Write.class);
+    when(write.getRetryConfiguration())
+        .thenReturn(
+            SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+    SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+    AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+    // simulate failure
+    when(solrClient.process(any(String.class), any(SolrRequest.class)))
+        .thenThrow(
+            new HttpSolrClient.RemoteSolrException(
+                "localhost", 1, "ignore", new IOException("Network")));
+
+    List<SolrInputDocument> batch = SolrIOTestUtils.createDocuments(1);
+    Stopwatch stopwatch = Stopwatch.createStarted();
+
+    try {
+      writeFn.flushBatch(solrClient, batch);
+      fail("Error should have been surfaced when flushing batch");
+    } catch (IOException e) {
+      // at least two attempts must be made
+      verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+      long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+      assertTrue(
+          "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+          seconds >= 3);
+    }
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+    SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
    1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
    2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
    3. I introduce `SLF4J` logging at `warn` when retries are invoked
    4. I run the mini-IT style tests using the embedded Solr server with 
        1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
        2. write to a non existing collection which will cause a 
`SolrException` (code = 400 and a message "collecton not found: ...") thus 
invoking the retry thanks to the strategy
        3. Use the same as apprach as [`JdbcIOTest` of checking for log 
statements](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOTest.java#L359)
 to verify retry behavior is invoked
     
   I'd suggest that we keep the 10 lines of retry within the ´SolrIO´ and then 
not move to the `AuthorizedSolrClient` as we considered above, to be similar to 
the ´JdbcIO´ and also so that it is the `SolrIO` logs we're using in the test.
   
   I'm happy to implement this (especially since it brings further consistency 
with `JdbcIO`) but would like to have a committer quickly approve the procedure 
first. Fundamentally we're saying that relying on log statements trumps mocking 
the underlying `SolrClient` (not something I particularly agree with, but keen 
to follow the project approach).
   
   CC @iemejia @echauchot @jkff 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 85419)
    Time Spent: 4h 20m  (was: 4h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---------------------------------------------------
>
>                 Key: BEAM-3848
>                 URL: https://issues.apache.org/jira/browse/BEAM-3848
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-solr
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Tim Robertson
>            Assignee: Tim Robertson
>            Priority: Minor
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to