[ 
https://issues.apache.org/jira/browse/BEAM-6206?focusedWorklogId=174956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-174956
 ]

ASF GitHub Bot logged work on BEAM-6206:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Dec/18 17:27
            Start Date: 13/Dec/18 17:27
    Worklog Time Spent: 10m 
      Work Description: swegner commented on a change in pull request #7270: 
[BEAM-6206] Add CustomHttpErrors a tool to allow adding custom errors…
URL: https://github.com/apache/beam/pull/7270#discussion_r241488677
 
 

 ##########
 File path: sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/util/CustomHttpErrors.java
 ##########
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * An optional component to use with the RetryHttpRequestInitializer in order to provide custom
+ * errors for failing http calls. This class allows you to specify custom error messages which match
+ * specific error codes and containing strings in the URL.
+ *
+ * <p>The intended use case here is to examine one of the logs emitted by a failing call made by the
+ * RetryHttpRequestInitializer, and then adding a custom error message which matches the URL and
+ * code for it.
+ *
+ * <p>Usage: See more in CustomHttpErrorsTest.
+ *
+ * <p>CustomHttpErrors.Builder builder = new CustomHttpErrors.Builder();
+ * builder.addErrorForCodeAndUrlContains(403,"/tables?", "Custom Error Msg"); CustomHttpErrors
+ * customErrors = builder.build();
+ *
+ * <p>RetryHttpRequestInitializer initializer = ... initializer.setCustomErrors(customErrors);
+ *
+ * <p>Suggestions for future enhancements to anyone upgrading this file: - This class is left open
+ * for extension, to allow different functions for HttpCallMatcher and HttpCallCustomError to match
+ * and log errors. For example, new functionality may including matching and error based on the
+ * HttpResponse body, and logging may include extracting and logging strings from the HttpResponse
+ * body as well. - Add a methods to add custom errors based on inspecting the contents of the
+ * HttpRequest and HttpResponse - Be sure to update the HttpRequestWrapper and HttpResponseWrapper
+ * with any new getters that you may use. The wrappers were introduced to add a layer of indirection
+ * which could be mocked out in tests. This was unfortunately needed because mockito cannot mock
+ * final classes and its non trivial to just construct HttpRequest and HttpResponse objects.
+ */
+public class CustomHttpErrors {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CustomHttpErrors.class);
+
+  /**
+   * A simple Tuple class for creating a list of HttpResponseMatcher and HttpResponseCustomError to
+   * print for the responses.
+   */
+  @AutoValue
+  public abstract static class MatcherAndError implements Serializable {
+    static MatcherAndError create(HttpCallMatcher matcher, HttpCallCustomError customError) {
+      return new AutoValue_CustomHttpErrors_MatcherAndError(matcher, customError);
+    }
+
+    public abstract HttpCallMatcher getMatcher();
+
+    public abstract HttpCallCustomError getCustomError();
+  }
+
+  /**
+   * A Builder which allows building immutable CustomHttpErrors object.
+   */
+  public static class Builder {
+
+    private List<MatcherAndError> matchersAndLogs = new ArrayList<MatcherAndError>();
+
+    public CustomHttpErrors build() {
+      return new CustomHttpErrors(this.matchersAndLogs);
+    }
+
+    /** Adds a matcher to log the provided string if the error matches a particular status code. */
+    public void addErrorForCode(int statusCode, String errorMessage) {
+      HttpCallMatcher matcher =
+          (request, response) -> {
+            if (response.getStatusCode() == statusCode) {
+              return true;
+            }
+            return false;
+          };
+      this.matchersAndLogs.add(MatcherAndError.create(matcher, simpleErrorMessage(errorMessage)));
+    }
+
+    /**
+     * Adds a matcher to log the provided string if the error matches a particular status code and
 
 Review comment:
   It seems like in the future we'd want the matchers to be composable
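
   To make the suggestion concrete, here is a rough sketch of what composable
   matchers might look like. It is not the code in this PR: the `Request`,
   `Response`, and `Matcher` types below are hypothetical stand-ins for
   HttpRequestWrapper, HttpResponseWrapper, and HttpCallMatcher, and the method
   names `matches`, `and`, `or`, `negate`, `statusCode`, and `urlContains` are
   assumptions made for illustration only.

```java
public class ComposableMatchers {

  /** Hypothetical stand-in for the PR's HttpRequestWrapper. */
  interface Request {
    String getUrl();
  }

  /** Hypothetical stand-in for the PR's HttpResponseWrapper. */
  interface Response {
    int getStatusCode();
  }

  /** Hypothetical matcher with combinators; the PR's HttpCallMatcher may differ. */
  @FunctionalInterface
  interface Matcher {
    boolean matches(Request request, Response response);

    /** Matches only when both this matcher and the other one match. */
    default Matcher and(Matcher other) {
      return (req, resp) -> matches(req, resp) && other.matches(req, resp);
    }

    /** Matches when either this matcher or the other one matches. */
    default Matcher or(Matcher other) {
      return (req, resp) -> matches(req, resp) || other.matches(req, resp);
    }

    /** Matches exactly when this matcher does not. */
    default Matcher negate() {
      return (req, resp) -> !matches(req, resp);
    }
  }

  /** Primitive matcher: the response has the given status code. */
  static Matcher statusCode(int code) {
    return (req, resp) -> resp.getStatusCode() == code;
  }

  /** Primitive matcher: the request URL contains the given substring. */
  static Matcher urlContains(String fragment) {
    return (req, resp) -> req.getUrl().contains(fragment);
  }

  public static void main(String[] args) {
    // Equivalent in spirit to addErrorForCodeAndUrlContains(403, "/tables?", ...),
    // but built from two smaller matchers instead of a dedicated builder method.
    Matcher forbiddenTablesCall = statusCode(403).and(urlContains("/tables?"));

    Request request = () -> "https://example.invalid/datasets/d/tables?alt=json";
    Response forbidden = () -> 403;
    Response ok = () -> 200;

    System.out.println(forbiddenTablesCall.matches(request, forbidden));
    System.out.println(forbiddenTablesCall.matches(request, ok));
  }
}
```

   With something along these lines, the Builder would only need a single
   method taking a matcher and a message, rather than one method per
   combination of conditions.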

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 174956)
    Time Spent: 1h 10m  (was: 1h)

> Dataflow template which reads from BigQuery fails if used more than once
> ------------------------------------------------------------------------
>
>                 Key: BEAM-6206
>                 URL: https://issues.apache.org/jira/browse/BEAM-6206
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp, runner-dataflow
>    Affects Versions: 2.8.0
>            Reporter: Neil McCrossin
>            Assignee: Tyler Akidau
>            Priority: Blocker
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When a pipeline contains a BigQuery read, and when that pipeline is uploaded 
> as a template and the template is run in Cloud Dataflow, it will run 
> successfully the first time, but after that it will fail because it can't 
> find a file in the folder BigQueryExtractTemp (see error message below). If 
> the template is uploaded again it will work again +once only+ and then fail 
> again every time after the first time.
> *Error message:*
>  java.io.FileNotFoundException: No files matched spec: 
> gs://bigquery-bug-report-4539/temp/BigQueryExtractTemp/847a342637a64e73b126ad33f764dcc9/000000000000.avro
> *Steps to reproduce:*
>  1. Create the Beam Word Count sample as described 
> [here|https://cloud.google.com/dataflow/docs/quickstarts/quickstart-java-maven].
> 2. Copy the command line from the section "Run WordCount on the Cloud 
> Dataflow service" and substitute in your own project id and bucket name. Make 
> sure you can run it successfully.
> 3. In the file WordCount.java, add the following lines below the existing 
> import statements:
> {code:java}
> import org.apache.beam.sdk.coders.AvroCoder;
> import org.apache.beam.sdk.coders.DefaultCoder;
> import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
> import org.apache.beam.sdk.io.gcp.bigquery.SchemaAndRecord;
> import org.apache.beam.sdk.transforms.SerializableFunction;
> @DefaultCoder(AvroCoder.class)
> class TestOutput
> { 
> }
> {code}
>  
>  4. In this same file, replace the entire method runWordCount with the 
> following code:
> {code:java}
> static void runWordCount(WordCountOptions options) {
>   Pipeline p = Pipeline.create(options);
>   p.apply("ReadBigQuery", BigQueryIO
>     .read(new SerializableFunction<SchemaAndRecord, TestOutput>() {
>       public TestOutput apply(SchemaAndRecord record) {
>         return new TestOutput();
>       }
>     })
>     .from("bigquery-public-data:stackoverflow.tags")
>   );
>   p.run();
> }
> {code}
> (Note I am using the stackoverflow.tags table for purposes of demonstration 
> because it is public and not too large, but the problem seems to occur for 
> any table).
> 5. Add the following pipeline parameters to the command line that you have 
> been using:
> {code:java}
> --tempLocation=gs://<STORAGE_BUCKET>/temp/
> --templateLocation=gs://<STORAGE_BUCKET>/my-bigquery-dataflow-template
> {code}
> 6. Run the command line so that the template is created.
> 7. Launch the template through the Cloud Console by clicking on "CREATE JOB 
> FROM TEMPLATE". Give it the job name "test-1", choose "Custom Template" at 
> the bottom of the list and browse to the template 
> "my-bigquery-dataflow-template", then press "Run job".
> 8. The job should succeed. But then repeat step 7 and it will fail.
> 9. Repeat steps 6 and 7 and it will work again. Repeat step 7 and it will 
> fail again.
>  
> This bug may be related to BEAM-2058 (just a hunch).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
