[ 
https://issues.apache.org/jira/browse/BEAM-5859?focusedWorklogId=171530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-171530
 ]

ASF GitHub Bot logged work on BEAM-5859:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Dec/18 10:56
            Start Date: 03/Dec/18 10:56
    Worklog Time Spent: 10m 
      Work Description: mxm commented on a change in pull request #7150: 
[BEAM-5859] Improve operator names for portable pipelines
URL: https://github.com/apache/beam/pull/7150#discussion_r238223008
 
 

 ##########
 File path: runners/flink/src/test/java/org/apache/beam/runners/flink/translation/FlinkPipelineTranslatorUtilsTest.java
 ##########
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation;
+
+import static org.hamcrest.core.Is.is;
+import static org.junit.Assert.assertThat;
+
+import java.io.Serializable;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+import org.apache.beam.runners.core.construction.PipelineTranslation;
+import org.apache.beam.runners.core.construction.graph.ExecutableStage;
+import org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser;
+import org.apache.beam.runners.flink.translation.utils.FlinkPipelineTranslatorUtils;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.Impulse;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.junit.Test;
+
+/**
+ * Tests for {@link org.apache.beam.runners.flink.translation.utils.FlinkPipelineTranslatorUtils}.
+ */
+public class FlinkPipelineTranslatorUtilsTest implements Serializable {
+
+  @Test
+  /* Test for generating readable operator names during translation. */
+  public void testOperatorNameGeneration() throws Exception {
+    Pipeline p = Pipeline.create();
+    p.apply(Impulse.create())
+        // Anonymous ParDo
+        .apply(
+            ParDo.of(
+                new DoFn<byte[], String>() {
+                  @ProcessElement
+                  public void processElement(
+                      ProcessContext processContext, OutputReceiver<String> outputReceiver) {}
+                }))
+        // Named ParDo
+        .apply(
+            "MyName",
+            ParDo.of(
+                new DoFn<String, Integer>() {
+                  @ProcessElement
+                  public void processElement(
+                      ProcessContext processContext, OutputReceiver<Integer> outputReceiver) {}
+                }));
+
+    ExecutableStage firstEnvStage =
+        GreedyPipelineFuser.fuse(PipelineTranslation.toProto(p))
+            .getFusedStages()
+            .stream()
+            .findFirst()
+            .get();
+    RunnerApi.ExecutableStagePayload basePayload =
+        RunnerApi.ExecutableStagePayload.parseFrom(
+            firstEnvStage.toPTransform("foo").getSpec().getPayload());
+
+    String executableStageName =
+        FlinkPipelineTranslatorUtils.genOperatorNameFromStagePayload(basePayload);
+
+    assertThat(executableStageName, is("ExecutableStage(2)[ParDo(Anonymous)][MyName]"));
 
 Review comment:
   I think we could skip the word `ExecutableStage` entirely. Maybe just 
enclosing it in `{}` would suffice to indicate that a harness is involved.
   
   I realized the names are less readable for Python. In Java we would just get 
the user-defined name (if set).
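
   To make the suggestion concrete, here is a minimal sketch of the two naming
styles. It is not Beam code: the class `OperatorNameSketch` and both methods are
hypothetical, it assumes the number in parentheses is the count of fused
transforms, and wrapping the bracketed names in `{}` is only one possible
reading of the comment.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of Beam or the Flink runner. */
class OperatorNameSketch {

  /** Builds a name in the format the test asserts: ExecutableStage(n)[a][b]. */
  public static String currentStyle(List<String> transformNames) {
    // Assumption: the number in parentheses is the count of fused transforms.
    StringBuilder sb = new StringBuilder("ExecutableStage(");
    sb.append(transformNames.size()).append(')');
    for (String name : transformNames) {
      sb.append('[').append(name).append(']');
    }
    return sb.toString();
  }

  /** One possible reading of the suggestion: drop the word, wrap in braces. */
  public static String braceStyle(List<String> transformNames) {
    StringBuilder sb = new StringBuilder("{");
    for (String name : transformNames) {
      sb.append('[').append(name).append(']');
    }
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("ParDo(Anonymous)", "MyName");
    System.out.println(currentStyle(names));
    System.out.println(braceStyle(names));
  }
}
```

   With the two transform names from the test, the first method reproduces the
asserted string and the second yields the shorter, brace-wrapped form.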

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 171530)
    Time Spent: 1h  (was: 50m)

> Improve Traceability of Pipeline translation
> --------------------------------------------
>
>                 Key: BEAM-5859
>                 URL: https://issues.apache.org/jira/browse/BEAM-5859
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-flink
>            Reporter: Maximilian Michels
>            Priority: Major
>              Labels: portability, portability-flink
>         Attachments: tfx.png, wordcount.png
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Users often ask how they can reason about the pipeline translation. The Flink 
> UI displays a confusingly large graph without any trace of the original Beam 
> pipeline:
> WordCount:
>  !wordcount.png! 
> TFX:
>  !tfx.png! 
> Some aspects which make understanding these graphs hard:
>  * Users don't know how the Runner maps Beam to Flink concepts
>  * The UI is awfully slow / hangs when the pipeline is reasonably complex
>  * The operator names seem to use {{transform.getUniqueName()}} which doesn't 
> generate readable names
>  * So-called chaining combines operators into a single operator, which makes 
> understanding which Beam concept belongs to which Flink concept even harder
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
