[
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911648#comment-16911648
]
Hannah Jiang commented on BEAM-7993:
------------------------------------
A side note about this test:
Currently we only have py2 and py35 suites, so it only fails with py35. I am
introducing minor versions, which will add py36 and py37, and all of the py3
suites are flaky. It's really difficult to pass the Portable Precommit with
minor versions; the chance of passing the test is around 15%.
From Ahmet:
Perhaps we need different precommits for different versions so that flakes do
not stack up. Even if a suite is >90% reliable, if we stack up 4 versions,
the reliability will get much lower.
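
To put rough numbers on the stacking effect, here is a minimal sketch. It assumes the suites fail independently and are all equally reliable, which may not hold for the real precommit. The function names are hypothetical and are not part of Beam.

```python
def combined_pass_rate(per_suite_rate: float, suites: int) -> float:
    """Chance that every suite passes in one run.

    Assumption: suites are independent and equally reliable.
    """
    return per_suite_rate ** suites


def implied_per_suite_rate(combined_rate: float, suites: int) -> float:
    """Per-suite pass rate implied by an observed combined pass rate.

    Same independence and equal-reliability assumption as above.
    """
    return combined_rate ** (1.0 / suites)


if __name__ == "__main__":
    # Four suites (py2, py35, py36, py37), each 90% reliable.
    print(round(combined_pass_rate(0.90, 4), 4))      # 0.6561
    # Per-suite rate that would produce a ~15% combined pass rate.
    print(round(implied_per_suite_rate(0.15, 4), 2))  # 0.62
```

Under these assumptions, four 90% reliable suites pass together only about 66% of the time, and a 15% combined pass rate would correspond to roughly 62% per suite. Splitting into one precommit per version would not make any suite less flaky, but a flake would then only require rerunning the affected version.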
> portable python precommit is flaky
> ----------------------------------
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, test-failures, testing
> Affects Versions: 2.15.0
> Reporter: Udi Meiri
> Assignee: Kyle Weaver
> Priority: Major
> Fix For: 2.15.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR
> org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR
> org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2457>),
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR
> org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN
> MapPartition (MapPartition at
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2457>),
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2457>),
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR
> org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN
> MapPartition (MapPartition at
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2457>),
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an
> exception: java.io.IOException: Received exit code 1 for command 'docker
> inspect -f {{.State.Running}}
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr:
> Error: No such object:
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22 at
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22 at
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22 at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by:
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
> java.io.IOException: Received exit code 1 for command 'docker inspect -f
> {{.State.Running}}
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr:
> Error: No such object:
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22 at
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22 at
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
> 11:51:22 at
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
> 11:51:22 at
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22 at
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22 at
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22 at
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22 at
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22 at
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22 ... 3 more
> {code}
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5512/consoleFull