[ https://issues.apache.org/jira/browse/BEAM-2918?focusedWorklogId=158169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-158169 ]
ASF GitHub Bot logged work on BEAM-2918:
----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Oct/18 14:36
Start Date: 24/Oct/18 14:36
Worklog Time Spent: 10m
Work Description: mxm commented on a change in pull request #6726:
[BEAM-2918] Add state support for streaming in portable FlinkRunner
URL: https://github.com/apache/beam/pull/6726#discussion_r227817028
##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
##########
@@ -135,31 +149,131 @@ public void open() throws Exception {
// ownership of the higher level "factories" explicit? Do we care?
stageContext = contextFactory.get(jobInfo);
- stateRequestHandler = getStateRequestHandler(executableStage);
stageBundleFactory = stageContext.getStageBundleFactory(executableStage);
+ stateRequestHandler = getStateRequestHandler(executableStage);
progressHandler = BundleProgressHandler.unsupported();
outputQueue = new LinkedBlockingQueue<>();
}
private StateRequestHandler getStateRequestHandler(ExecutableStage executableStage) {
+ final StateRequestHandler sideInputStateHandler;
if (executableStage.getSideInputs().size() > 0) {
checkNotNull(super.sideInputHandler);
StateRequestHandlers.SideInputHandlerFactory sideInputHandlerFactory =
Preconditions.checkNotNull(
FlinkStreamingSideInputHandlerFactory.forStage(
executableStage, sideInputIds, super.sideInputHandler));
try {
- return StateRequestHandlers.forSideInputHandlerFactory(
- ProcessBundleDescriptors.getSideInputs(executableStage), sideInputHandlerFactory);
+ sideInputStateHandler =
+ StateRequestHandlers.forSideInputHandlerFactory(
+ ProcessBundleDescriptors.getSideInputs(executableStage), sideInputHandlerFactory);
} catch (IOException e) {
- throw new RuntimeException(e);
+ throw new RuntimeException("Failed to initialize SideInputHandler", e);
+ }
+ } else {
+ sideInputStateHandler = StateRequestHandler.unsupported();
+ }
+
+ final StateRequestHandler userStateRequestHandler;
+ if (executableStage.getUserStates().size() > 0) {
+ if (keyedStateInternals == null) {
+ throw new IllegalStateException("Input must be keyed when user state is used");
}
+ userStateRequestHandler =
+ StateRequestHandlers.forBagUserStateHandlerFactory(
+ stageBundleFactory.getProcessBundleDescriptor(),
+ new BagUserStateFactory(keyedStateInternals, getKeyedStateBackend()));
} else {
- return StateRequestHandler.unsupported();
+ userStateRequestHandler = StateRequestHandler.unsupported();
+ }
+
+ EnumMap<TypeCase, StateRequestHandler> handlerMap = new EnumMap<>(TypeCase.class);
+ handlerMap.put(TypeCase.MULTIMAP_SIDE_INPUT, sideInputStateHandler);
+ handlerMap.put(TypeCase.BAG_USER_STATE, userStateRequestHandler);
+
+ return StateRequestHandlers.delegateBasedUponType(handlerMap);
+ }
+
+ private static class BagUserStateFactory
+ implements StateRequestHandlers.BagUserStateHandlerFactory {
+
+ private final StateInternals stateInternals;
+ private final KeyedStateBackend<ByteBuffer> keyedStateBackend;
+
+ private BagUserStateFactory(
+ StateInternals stateInternals, KeyedStateBackend<ByteBuffer> keyedStateBackend) {
+
+ this.stateInternals = stateInternals;
+ this.keyedStateBackend = keyedStateBackend;
+ }
+
+ @Override
+ public <K, V, W extends BoundedWindow>
+ StateRequestHandlers.BagUserStateHandler<K, V, W> forUserState(
+ String pTransformId,
+ String userStateId,
+ Coder<K> keyCoder,
+ Coder<V> valueCoder,
+ Coder<W> windowCoder) {
+ return new StateRequestHandlers.BagUserStateHandler<K, V, W>() {
+ @Override
+ public Iterable<V> get(K key, W window) {
+ prepareStateBackend(key, keyCoder);
Review comment:
I've looked into the chaining code, and apparently the KeyedStateBackend is
scoped by operator id, so it shouldn't conflict even if the operators are
chained together.
I disabled chaining because I doubt that the asynchronous processing of
multiple elements (now that we merged #6723) works with other chained
operators; it might break the chain abstraction. I'll remove the chaining
change and investigate that separately.
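
To illustrate the scoping argument, here is a minimal toy sketch, not Flink or
Beam code. It assumes that state is addressed by operator id plus state id plus
key, which is one way to picture why two chained operators using the same key
would not see each other's state. All names (OperatorScopedStateSketch,
StateAddress, ToyStateStore, "op-1", "op-2") are hypothetical and chosen only
for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model only: none of these types are Flink or Beam classes. */
public class OperatorScopedStateSketch {

  /** Hypothetical composite address: operator id + state id + key. */
  static final class StateAddress {
    private final String operatorId;
    private final String stateId;
    private final String key;

    StateAddress(String operatorId, String stateId, String key) {
      this.operatorId = operatorId;
      this.stateId = stateId;
      this.key = key;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof StateAddress)) {
        return false;
      }
      StateAddress other = (StateAddress) o;
      return operatorId.equals(other.operatorId)
          && stateId.equals(other.stateId)
          && key.equals(other.key);
    }

    @Override
    public int hashCode() {
      return Objects.hash(operatorId, stateId, key);
    }
  }

  /** Hypothetical store standing in for operator-scoped keyed bag state. */
  static final class ToyStateStore {
    private final Map<StateAddress, List<String>> bags = new HashMap<>();

    /** Appends a value to the bag identified by (operatorId, stateId, key). */
    void append(String operatorId, String stateId, String key, String value) {
      bags.computeIfAbsent(new StateAddress(operatorId, stateId, key), a -> new ArrayList<>())
          .add(value);
    }

    /** Returns the bag contents, or an empty list if nothing was written. */
    List<String> get(String operatorId, String stateId, String key) {
      return bags.getOrDefault(
          new StateAddress(operatorId, stateId, key), Collections.emptyList());
    }
  }

  public static void main(String[] args) {
    ToyStateStore store = new ToyStateStore();
    // Two "chained" operators write to the same state id under the same key.
    store.append("op-1", "bag", "user-a", "x");
    store.append("op-2", "bag", "user-a", "y");
    System.out.println(store.get("op-1", "bag", "user-a")); // prints [x]
    System.out.println(store.get("op-2", "bag", "user-a")); // prints [y]
  }
}
```

Because the operator id is part of the address in this model, the two writes
land in separate bags. Whether the asynchronous processing mentioned above is
safe under chaining is a separate question that this sketch does not address.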
Issue Time Tracking
-------------------
Worklog Id: (was: 158169)
Time Spent: 11h 10m (was: 11h)
> Flink support for portable user state
> -------------------------------------
>
> Key: BEAM-2918
> URL: https://issues.apache.org/jira/browse/BEAM-2918
> Project: Beam
> Issue Type: Sub-task
> Components: runner-flink
> Reporter: Henning Rohde
> Assignee: Maximilian Michels
> Priority: Minor
> Labels: portability
> Time Spent: 11h 10m
> Remaining Estimate: 0h
>