GitHub user nickpan47 opened a pull request:
https://github.com/apache/samza/pull/475
SAMZA-1659: Serializable OperatorSpec
This change is to make the user supplied functions serializable. Hence,
making the full user defined DAG serializable.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickpan47/samza
serializable-opspec-only-Jan-24-18
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/samza/pull/475.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #475
----
commit 5573a069e95f6467b92d73b3cba37f035e067ae2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-12T04:20:36Z
WIP: new api revision
commit 8bb975204bd8a5ad3459642609c9e4992c495701
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-12T07:38:40Z
WIP: proto-type of input/output stream/system specs
commit aeb457303779e7f31b114ad23bfa9ebaafeddab6
Author: Xinyu Liu <xiliu@...>
Date: 2017-06-29T00:16:10Z
SAMZA-1321: Propagate end-of-stream and watermark messages
The patch completes the end-of-stream work flow across multi-stage
pipeline. It also contains initial commit for supporting watermarks. For
watermark, there are issues raised in the review feedback and will be addressed
by further prs. The main logic this patch adds:
- EndOfStreamManager aggregates the end-of-stream control messages,
propagate the result to to downstream intermediate topics based on the topology
of the IO in the StreamGraph.
- WatermarkManager aggregates the watermark control messages from the
upstage tasks, pass it through the operators, and propagate it to downstream.
In operator impl, I implemented similar watermark logic as Beam for
watermark propagation:
* InputWatermark(op) = min { OutputWatermark(op') | op1 is upstream of op}
* OutputWatermark(op) = min { InputWatermark(op), OldestWorkTime(op) }
Add quite a few unit tests and integration test. The code is 100% covered
as reported by Intellij. Both control messages work as expected.
Author: Xinyu Liu <[email protected]>
Reviewers: Yi Pan <[email protected]>
Closes #236 from xinyuiscool/SAMZA-1321
commit ae3dc6ff133fa3f0c7d9f27b6f724c652105680c
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-12T04:20:36Z
WIP: new api revision
commit 91f364f1e3e350c4c9d6413e620d9506a1da91c8
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-12T07:38:40Z
WIP: proto-type of input/output stream/system specs
commit b898e6c037e4fb4c9ddf1baee95095045e6500ab
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-13T01:43:50Z
WIP: new-api-v2
commit cd528c1c30646a3307c81770f3dc82482bba1aba
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-21T16:12:41Z
WIP: updated spec and user DAG API
commit 0bc7ee7babd0a64ed4d7cf6b7fab24be3a3447ae
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-06-30T16:55:09Z
WIP: update the user code example on new APIs
commit 51541e133216b9c730f57653f5f7afb6c56d21a5
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-07-11T16:07:32Z
WIP: cleanup StreamDescriptor
commit 3c50629eb93b674731398f9f8c756e499da5faf2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-07-27T22:05:07Z
WIP: adding support for low-level task APIs
commit 4a6a58dcb80fb2919f5bad42f6f1979960826d62
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-07-31T09:51:30Z
WIP: updated w/ low-level task API and global var ingestion/metrics reporter
commit 256155ad530c48af0cf60ba8945e8face0133a90
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-07-31T17:29:22Z
WIP: new API user code examples
commit f227380f20503734decca75c0f34aa5810b7f1c0
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-08-02T07:50:54Z
WIP: update the app runner classes
commit e6fb96e574aa9ee78b3beedd6921e4a680fa9c73
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-08-08T14:35:39Z
WIP: merged all application types into StreamApplications
commit 525d8bc1b96149299ad831d1ba3d458d6dd3e4b1
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-08-09T18:02:31Z
Merge branch '0.14.0' into new-api-v2
Conflicts:
samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
samza-core/src/main/java/org/apache/samza/processor/StreamProcessor.java
samza-core/src/main/java/org/apache/samza/runtime/LocalContainerRunner.java
commit 6fc6d4c09d13132ae3c915de6bb6223206cb4ca2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-08-25T16:49:18Z
WIP: experiment code to implement an end-to-end working example for new APIs
commit d7df6ed0ef3f6ca8ad5ee4cb7e7307c314a71619
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-08-31T22:41:41Z
WIP: added all unit test for OperatorSpec#copy methods.
commit dde1ab14bee0ff4ebbc15500a938b191295dbd90
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-09-06T16:43:37Z
WIP: first end-to-end test
commit 50201728e964cbe3c997b4e335df3cd2a46ed076
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-10-19T23:48:56Z
Merge branch 'experiment-new-api-v2' into new-api-v2-0.14
Conflicts:
samza-api/src/main/java/org/apache/samza/operators/MessageStream.java
samza-api/src/main/java/org/apache/samza/operators/StreamGraph.java
samza-api/src/main/java/org/apache/samza/operators/windows/Windows.java
samza-api/src/main/java/org/apache/samza/operators/windows/internal/WindowInternal.java
samza-api/src/main/java/org/apache/samza/serializers/Serde.java
samza-api/src/main/java/org/apache/samza/system/StreamSpec.java
samza-core/src/main/java/org/apache/samza/operators/MessageStreamImpl.java
samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
samza-core/src/main/java/org/apache/samza/operators/functions/PartialJoinFunction.java
samza-core/src/main/java/org/apache/samza/operators/impl/InputOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImplGraph.java
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/WindowOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/spec/InputOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/JoinOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpecs.java
samza-core/src/main/java/org/apache/samza/operators/spec/OutputOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/OutputStreamImpl.java
samza-core/src/main/java/org/apache/samza/operators/spec/SinkOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/StreamOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/WindowOperatorSpec.java
samza-core/src/main/java/org/apache/samza/runtime/AbstractApplicationRunner.java
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
samza-core/src/main/java/org/apache/samza/runtime/RemoteApplicationRunner.java
samza-core/src/main/java/org/apache/samza/task/StreamOperatorTask.java
samza-core/src/test/java/org/apache/samza/control/TestEndOfStreamManager.java
samza-core/src/test/java/org/apache/samza/control/TestIOGraph.java
samza-core/src/test/java/org/apache/samza/control/TestWatermarkManager.java
samza-core/src/test/java/org/apache/samza/example/BroadcastExample.java
samza-core/src/test/java/org/apache/samza/example/MergeExample.java
samza-core/src/test/java/org/apache/samza/execution/TestExecutionPlanner.java
samza-core/src/test/java/org/apache/samza/execution/TestJobGraphJsonGenerator.java
samza-core/src/test/java/org/apache/samza/operators/TestJoinOperator.java
samza-core/src/test/java/org/apache/samza/operators/TestMessageStreamImpl.java
samza-core/src/test/java/org/apache/samza/operators/TestStreamGraphImpl.java
samza-core/src/test/java/org/apache/samza/operators/TestWindowOperator.java
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImpl.java
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImplGraph.java
samza-core/src/test/java/org/apache/samza/operators/spec/TestWindowOperatorSpec.java
samza-core/src/test/java/org/apache/samza/runtime/TestApplicationRunnerMain.java
samza-core/src/test/java/org/apache/samza/runtime/TestLocalApplicationRunner.java
samza-test/src/main/java/org/apache/samza/example/KeyValueStoreExample.java
samza-test/src/main/java/org/apache/samza/example/OrderShipmentJoinExample.java
samza-test/src/main/java/org/apache/samza/example/PageViewCounterExample.java
samza-test/src/main/java/org/apache/samza/example/RepartitionExample.java
samza-test/src/main/java/org/apache/samza/example/WindowExample.java
samza-test/src/test/java/org/apache/samza/test/controlmessages/EndOfStreamIntegrationTest.java
samza-test/src/test/java/org/apache/samza/test/controlmessages/WatermarkIntegrationTest.java
samza-test/src/test/java/org/apache/samza/test/operator/RepartitionWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/SessionWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/StreamApplicationIntegrationTestHarness.java
samza-test/src/test/java/org/apache/samza/test/operator/TestRepartitionWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/TumblingWindowApp.java
samza-test/src/test/java/org/apache/samza/test/processor/TestZkLocalApplicationRunner.java
commit bf1ce907645e4a131c5765c4612c158e1855730a
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-10-20T00:43:33Z
WIP: removing StreamDescriptor first
commit d46403294aecc2ebdbc3a890da8c40acfa35f448
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-10-23T16:05:48Z
WIP: fixing unit tests after merge
commit 6a14b2afad8d57cccb17e5926cd44743b6ebd034
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-10-30T22:25:03Z
WIP: fixed unit test failure for Windows
commit 475a46bced8d84c7fe09b57590d3a72e0587c3c8
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-12-15T18:59:29Z
WIP: fixed TestZkLocalApplicationRunner. Debugging issues w/
TestRepartitionWindowApp (i.e. missing changelog creation step when directly
running LocalApplicationRunner)
commit dc7da87e2eec618bb060acfec695c5b1aa280259
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2017-12-22T18:27:21Z
WIP: unit tests for serialization
commit 4102aa8ced2b5630edf3ac84e0c6298956a999b5
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-01-02T17:36:29Z
WIP: continued working on potential offspring integration
commit 1670aff0cefb569b109fad3928f04190a7b4e0a1
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-01-10T22:20:22Z
WIP: class-loading of user program logic and main() method based user
program logic are both included in
ThreadJobFactory/ProcessJobFactory/YarnJobFactory. ThreadJobFactory test suite
to be fixed.
commit 0ebebfc3ba49de6d6589cd3b82eb8ce3882d034d
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-01-19T23:11:26Z
WIP: serialization only change
commit aca423085ed29452adc89c48aee4aa1126874328
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-01-25T07:52:16Z
WIP: Serialize OperatorSpec only w/o StreamApplication interface change.
Passed all build and tests.
commit b973b105ddb8cd49ea5f084ceebea19b27254f36
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-02-08T08:12:58Z
Merge branch 'master' into serializable-opspec-only-Jan-24-18
Conflicts:
samza-api/src/main/java/org/apache/samza/operators/MessageStream.java
samza-core/src/main/java/org/apache/samza/operators/MessageStreamImpl.java
samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImplGraph.java
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/PartialJoinOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/PartitionByOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/impl/WindowOperatorImpl.java
samza-core/src/main/java/org/apache/samza/operators/spec/InputOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpecs.java
samza-core/src/main/java/org/apache/samza/operators/spec/OutputStreamImpl.java
samza-core/src/main/java/org/apache/samza/operators/spec/PartitionByOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/spec/StreamOperatorSpec.java
samza-core/src/main/java/org/apache/samza/operators/stream/IntermediateMessageStreamImpl.java
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
samza-core/src/main/scala/org/apache/samza/coordinator/JobModelManager.scala
samza-core/src/test/java/org/apache/samza/execution/TestExecutionPlanner.java
samza-core/src/test/java/org/apache/samza/execution/TestJobGraphJsonGenerator.java
samza-core/src/test/java/org/apache/samza/operators/TestJoinOperator.java
samza-core/src/test/java/org/apache/samza/operators/TestMessageStreamImpl.java
samza-core/src/test/java/org/apache/samza/operators/TestStreamGraphImpl.java
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImpl.java
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImplGraph.java
samza-core/src/test/java/org/apache/samza/operators/impl/TestWindowOperator.java
samza-core/src/test/java/org/apache/samza/operators/spec/TestWindowOperatorSpec.java
samza-test/src/main/java/org/apache/samza/example/KeyValueStoreExample.java
samza-test/src/main/java/org/apache/samza/example/OrderShipmentJoinExample.java
samza-test/src/main/java/org/apache/samza/example/PageViewCounterExample.java
samza-test/src/main/java/org/apache/samza/example/RepartitionExample.java
samza-test/src/main/java/org/apache/samza/example/WindowExample.java
samza-test/src/test/java/org/apache/samza/test/controlmessages/EndOfStreamIntegrationTest.java
samza-test/src/test/java/org/apache/samza/test/controlmessages/WatermarkIntegrationTest.java
samza-test/src/test/java/org/apache/samza/test/operator/RepartitionJoinWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/SessionWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/TestRepartitionJoinWindowApp.java
samza-test/src/test/java/org/apache/samza/test/operator/TumblingWindowApp.java
samza-test/src/test/java/org/apache/samza/test/processor/TestZkLocalApplicationRunner.java
commit 7c8d1591cef6326f32b2ffd061cc52b902e37214
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date: 2018-02-12T08:29:26Z
WIP: working on unit tests for trigger, broadcast, join, table, and SQL UDF
function serialization
----
---