[GitHub] samza pull request #25: SAMZA-1054: Refactor Operator APIs

2016-11-28 Thread prateekm
GitHub user prateekm opened a pull request:

https://github.com/apache/samza/pull/25

SAMZA-1054: Refactor Operator APIs

Some suggestions for an Operator API refactor and misc. cleanup. It does 
contain some implementation changes, mostly due to deleted, extracted or merged 
classes (e.g. OperatorFactory + ChainedOperators == OperatorImpls).

Since git marked several moved classes as (delete + new) instead, it's 
probably best to apply the diff locally and browse the code in an IDE.

Some of the changes, in no particular order:
* Extracted XFunction interfaces into a .functions package in -api.
* -api's internal.Operators is now the -operators's spec.* package. 
Extracted interfaces and classes. Factory methods are now in OperatorSpecs.
* -api's MessageStreams is now -api's MessageStream interface and 
-operators's MessageStreamImpl.
* -api's internal.Windows classes are now in -api's .window package. 
Extracted interfaces and classes, but no implementation changes.
* OperatorFactory + ChainedOperators is now OperatorImpls, which is used 
from StreamOperatorAdaptorTask.
* Added a NoOpOperatorImpl, which acts as the root node for the 
OperatorImpl DAG returned by OperatorImpls.
* Removed usages of reactivestreams APIs since current code looks simpler 
without them. We can add them back when we need features like backpressure etc.
* Removed the InputSystemMessage interface.
* Made field names consistent (e.g. Fn suffix for functions everywhere).
* Some method/class visibility changes due to moved classes. Haven't done 
this for all the classes yet.
* General documentation changes, mostly to make public APIs clearer. 
Haven't done this for all the classes yet.

There are additional questions/tasks that we can address in future RBs:
* Updating Window and Trigger APIs.
* Merging samza-operator into samza-core.
* Questions about Message timestamp and Offset comparison semantics.
* Questions about OperatorSpec serialization (e.g. ID generation).
* Questions about StateStoreImpl and StoreFunctions.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prateekm/samza master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/samza/pull/25.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #25


commit 25f2f54d8ce9a418629c71c872eddf50b01c5a03
Author: Prateek Maheshwari 
Date:   2016-11-28T22:31:12Z

SAMZA-1054: Refactor Operator APIs





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Re: Review Request 54020: Operator API refactoring

2016-11-28 Thread Prateek Maheshwari

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54020/
---

(Updated Nov. 28, 2016, 2:11 p.m.)


Review request for samza, Jagadish Venkatraman and Yi Pan (Data Infrastructure).


Changes
---

Updated with Jagadish's feedback.
Inlined OperatorImpls#getOrCreateOperatorImpl() to remove dependency on k-v 
store Entry class.


Summary (updated)
-

Operator API refactoring


Repository: samza


Description (updated)
---

Some suggestions for an Operator API refactor and misc. cleanup. It does 
contain some implementation changes, mostly due to deleted, extracted or merged 
classes (e.g. OperatorFactory + ChainedOperators == OperatorImpls).

Since git marked several moved classes as (delete + new) instead, it's probably 
best to apply the diff locally and browse the code in an IDE.

Some of the changes, in no particular order:
* Extracted XFunction interfaces into a .functions package in -api.
* -api's internal.Operators is now the -operators's spec.* package. Extracted 
interfaces and classes. Factory methods are now in OperatorSpecs.
* -api's MessageStreams is now -api's MessageStream interface and -operators's 
MessageStreamImpl.
* -api's internal.Windows classes are now in -api's .window package. Extracted 
interfaces and classes, but no implementation changes.
* OperatorFactory + ChainedOperators is now OperatorImpls, which is used from 
StreamOperatorAdaptorTask.
* Added a NoOpOperatorImpl, which acts as the root node for the OperatorImpl 
DAG returned by OperatorImpls.
* Removed usages of reactivestreams APIs since current code looks simpler 
without them. We can add them back when we need features like backpressure etc.
* Removed the InputSystemMessage interface.
* Made field names consistent (e.g. Fn suffix for functions everywhere).
* Some method/class visibility changes due to moved classes. Haven't done this 
for all the classes yet.
* General documentation changes, mostly to make public APIs clearer. Haven't 
done this for all the classes yet.
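
As an illustration of the NoOpOperatorImpl item above, the following sketch shows how a pass-through root node can give an operator DAG a single entry point. The class shapes, the `registerNext`/`onNext`/`propagate` method names, and the filter and sink operators are assumptions made for this example, not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class OperatorDagSketch {

  // Hypothetical base class: each node forwards its results to downstream nodes.
  abstract static class OperatorImpl<M, RM> {
    private final List<OperatorImpl<RM, ?>> next = new ArrayList<>();

    void registerNext(OperatorImpl<RM, ?> op) {
      next.add(op);
    }

    // Subclasses call this to hand a result to every downstream operator.
    protected void propagate(RM result) {
      for (OperatorImpl<RM, ?> op : next) {
        op.onNext(result);
      }
    }

    abstract void onNext(M message);
  }

  // Root node: does no work itself, it only fans messages out to its children.
  static final class NoOpOperatorImpl<M> extends OperatorImpl<M, M> {
    @Override
    void onNext(M message) {
      propagate(message);
    }
  }

  static final class FilterOperatorImpl<M> extends OperatorImpl<M, M> {
    private final Predicate<M> filterFn;

    FilterOperatorImpl(Predicate<M> filterFn) {
      this.filterFn = filterFn;
    }

    @Override
    void onNext(M message) {
      if (filterFn.test(message)) {
        propagate(message);
      }
    }
  }

  // Terminal node that collects whatever reaches it.
  static final class SinkOperatorImpl<M> extends OperatorImpl<M, Void> {
    private final List<M> out;

    SinkOperatorImpl(List<M> out) {
      this.out = out;
    }

    @Override
    void onNext(M message) {
      out.add(message);
    }
  }

  // Builds root -> filter(even) -> sink and pushes 1..4 through the root.
  static List<Integer> runExample() {
    List<Integer> out = new ArrayList<>();
    NoOpOperatorImpl<Integer> root = new NoOpOperatorImpl<>();
    FilterOperatorImpl<Integer> evens = new FilterOperatorImpl<>(n -> n % 2 == 0);
    evens.registerNext(new SinkOperatorImpl<>(out));
    root.registerNext(evens);
    for (int i = 1; i <= 4; i++) {
      root.onNext(i);
    }
    return out;
  }
}
```

A caller such as a task adaptor then only needs a reference to the root and can treat every input stream's DAG uniformly, whatever operators hang below it.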


There are additional questions/tasks that we can address in future RBs:
* Updating Window and Trigger APIs.
* Merging samza-operator into samza-core.
* Questions about Message timestamp and Offset comparison semantics.
* Questions about OperatorSpec serialization (e.g. ID generation).
* Questions about StateStoreImpl and StoreFunctions.


Diffs (updated)
-

  samza-api/src/main/java/org/apache/samza/operators/MessageStream.java dede631e072c2803839242dbf08b0149bfb76adb 
  samza-api/src/main/java/org/apache/samza/operators/MessageStreams.java 51bf482d6ccc6720828bd13a128a4cce9d6204e1 
  samza-api/src/main/java/org/apache/samza/operators/StreamOperatorTask.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/TriggerBuilder.java 5b3f4d05cd1aa5ceecb2dd285966f3366d69cf92 
  samza-api/src/main/java/org/apache/samza/operators/WindowState.java 759f2d8766804b2b5e47edf6759b3e6e2923c466 
  samza-api/src/main/java/org/apache/samza/operators/Windows.java 6619f41596548056d08fc71822eb94262ca2f120 
  samza-api/src/main/java/org/apache/samza/operators/data/IncomingSystemMessage.java 3c9874dcffeaa9300e5dbc984301cb495483b0c8 
  samza-api/src/main/java/org/apache/samza/operators/data/InputSystemMessage.java 5c23e746456a5e3058600a41ad7a91857839d61c 
  samza-api/src/main/java/org/apache/samza/operators/data/Message.java 8441682a6446e9a10a173d1c5c8ae307a2fdd111 
  samza-api/src/main/java/org/apache/samza/operators/functions/FilterFunction.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/functions/JoinFunction.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/functions/SinkFunction.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/internal/Operators.java f06387c9f49592a901ce01d996aa7ae7c95e89e2 
  samza-api/src/main/java/org/apache/samza/operators/internal/Trigger.java 3b50e2b8f640851a974ad01bc92425e50319692f 
  samza-api/src/main/java/org/apache/samza/operators/internal/WindowFn.java 489e5b855b5de44b7ebf2c93b956ec50b7398ffb 
  samza-api/src/main/java/org/apache/samza/operators/internal/WindowOutput.java 643b703648146f0bbae3f111099778b8c228d7b4 
  samza-api/src/main/java/org/apache/samza/operators/task/StreamOperatorTask.java 42c8f74544b5fd66af843ef7be2b38dd8e494253 
  samza-api/src/main/java/org/apache/samza/operators/windows/SessionWindow.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/windows/StoreFunctions.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/windows/Window.java PRE-CREATION 
  samza-api/src/main/java/org/apache/samza/operators/windows/Windows.java PRE-CREATION 
  samza-api/src/test/java/org/apache/samza/operators/TestMessage.java 8c5628757883168ce3f5a324884b967401b615fa 
  samza-api/src/test/java/org/apache/samza/operators/TestMe

[GitHub] samza pull request #24: SAMZA-1047: testEndOfStreamWithOutOfOrderProcess is ...

2016-11-28 Thread banecogic
GitHub user banecogic opened a pull request:

https://github.com/apache/samza/pull/24

SAMZA-1047: testEndOfStreamWithOutOfOrderProcess is flaky

The chooser always polls the end-of-stream message until the final callback is 
triggered, so the process-envelopes metrics didn't match. Changed the test 
methods to return null envelopes after the end-of-stream message.

Deleted the unused boolean variable "completed" in AsyncTaskWorker.
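
The effect of the test change can be sketched roughly as follows. This is a hypothetical stand-in, not the actual test code: `StubChooser`, its `repeatLast` flag, `countProcessed`, and the string envelopes are all invented for illustration. It shows why a stub that keeps handing out the end-of-stream message makes the processed-envelope count depend on how many times it is polled, while a stub that returns null afterwards gives a stable count.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

final class EndOfStreamStubSketch {
  static final String END_OF_STREAM = "END_OF_STREAM";

  // Hypothetical stub of a message source used by a test.
  static final class StubChooser {
    private final Queue<String> pending;
    private final boolean repeatLast;

    StubChooser(boolean repeatLast, String... envelopes) {
      this.repeatLast = repeatLast;
      this.pending = new ArrayDeque<>(Arrays.asList(envelopes));
    }

    // With repeatLast, the final envelope is returned on every later call.
    // Without it, the stub returns null once all envelopes are handed out.
    String choose() {
      if (repeatLast && pending.size() == 1) {
        return pending.peek();
      }
      return pending.poll();
    }
  }

  // Counts non-null envelopes seen over a fixed number of polls,
  // standing in for a processed-envelopes style metric.
  static int countProcessed(StubChooser chooser, int polls) {
    int processed = 0;
    for (int i = 0; i < polls; i++) {
      if (chooser.choose() != null) {
        processed++;
      }
    }
    return processed;
  }
}
```

With three envelopes and ten polls, the repeating stub yields a count of 10 and the null-returning stub yields 3, so only the latter gives a value that does not depend on callback timing.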

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/banecogic/samza SAMZA-1047

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/samza/pull/24.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #24


commit e7aacf562dfa8b67b9e85ae8d77481f0a3d4cf89
Author: banecogic 
Date:   2016-11-28T10:25:24Z

SAMZA-1047: testEndOfStreamWithOutOfOrderProcess is flaky



