lindong28 opened a new pull request, #230:
URL: https://github.com/apache/flink-ml/pull/230

   ## What is the purpose of the change
   
   Add utility methods that allow algorithm developers to co-group two DataStreams with the same semantics and similar performance as `DataSet#coGroup(...)`.
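
For readers unfamiliar with co-group semantics, here is a minimal sketch in plain Java collections, not the Flink implementation and not the signature added by this PR. The class `CoGroupSketch` and its `coGroup` helper are hypothetical names used only for illustration. It assumes the `DataSet#coGroup` behavior that the user function is invoked once per key present in either input, and receives an empty group for the side that has no records for that key. As a simplification, the function here returns one value per key instead of emitting through a collector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical illustration of co-group semantics; not part of Flink or Flink ML.
class CoGroupSketch {

    static <A, B, K extends Comparable<K>, O> List<O> coGroup(
            List<A> left,
            List<B> right,
            Function<A, K> leftKey,
            Function<B, K> rightKey,
            BiFunction<List<A>, List<B>, O> fn) {
        // Group each input by its key. TreeMap keeps keys sorted so the output order is deterministic.
        Map<K, List<A>> leftGroups = new TreeMap<>();
        for (A a : left) {
            leftGroups.computeIfAbsent(leftKey.apply(a), k -> new ArrayList<>()).add(a);
        }
        Map<K, List<B>> rightGroups = new TreeMap<>();
        for (B b : right) {
            rightGroups.computeIfAbsent(rightKey.apply(b), k -> new ArrayList<>()).add(b);
        }

        // Visit the union of keys, so keys present in only one input are still processed.
        TreeSet<K> keys = new TreeSet<>(leftGroups.keySet());
        keys.addAll(rightGroups.keySet());

        List<O> out = new ArrayList<>();
        for (K key : keys) {
            List<A> l = leftGroups.getOrDefault(key, Collections.emptyList());
            List<B> r = rightGroups.getOrDefault(key, Collections.emptyList());
            out.add(fn.apply(l, r)); // one call per key, with both groups
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a1", "a2", "b1");
        List<String> right = List.of("b9", "c9");
        // Key is the first character; the function reports the size of each group.
        List<String> result = coGroup(
                left, right,
                s -> s.substring(0, 1),
                s -> s.substring(0, 1),
                (l, r) -> l.size() + ":" + r.size());
        System.out.println(result); // prints [2:0, 1:1, 0:1]
    }
}
```

Key `a` appears only on the left and key `c` only on the right, yet the function is still called for both, which is what distinguishes co-group from an inner join.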
   
   Here are the results of running the benchmark specified in FLINK-31753's 
JIRA description:
   - DataSet#coGroup takes 27.6 seconds.
   - DataStreamUtils#coGroup takes 31.5 seconds.
   
   The DataStream implementation is roughly 14% slower than the DataSet one (31.5 seconds vs. 27.6 seconds). The performance difference should be negligible for real-world applications whose co-group function is non-trivial.
   
   ## Brief change log
   
   Added the static method `DataStreamUtils#coGroup(...)`.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
     - If yes, how is the feature documented? JavaDocs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
