[ https://issues.apache.org/jira/browse/FLINK-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294948#comment-14294948 ]
Ufuk Celebi commented on FLINK-1459:
------------------------------------

Hey John,

support for this kind of operation is coming up very soon. What you can currently do is:

1) For local setups: use the LocalCollectionOutputFormat
2) For remote setups: use the RemoteCollectorOutputFormat

As you already mentioned, both are far from ideal solutions.

There is a pull request to support count and collect here: https://github.com/apache/flink/pull/210

This should be merged into master soon. You can find a related discussion here:
http://mail-archives.apache.org/mod_mbox/flink-dev/201501.mbox/%3CCAEXqXcY1LKMHg_36zj=e9wGzSh4cX82Jpq2H_=msaza4au_...@mail.gmail.com%3E

> Collect DataSet to client
> -------------------------
>
>                 Key: FLINK-1459
>                 URL: https://issues.apache.org/jira/browse/FLINK-1459
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: John Sandiford
>
> Hi, I may well have missed something obvious here but I cannot find an easy
> way to extract the values in a DataSet to the client. Spark has collect,
> collectAsMap etc...
> (I need to pass the values from a small aggregated DataSet back to a machine
> learning library which is controlling the iterations.)
> The only way I could find to do this was to implement my own in memory
> OutputFormat. This is not ideal, but does work.
> Many thanks, John
>
> val env = ExecutionEnvironment.getExecutionEnvironment
> val data: DataSet[Double] = env.fromElements(1.0, 2.0, 3.0, 4.0)
> val result = data.reduce((a, b) => a)
> val valuesOnClient = result.???
> env.execute("Simple example")

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)