[ 
https://issues.apache.org/jira/browse/FLINK-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970524#comment-15970524
 ] 

Pattarawat Chormai commented on FLINK-2032:
-------------------------------------------

Hi all,

I searched GitHub using [1] and found several tests that have not yet been 
refactored to use _collect_.

{code}
flink-streaming-scala/src/test/scala/org/apache/flink/streaming/api/scala/StreamingOperatorsITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/functions/ClosureCleanerITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/io/ScalaCsvReaderWithPOJOITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/AggregateITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/CoGroupITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/DistinctITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/ExamplesITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FilterITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FirstNITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/FlatMapITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/JoinITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/MapITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/OuterJoinITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/PartitionITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/operators/ReduceITCase.scala
flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/ScalaSpecialTypesITCase.scala

flink-connectors/flink-avro/src/test/java/org/apache/flink/api/io/avro/AvroPojoTest.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopMapFunctionITCase.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopReduceCombineFunctionITCase.java
flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapred/HadoopReduceFunctionITCase.java
flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/CEPITCase.java
flink-libraries/flink-gelly-examples/src/test/java/org/apache/flink/graph/test/examples/IncrementalSSSPITCase.java
flink-tests/src/test/java/org/apache/flink/test/iterative/aggregators/AggregatorsITCase.java
flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/DataSinkITCase.java
{code}

I would suggest creating two additional subtasks, one for Scala and one for 
Java, and I can help finish them. What do you think?

[1] 
https://github.com/apache/flink/search?p=5&q=TemporaryFolder+write&type=&utf8=%E2%9C%93

> Migrate integration tests from temp output files to collect()
> -------------------------------------------------------------
>
>                 Key: FLINK-2032
>                 URL: https://issues.apache.org/jira/browse/FLINK-2032
>             Project: Flink
>          Issue Type: Task
>          Components: Tests
>    Affects Versions: 0.9
>            Reporter: Fabian Hueske
>            Priority: Minor
>              Labels: starter
>
> Most of Flink's integration tests that execute full Flink programs and check 
> their results are implemented by writing results to a temporary output file 
> and comparing the content of the file to a provided set of expected Strings. 
> Flink's test utils make this quite comfortable and hide a lot of the 
> complexity of this approach. Nonetheless, this approach has a few drawbacks:
> - increased latency by going through disk
> - comparison is on String representation of objects
> - depends on the file system
> Since Flink's {{collect()}} feature was added, the temp file approach is not 
> the best approach anymore. Instead, tests can collect the result of a Flink 
> program directly as objects and compare these against a set of expected 
> objects.
> It would be good to migrate the existing test base to use {{collect()}} 
> instead of temporary output files.
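
The contrast described above can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use Flink, and {{runProgram}}, {{checkViaTempFile}} and {{checkViaCollect}} are hypothetical names. In a real test, the list returned by {{runProgram}} would come from calling {{collect()}} on the program's result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class CollectVsTempFile {

    // Hypothetical stand-in for a Flink program: doubles each input value.
    // Assumption: a real test would obtain this list via collect().
    static List<Integer> runProgram(List<Integer> input) {
        return input.stream().map(x -> x * 2).collect(Collectors.toList());
    }

    // Temp-file style: write results to disk, read them back as text,
    // and compare sorted lines against newline-separated expected Strings.
    static boolean checkViaTempFile(List<Integer> input, String expected) {
        try {
            Path out = Files.createTempFile("result", ".txt");
            try {
                List<String> lines = runProgram(input).stream()
                        .map(String::valueOf)
                        .collect(Collectors.toList());
                Files.write(out, lines);
                List<String> actual = new ArrayList<>(Files.readAllLines(out));
                List<String> exp = new ArrayList<>(Arrays.asList(expected.split("\n")));
                // Sort both sides so the check does not depend on result order.
                Collections.sort(actual);
                Collections.sort(exp);
                return actual.equals(exp);
            } finally {
                Files.deleteIfExists(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Collect style: compare result objects directly, with no disk access
    // and no detour through String representations.
    static boolean checkViaCollect(List<Integer> input, List<Integer> expected) {
        List<Integer> actual = new ArrayList<>(runProgram(input));
        List<Integer> exp = new ArrayList<>(expected);
        Collections.sort(actual);
        Collections.sort(exp);
        return actual.equals(exp);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(checkViaTempFile(input, "2\n4\n6"));
        System.out.println(checkViaCollect(input, Arrays.asList(2, 4, 6)));
    }
}
```

Both checks accept the same results here, but the second one needs no file system and compares typed values rather than text, which is the motivation given in the description.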



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
