Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/939#issuecomment-126380793
  
    Hi @jamescao, thanks for your contribution. I have read this pull request 
and have some comments.
    
    First, the test cases are insufficient. There are no tests covering the 
Scala API, case classes, or POJOs, even though `CsvInputFormat` supports 
tuples, case classes, and POJOs. We should add test cases for all of them.
    
    Second, I think the separate Scala implementation is not a good approach. 
The Scala API already contains logic for reading CSV files (the 
`readCsvFile()` method), and I think reusing that method is better than the 
current implementation. `readCsvFile()` already supports case classes, POJOs, 
and tuples, and it is well tested. By reusing it, we can remove a lot of 
duplicated code.
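    As a rough sketch of what reusing `readCsvFile()` could look like (the 
`Person` case class and the file path here are hypothetical, and this assumes 
the Flink Scala API is on the classpath):

```scala
import org.apache.flink.api.scala._

// Hypothetical case class matching a two-column CSV file
case class Person(name: String, age: Int)

object ReadCsvSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // readCsvFile derives the field parsers from the case class,
    // so no separate Scala-side CSV parsing logic is needed
    val people: DataSet[Person] = env.readCsvFile[Person]("/tmp/people.csv")
    people.print()
  }
}
```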
    
    Third, this is just my opinion, but I think file-based test cases hurt 
testing performance. The Flink community has been working to improve the 
stability and performance of the test suite, and we are migrating test cases 
to compare results in memory. Calling `collect()` and comparing the resulting 
list or array is sufficient. I know some existing test cases use files, but 
for new test cases, comparing results in memory is preferable.
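    To illustrate the in-memory style (the object name and data are 
hypothetical; this assumes the Flink Scala API is available):

```scala
import org.apache.flink.api.scala._

// Sketch of an in-memory test: build the DataSet from elements,
// collect() the result, and compare it against an expected sequence,
// so no temporary files are needed
object CollectBasedTestSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val input = env.fromElements(1, 2, 3)
    val doubled = input.map(_ * 2)

    val result = doubled.collect().sorted
    assert(result == Seq(2, 4, 6), s"unexpected result: $result")
  }
}
```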
    
    Fourth, `CsvOptions` does not follow the same pattern as other 
configuration objects in Flink. Many configuration objects in Flink implement 
the builder pattern: each setter method returns the object itself so calls can 
be chained. You can use the `CsvReader` class as a reference.
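    For illustration, a builder-style setter could look like the following 
(the field names here are hypothetical; see `CsvReader` for the actual 
convention used in Flink):

```scala
// Hypothetical CsvOptions rewritten in the builder style used by Flink
// configuration objects such as CsvReader: every setter mutates the
// option and returns `this` so calls can be chained.
class CsvOptions {
  private var delimiter: String = ","
  private var skipFirstLine: Boolean = false

  def fieldDelimiter(d: String): CsvOptions = {
    delimiter = d
    this // returning this enables method chaining
  }

  def ignoreFirstLine(): CsvOptions = {
    skipFirstLine = true
    this
  }

  override def toString: String =
    s"CsvOptions(delimiter=$delimiter, skipFirstLine=$skipFirstLine)"
}

object BuilderSketch {
  def main(args: Array[String]): Unit = {
    // Setters chain naturally because each one returns the same object
    val opts = new CsvOptions().fieldDelimiter("|").ignoreFirstLine()
    println(opts)
  }
}
```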
    


