[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-22 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Hello @thvasilo @greghogan OK, I've updated the documentation. I'll stay tuned for updating the code. Regards Thomas --- If your project is set up for it, you can reply

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-21 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan @thvasilo What's the next step? More tests and reviews?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-16 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan Ok I've pushed the code with my tests and some modifications in mapping @thvasilo It seems to work perfectly!

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-14 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan I've not pushed the code yet because my tests are still incorrect. Indeed the following code: val env = ExecutionEnvironment.getExecutionEnvironment val fitData

[GitHub] flink pull request #2740: [FLINK-4964] [ml]

2016-11-09 Thread tfournier314
Github user tfournier314 commented on a diff in the pull request: https://github.com/apache/flink/pull/2740#discussion_r87295380 --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StringIndexer.scala --- @@ -0,0 +1,108 @@ +package

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-09 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @thvasilo @greghogan I've updated my code so that I'm streaming instead of caching with a collect(). Does that seem OK to you?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-04 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 I've changed my code so that I have now mapping:DataSet[(String,Long)] val mapping = input .mapWith( s => (s, 1) ) .groupBy( 0 ) .reduce( (a, b) =>

[GitHub] flink pull request #2740: [FLINK-4964] [ml]

2016-11-03 Thread tfournier314
Github user tfournier314 commented on a diff in the pull request: https://github.com/apache/flink/pull/2740#discussion_r86402178 --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StringIndexer.scala --- @@ -0,0 +1,163 @@ +/* + * Licensed

[GitHub] flink pull request #2740: [FLINK-4964] [ml]

2016-11-03 Thread tfournier314
Github user tfournier314 commented on a diff in the pull request: https://github.com/apache/flink/pull/2740#discussion_r86402282 --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StringIndexer.scala --- @@ -0,0 +1,163 @@ +/* + * Licensed

[GitHub] flink issue #2684: [FLINK-4865] [ml] Add EvaluateDataSet operation for Label...

2016-11-03 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2684 @thvasilo @tillrohrmann Ok I will investigate this

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-02 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Yes, I've just updated the PR title

[GitHub] flink issue #2740: Implement StringIndexer

2016-11-02 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 I'd like to make this operation scalable: - by sorting globally - then by doing something like "zipWithIndex" (don't know if it is possible) - then collect()
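The three steps named in this comment (sort globally, attach an index, collect) can be sketched on a local collection. This is only an illustration, not the code from the pull request: a plain Scala `Seq` stands in for a Flink `DataSet`, the object and method names (`StringIndexerSketch`, `fitIndex`) are hypothetical, and ordering labels by descending frequency is an assumption, since the comment only says "sorting globally".

```scala
// Local-collection sketch only: a plain Scala Seq stands in for a Flink DataSet.
// StringIndexerSketch and fitIndex are hypothetical names, not part of Flink.
object StringIndexerSketch {

  /** Builds a label -> index mapping. Assumption: most frequent label gets index 0. */
  def fitIndex(labels: Seq[String]): Map[String, Long] = {
    // Count the occurrences of each distinct label.
    val counts: Seq[(String, Long)] =
      labels.groupBy(identity).map { case (label, group) => (label, group.size.toLong) }.toSeq

    // Sort globally: descending count, ties broken alphabetically for determinism.
    val sorted = counts.sortBy { case (label, count) => (-count, label) }

    // Attach a consecutive index in sorted order, then materialize the mapping.
    sorted.zipWithIndex.map { case ((label, _), idx) => (label, idx.toLong) }.toMap
  }

  def main(args: Array[String]): Unit = {
    val mapping = fitIndex(Seq("a", "b", "a", "c", "b", "a"))
    println(mapping)
  }
}
```

On a local collection the final `toMap` plays the role of `collect()`. In a distributed setting the sort and the indexing would need operators that work across partitions, which is the open question the comment raises.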

[GitHub] flink pull request #2740: Implement StringIndexer

2016-11-02 Thread tfournier314
GitHub user tfournier314 opened a pull request: https://github.com/apache/flink/pull/2740 Implement StringIndexer Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all

[GitHub] flink issue #2684: Add EvaluateDataSet Operation for LabeledVector - This cl...

2016-10-27 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2684 OK, thanks! We don't need to worry about the Jenkins/Travis failures, do we?

[GitHub] flink pull request #2684: Add EvaluateDataSet Operation for LabeledVector - ...

2016-10-23 Thread tfournier314
GitHub user tfournier314 opened a pull request: https://github.com/apache/flink/pull/2684 Add EvaluateDataSet Operation for LabeledVector - This closes #4865

[GitHub] flink pull request #2668: Add EvaluateDataSetOperation for LabeledVector. Th...

2016-10-23 Thread tfournier314
Github user tfournier314 closed the pull request at: https://github.com/apache/flink/pull/2668

[GitHub] flink issue #2668: Add EvaluateDataSetOperation for LabeledVector. This clos...

2016-10-23 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2668 I'm using IntelliJ and I can't remove the Scaladoc and use Javadoc instead

[GitHub] flink pull request #2668: Add EvaluateDataSetOperation for LabeledVector. Th...

2016-10-20 Thread tfournier314
GitHub user tfournier314 opened a pull request: https://github.com/apache/flink/pull/2668 Add EvaluateDataSetOperation for LabeledVector. This closes #4865