[GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/2178#discussion_r68848969 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -408,6 +408,79 @@ public static GraphCsvReader fromCsvReader(String edgesPath, ExecutionEnvironmen } /** +* Creates a graph from a Adjacency List text file with Vertex Key values. Edges will be created automatically. +* +* @param filePath a path to an Adjacency List text file with the Vertex data +* @param context the execution environment. +* @return An instance of {@link org.apache.flink.graph.GraphAdjacencyListReader}, +* on which calling methods to specify types of the Vertex ID, Vertex value and Edge value returns a Graph. +*/ + public static GraphAdjacencyListReader fromAdjacencyListFile(String filePath, ExecutionEnvironment context) { + return new GraphAdjacencyListReader(filePath, context); + } + + /** +* Writes a graph as an Adjacency List formatted text file in a user specified folder. +* +* @param filePath the path that the Adjacency List formatted text file should be written in +* @param delimiters the delimiters that separate the different value types in the Adjacency List formatted text +* file. Delimiters should be provided with the following order: +* NEIGHBOR_DELIMITER : separating source from its neighbors +* VERTICES_DELIMITER : separating the different neighbors of a source vertex +* VERTEX_VALUE_DELIMITER: separating the source vertex-id from the vertex value, as well as the +* target vertex-ids from the edge value. +*/ + public void writeAsAdjacencyList(String filePath, String... delimiters) { + + final String NEIGHBOR_DELIMITER = delimiters.length > 0 ? delimiters[0] : "\t"; + + final String VERTICES_DELIMITER = delimiters.length > 1 ? delimiters[1] : ","; + + final String VERTEX_VALUE_DELIMITER = delimiters.length > 1 ? 
delimiters[2] : "-"; --- End diff -- You mean the error in this declaration: ```java final String VERTEX_VALUE_DELIMITER = delimiters.length > 1 ? delimiters[2] : "-"; ``` and not to check directly for length greater than two, because in that way the user will have to provide all three delimiters or none. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/2178#discussion_r68848469 --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -408,6 +408,79 @@ public static GraphCsvReader fromCsvReader(String edgesPath, ExecutionEnvironmen } /** +* Creates a graph from a Adjacency List text file with Vertex Key values. Edges will be created automatically. +* +* @param filePath a path to an Adjacency List text file with the Vertex data +* @param context the execution environment. +* @return An instance of {@link org.apache.flink.graph.GraphAdjacencyListReader}, +* on which calling methods to specify types of the Vertex ID, Vertex value and Edge value returns a Graph. +*/ + public static GraphAdjacencyListReader fromAdjacencyListFile(String filePath, ExecutionEnvironment context) { + return new GraphAdjacencyListReader(filePath, context); + } + + /** +* Writes a graph as an Adjacency List formatted text file in a user specified folder. +* +* @param filePath the path that the Adjacency List formatted text file should be written in +* @param delimiters the delimiters that separate the different value types in the Adjacency List formatted text +* file. Delimiters should be provided with the following order: +* NEIGHBOR_DELIMITER : separating source from its neighbors +* VERTICES_DELIMITER : separating the different neighbors of a source vertex +* VERTEX_VALUE_DELIMITER: separating the source vertex-id from the vertex value, as well as the +* target vertex-ids from the edge value. +*/ + public void writeAsAdjacencyList(String filePath, String... delimiters) { + + final String NEIGHBOR_DELIMITER = delimiters.length > 0 ? delimiters[0] : "\t"; + + final String VERTICES_DELIMITER = delimiters.length > 1 ? delimiters[1] : ","; + + final String VERTEX_VALUE_DELIMITER = delimiters.length > 1 ? 
delimiters[2] : "-"; + + + DataSet<Tuple2<K, VV>> vertices = this.getVerticesAsTuple2(); + + DataSet<Tuple3<K, K, EV>> edgesNValues = this.getEdgesAsTuple3(); --- End diff -- As I see now, we don't have to convert the vertex set to tuple2 set, so I already changed that. Regarding the edges dataset, in order to write the Adjacency List file, I use the coGroup transformation to the Vertex dataset and EdgesAsTuple3 dataset, where the vertexId equals the source of the edge. In that case, even when a Vertex is source to no edges (e.g. has only incoming edges), I can still have the vertexId in the "coGrouped" dataset (I couldn't do that with a join). I can't think how I could use the Edge dataset in a coGroup or similar transformation. Please let me know if you have any suggestions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/2178#discussion_r68846112 --- Diff: flink-libraries/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala --- @@ -1127,8 +1194,7 @@ TypeInformation : ClassTag](jgraph: jg.Graph[K, VV, EV]) { * * @param analytic the analytic to run on the Graph */ - def run[T: TypeInformation : ClassTag](analytic: GraphAnalytic[K, VV, EV, T]): - GraphAnalytic[K, VV, EV, T] = { + def run[T: TypeInformation : ClassTag](analytic: GraphAnalytic[K, VV, EV, T])= { --- End diff -- No, I will revert the change.
[GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...
GitHub user fobeligi opened a pull request: https://github.com/apache/flink/pull/2178 [Flink-1815] Add methods to read and write a Graph as adjacency list Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes. - [ ] General - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text") - The pull request addresses only one issue - Each commit in the PR has a meaningful commit message (including the JIRA id) - [ ] Documentation - Documentation has been added for new functionality - Old documentation affected by the pull request has been updated - JavaDoc for public methods has been added - [ ] Tests & Build - Functionality added by the pull request is covered by tests - `mvn clean verify` has been executed successfully locally or a Travis build has passed You can merge this pull request into a Git repository by running: $ git pull https://github.com/fobeligi/incubator-flink FLINK-1815 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2178.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2178 commit 3a9502da61b7758e1383803d5141a16fe3a5777a Author: fobeligi <faybeligia...@gmail.com> Date: 2016-06-22T16:11:23Z [FLINK-1815] Add GraphAdjacencyListReader class to read an Adjacency List formatted text file. Moreover, add writeAsAdjacencyList method to Graph. Test cases are also added for each new method. 
commit 8aab5b40e031b132c46782a5908d58cc6290892f Author: fobeligi <faybeligia...@gmail.com> Date: 2016-06-28T08:49:03Z [FLINK-1815] Add fromAdjacencyListFile and writeAsAdjacencyList methods to Graph scala API. Tests are also added.
[GitHub] flink pull request: [Flink 1844] Add Normaliser to ML library
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/798#discussion_r31894794 --- Diff: docs/libs/ml/minMax_scaler.md --- @@ -0,0 +1,113 @@ +--- +mathjax: include +htmlTitle: FlinkML - MinMax Scaler +title: <a href="../ml">FlinkML</a> - MinMax Scaler +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +* This will be replaced by the TOC +{:toc} + +## Description + + The MinMax scaler scales the given data set, so that all values will lie between a user specified range [min,max]. + In case the user does not provide a specific minimum and maximum value for the scaling range, the MinMax scaler transforms the features of the input data set to lie in the [0,1] interval. + Given a set of input data $x_1, x_2,... x_n$, with minimum value: + + $$x_{min} = min({x_1, x_2,..., x_n})$$ + + and maximum value: + + $$x_{max} = max({x_1, x_2,..., x_n})$$ + +The scaled data set $z_1, z_2,...,z_n$ will be: + + $$z_{i}= \frac{x_{i} - x_{min}}{x_{max} - x_{min}} \left ( max - min \right ) + min$$ + +where $\textit{min}$ and $\textit{max}$ are the user specified minimum and maximum values of the range to scale. + +## Operations + +`MinMaxScaler` is a `Transformer`. +As such, it supports the `fit` and `transform` operation. 
+ +### Fit + +MinMaxScaler is trained on all subtypes of `Vector` or `LabeledVector`: + +* `fit[T <: Vector]: DataSet[T] => Unit` +* `fit: DataSet[LabeledVector] => Unit` + +### Transform + +MinMaxScaler transforms all subtypes of `Vector` or `LabeledVector` into the respective type: + +* `transform[T <: Vector]: DataSet[T] => DataSet[T]` +* `transform: DataSet[LabeledVector] => DataSet[LabeledVector]` + +## Parameters + +The MinMax scaler implementation can be controlled by the following two parameters: + + <table class="table table-bordered"> + <thead> +<tr> + <th class="text-left" style="width: 20%">Parameters</th> + <th class="text-center">Description</th> +</tr> + </thead> + + <tbody> +<tr> + <td><strong>Min</strong></td> + <td> +<p> + The minimum value of the range for the scaled data set. (Default value: <strong>0.0</strong>) +</p> + </td> +</tr> +<tr> + <td><strong>Std</strong></td> --- End diff -- Yes, you are right!
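The scaling formula in the documentation above is easy to check numerically. A hypothetical standalone Java sketch (not the FlinkML implementation) applying z_i = (x_i - x_min)/(x_max - x_min) * (max - min) + min to one feature column:

```java
// Hypothetical sketch of the MinMax formula from the docs above:
// z_i = (x_i - x_min) / (x_max - x_min) * (max - min) + min
public class MinMaxFormula {

    // Scale the values of x into the user-specified [targetMin, targetMax] range.
    static double[] scale(double[] x, double targetMin, double targetMax) {
        double xMin = Double.POSITIVE_INFINITY;
        double xMax = Double.NEGATIVE_INFINITY;
        for (double v : x) {
            xMin = Math.min(xMin, v);
            xMax = Math.max(xMax, v);
        }
        double[] z = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            z[i] = (x[i] - xMin) / (xMax - xMin) * (targetMax - targetMin) + targetMin;
        }
        return z;
    }

    public static void main(String[] args) {
        // Default range [0, 1]: the minimum maps to 0, the maximum to 1.
        double[] z = scale(new double[]{2.0, 4.0, 6.0}, 0.0, 1.0);
        System.out.println(z[0] + " " + z[1] + " " + z[2]);
    }
}
```

With targetMin = 0 and targetMax = 1 the formula reduces to the plain (x - x_min)/(x_max - x_min) normalization, matching the default behavior described in the docs.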
[GitHub] flink pull request: [Flink 1844] Add Normaliser to ML library
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/798#discussion_r31895883 --- Diff: flink-staging/flink-ml/src/test/scala/org/apache/flink/ml/preprocessing/MinMaxScalerITSuite.scala --- @@ -0,0 +1,180 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.flink.ml.preprocessing + +import breeze.linalg +import org.apache.flink.api.scala._ +import org.apache.flink.ml.common.LabeledVector +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{DenseVector, Vector} +import org.apache.flink.test.util.FlinkTestBase +import org.scalatest.{FlatSpec, Matchers} + + +class MinMaxScalerITSuite + extends FlatSpec + with Matchers + with FlinkTestBase { + + behavior of "Flink's MinMax Scaler" + + import MinMaxScalerData._ + + it should "scale the vectors' values to be restricted in the (0.0,1.0) range" in { +val env = ExecutionEnvironment.getExecutionEnvironment + +val dataSet = env.fromCollection(data) +val minMaxScaler = MinMaxScaler() +minMaxScaler.fit(dataSet) +val scaledVectors = minMaxScaler.transform(dataSet).collect + +scaledVectors.length should equal(data.length) + +for (vector <- scaledVectors) { + val test = vector.asBreeze.forall(fv => { +fv >= 0.0 && fv <= 1.0 --- End diff -- In this case I will use the same method as in the implementation of the transformer: calculating the min and max of each feature and then applying the formula which I explain in the documentation. Is that OK?
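The verification strategy described here — recompute the per-feature min and max over the whole dataset, apply the formula, and check the bounds — can be sketched in plain Java (hypothetical names, not the ITSuite code):

```java
// Plain-Java sketch (hypothetical names) of the test strategy discussed above:
// compute per-feature min/max over all vectors, scale each feature column
// independently, then check every scaled value lands in [0, 1].
public class MinMaxCheck {

    // Scale each feature (column) of `data` to [0, 1] independently.
    static double[][] scaleTo01(double[][] data) {
        int features = data[0].length;
        double[] min = new double[features];
        double[] max = new double[features];
        java.util.Arrays.fill(min, Double.POSITIVE_INFINITY);
        java.util.Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : data) {
            for (int j = 0; j < features; j++) {
                min[j] = Math.min(min[j], row[j]);
                max[j] = Math.max(max[j], row[j]);
            }
        }
        double[][] scaled = new double[data.length][features];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < features; j++) {
                scaled[i][j] = (data[i][j] - min[j]) / (max[j] - min[j]);
            }
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, 10.0}, {2.0, 30.0}, {3.0, 20.0}};
        for (double[] row : scaleTo01(data)) {
            for (double v : row) {
                // Every scaled value must lie within the [0, 1] range.
                if (v < 0.0 || v > 1.0) throw new AssertionError("out of range: " + v);
            }
        }
        System.out.println("all values in [0,1]");
    }
}
```

Recomputing the expected values this way (rather than only asserting the bounds) is a stronger check, since it also catches a scaler that, say, swaps min and max per feature.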
[GitHub] flink pull request: [FLINK-1844] [ml] Add Normaliser to ML library
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/798#discussion_r31913634 --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/MinMaxScaler.scala --- @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.ml.preprocessing + +import breeze.linalg +import breeze.linalg.{max, min} +import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.api.scala._ +import org.apache.flink.ml._ +import org.apache.flink.ml.common.{LabeledVector, Parameter, ParameterMap} +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{BreezeVectorConverter, Vector} +import org.apache.flink.ml.pipeline.{FitOperation, TransformOperation, Transformer} +import org.apache.flink.ml.preprocessing.MinMaxScaler.{Max, Min} + +import scala.reflect.ClassTag + +/** Scales observations, so that all features are in a user-specified range. + * By default for [[MinMaxScaler]] transformer range = [0,1]. + * + * This transformer takes a subtype of [[Vector]] of values and maps it to a + * scaled subtype of [[Vector]] such that each feature lies between a user-specified range. 
+ * + * This transformer can be prepended to all [[Transformer]] and + * [[org.apache.flink.ml.pipeline.Predictor]] implementations which expect as input a subtype + * of [[Vector]]. + * + * @example + * {{{ + * val trainingDS: DataSet[Vector] = env.fromCollection(data) + * val transformer = MinMaxScaler().setMin(-1.0) + * + * transformer.fit(trainingDS) + * val transformedDS = transformer.transform(trainingDS) + * }}} + * + * =Parameters= + * + * - [[Min]]: The minimum value of the range of the transformed data set; by default equal to 0 + * - [[Max]]: The maximum value of the range of the transformed data set; by default + * equal to 1 + */ +class MinMaxScaler extends Transformer[MinMaxScaler] { + + var metricsOption: Option[DataSet[(linalg.Vector[Double], linalg.Vector[Double])]] = None --- End diff -- I am using the metricsOption vectors internally in the transformer for elementwise subtractions and divisions, so instead of converting back and forth between Breeze and flink.ml.math.Vector I keep them as breeze.linalg.Vector. Can I perform the same operations with flink.ml.math.Vector, or do you believe it would be better to perform the conversions (to/from Breeze vectors) inside the functions?
[GitHub] flink pull request: [FLINK-1844] [ml] Add Normaliser to ML library
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/798#discussion_r31924947 --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/MinMaxScaler.scala --- @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.ml.preprocessing + +import breeze.linalg +import breeze.linalg.{max, min} +import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.api.scala._ +import org.apache.flink.ml._ +import org.apache.flink.ml.common.{LabeledVector, Parameter, ParameterMap} +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{BreezeVectorConverter, Vector} +import org.apache.flink.ml.pipeline.{FitOperation, TransformOperation, Transformer} +import org.apache.flink.ml.preprocessing.MinMaxScaler.{Max, Min} + +import scala.reflect.ClassTag + +/** Scales observations, so that all features are in a user-specified range. + * By default for [[MinMaxScaler]] transformer range = [0,1]. + * + * This transformer takes a subtype of [[Vector]] of values and maps it to a + * scaled subtype of [[Vector]] such that each feature lies between a user-specified range. 
+ * + * This transformer can be prepended to all [[Transformer]] and + * [[org.apache.flink.ml.pipeline.Predictor]] implementations which expect as input a subtype + * of [[Vector]]. + * + * @example + * {{{ + * val trainingDS: DataSet[Vector] = env.fromCollection(data) + * val transformer = MinMaxScaler().setMin(-1.0) + * + * transformer.fit(trainingDS) + * val transformedDS = transformer.transform(trainingDS) + * }}} + * + * =Parameters= + * + * - [[Min]]: The minimum value of the range of the transformed data set; by default equal to 0 + * - [[Max]]: The maximum value of the range of the transformed data set; by default + * equal to 1 + */ +class MinMaxScaler extends Transformer[MinMaxScaler] { + + var metricsOption: Option[DataSet[(linalg.Vector[Double], linalg.Vector[Double])]] = None --- End diff -- Hey, if the {{metricsOption}} field is package private then my tests will fail, because I am also testing in the {{MinMaxScalerITSuite}} whether the min and max of each feature have been calculated correctly.
[GitHub] flink pull request: [FLINK-1844] [ml] Add Normaliser to ML library
Github user fobeligi commented on a diff in the pull request: https://github.com/apache/flink/pull/798#discussion_r31927083 --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/MinMaxScaler.scala --- @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.ml.preprocessing + +import breeze.linalg +import breeze.linalg.{max, min} +import org.apache.flink.api.common.typeinfo.TypeInformation +import org.apache.flink.api.scala._ +import org.apache.flink.ml._ +import org.apache.flink.ml.common.{LabeledVector, Parameter, ParameterMap} +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{BreezeVectorConverter, Vector} +import org.apache.flink.ml.pipeline.{FitOperation, TransformOperation, Transformer} +import org.apache.flink.ml.preprocessing.MinMaxScaler.{Max, Min} + +import scala.reflect.ClassTag + +/** Scales observations, so that all features are in a user-specified range. + * By default for [[MinMaxScaler]] transformer range = [0,1]. + * + * This transformer takes a subtype of [[Vector]] of values and maps it to a + * scaled subtype of [[Vector]] such that each feature lies between a user-specified range. 
+ * + * This transformer can be prepended to all [[Transformer]] and + * [[org.apache.flink.ml.pipeline.Predictor]] implementations which expect as input a subtype + * of [[Vector]]. + * + * @example + * {{{ + * val trainingDS: DataSet[Vector] = env.fromCollection(data) + * val transformer = MinMaxScaler().setMin(-1.0) + * + * transformer.fit(trainingDS) + * val transformedDS = transformer.transform(trainingDS) + * }}} + * + * =Parameters= + * + * - [[Min]]: The minimum value of the range of the transformed data set; by default equal to 0 + * - [[Max]]: The maximum value of the range of the transformed data set; by default + * equal to 1 + */ +class MinMaxScaler extends Transformer[MinMaxScaler] { + + var metricsOption: Option[DataSet[(linalg.Vector[Double], linalg.Vector[Double])]] = None --- End diff -- Yes ^^
[GitHub] flink pull request: [Flink 1844] Add Normaliser to ML library
GitHub user fobeligi opened a pull request: https://github.com/apache/flink/pull/798 [Flink 1844] Add Normaliser to ML library Adds a MinMaxScaler to the ML preprocessing package. MinMax scaler scales the values to a user-specified range. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fobeligi/incubator-flink FLINK-1844 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/798.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #798 commit 802b9da07a2c3f7c055b4c024aaecbbe647db1cd Author: fobeligi <faybeligia...@gmail.com> Date: 2015-06-05T21:12:43Z [FLINK-1844] Add MinMaxScaler implementation in the preprocessing package, tests for the corresponding functionality, and documentation. commit e639185108f9bda253e296bae4c6c4269a30d1d0 Author: fobeligi <faybeligia...@gmail.com> Date: 2015-06-05T22:12:33Z [FLINK-1844] Change second test to use LabeledVectors instead of Vectors
[GitHub] flink pull request: Ml branch
GitHub user fobeligi opened a pull request: https://github.com/apache/flink/pull/579 Ml branch Implementation of StandardScaler and respective tests for the FLINK-1809 JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fobeligi/incubator-flink ml-branch Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/579.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #579 commit 96cb2f5676e945d7bc414987934e5c854de70584 Author: fobeligi <faybeligia...@gmail.com> Date: 2015-04-01T20:31:38Z [FLINK-1809] Add Preprocessing package and Standardizer to ML-library commit 2e8333b74e08f0c48bb58d36f2915a9ad832c456 Author: fobeligi <faybeligia...@gmail.com> Date: 2015-04-03T16:52:35Z [FLINK-1809] Change implementation to use Breeze.linalg library and add tests for Standardizer