[jira] [Commented] (FLINK-1664) Forbid sorting on POJOs
[ https://issues.apache.org/jira/browse/FLINK-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384904#comment-14384904 ]

ASF GitHub Bot commented on FLINK-1664:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/541

    [FLINK-1664] Adds checks if selected sort key is sortable

    - Adds checks if a sort key can actually be sorted.
    - The POJO type is defined as non-sortable, because an order would depend on the undefined order of POJO fields.
    - Adds a few more tests for API sort functions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink sortOnPojo

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/541.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #541

----
commit e26d934eb1b2c14298900c53e8413487ce43a17a
Author: Fabian Hueske
Date:   2015-03-27T20:37:59Z

    [FLINK-1664] Adds check if a selected sort key is sortable

> Forbid sorting on POJOs
> -----------------------
>
>                 Key: FLINK-1664
>                 URL: https://issues.apache.org/jira/browse/FLINK-1664
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 0.8.0, 0.9
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>
> Flink's groupSort, partitionSort, and outputSort operators allow sorting partitions or groups of a DataSet.
> If the sort is defined on a POJO field, the sort order is not well defined. Internally, the POJO is recursively decomposed into atomic fields (primitives or generic types) and sorted by sorting these atomic fields. However, the order of these atomic fields is not well defined (I believe it is the lexicographic order of the POJO's member names).
> IMO, the best approach is to forbid sorting on POJO types for now. Instead, it is always possible to select the nested fields of the POJO that should be used for sorting. Later we can relax this restriction.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
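The ambiguity the issue describes can be illustrated outside of Flink: when a composite value is decomposed into its atomic fields, the result of a sort depends entirely on which field sequence is compared first. The following is a minimal sketch (the `Pojo` class and both comparators are made up for illustration, not Flink code):

```java
import java.util.Comparator;

public class PojoSortOrder {
    // A simple POJO with two int fields; which field acts as the "first" sort
    // key is exactly what is undefined when sorting on the POJO as a whole.
    public static class Pojo {
        public final int a;
        public final int b;
        public Pojo(int a, int b) { this.a = a; this.b = b; }
    }

    public static void main(String[] args) {
        Pojo x = new Pojo(1, 2);
        Pojo y = new Pojo(2, 1);

        // Decomposition order (a, b): x sorts before y.
        Comparator<Pojo> byAThenB =
            Comparator.<Pojo>comparingInt(p -> p.a).thenComparingInt(p -> p.b);
        // Decomposition order (b, a): y sorts before x.
        Comparator<Pojo> byBThenA =
            Comparator.<Pojo>comparingInt(p -> p.b).thenComparingInt(p -> p.a);

        // The two field orders produce opposite results, so a sort defined on
        // the whole POJO is only meaningful once the field order is fixed --
        // hence the check that forbids it and asks for explicit nested fields.
        System.out.println(byAThenB.compare(x, y) < 0); // true
        System.out.println(byBThenA.compare(x, y) > 0); // true
    }
}
```

Selecting an explicit nested field, as the issue suggests, removes the ambiguity because it pins down one concrete comparator.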
[GitHub] flink pull request: [FLINK-1664] Adds checks if selected sort key ...
GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/541

    [FLINK-1664] Adds checks if selected sort key is sortable

    - Adds checks if a sort key can actually be sorted.
    - The POJO type is defined as non-sortable, because an order would depend on the undefined order of POJO fields.
    - Adds a few more tests for API sort functions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink sortOnPojo

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/541.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #541

----
commit e26d934eb1b2c14298900c53e8413487ce43a17a
Author: Fabian Hueske
Date:   2015-03-27T20:37:59Z

    [FLINK-1664] Adds check if a selected sort key is sortable

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[GitHub] flink pull request: [FLINK-1794] [test-utils] Adds test base for s...
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/540#discussion_r27329656

    --- Diff: flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
    @@ -68,7 +68,7 @@
         private final Optimizer compiler; // the compiler to compile the jobs

    -    private boolean printStatusDuringExecution = false;
    --- End diff --

    Is this an intentional change? Doesn't it change the behavior of all submission clients based on this class?
[jira] [Commented] (FLINK-1794) Add test base for scalatest and adapt flink-ml test cases
[ https://issues.apache.org/jira/browse/FLINK-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384602#comment-14384602 ]

ASF GitHub Bot commented on FLINK-1794:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/540#discussion_r27329656

    --- Diff: flink-clients/src/main/java/org/apache/flink/client/program/Client.java ---
    @@ -68,7 +68,7 @@
         private final Optimizer compiler; // the compiler to compile the jobs

    -    private boolean printStatusDuringExecution = false;
    --- End diff --

    Is this an intentional change? Doesn't it change the behavior of all submission clients based on this class?

> Add test base for scalatest and adapt flink-ml test cases
> ---------------------------------------------------------
>
>                 Key: FLINK-1794
>                 URL: https://issues.apache.org/jira/browse/FLINK-1794
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>
> Currently, the flink-ml test cases use the standard {{ExecutionEnvironment}} which can cause problems in parallel test executions as they happen on Travis. For these tests it would be helpful to have an appropriate Scala test base which instantiates a {{ForkableFlinkMiniCluster}} and sets the {{ExecutionEnvironment}} appropriately.
[jira] [Commented] (FLINK-703) Use complete element as join key.
[ https://issues.apache.org/jira/browse/FLINK-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384529#comment-14384529 ]

Fabian Hueske commented on FLINK-703:
-------------------------------------

Hi, that's a very good example and shows that the description of this issue is incomplete ;-)

Key expressions (such as {{"*"}} and {{"_"}}) are supported for Pojo, Tuple, and CaseClass types. If you changed your example so that you join against a DataSet of a primitive type, it would no longer work. Since primitive types are not composite, the only possible key expression is the wildcard, which selects the full type.

We handle this case in several places of the API as a special case. See for example [DataSink.java|https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/operators/DataSink.java] in the function {{sortLocalOutput(String fieldExpression, Order order)}} (around line 180).

So this issue would mean adding similar functionality to the key definition functions of join, coGroup, and grouping. For that, we need to check if the type is a valid key type ({{TypeInformation.isKeyType()}}) and, if the type is an atomic type, set the key to int[]{0}.

> Use complete element as join key.
> ---------------------------------
>
>                 Key: FLINK-703
>                 URL: https://issues.apache.org/jira/browse/FLINK-703
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: GitHub Import
>            Assignee: Chiwan Park
>            Priority: Trivial
>              Labels: github-import
>             Fix For: pre-apache
>
> In some situations such as semi-joins it could make sense to use a complete element as join key. Currently this can be done using a key-selector function, but we could offer a shortcut for that. This is not an urgent issue, but might be helpful.
> Imported from GitHub
> Url: https://github.com/stratosphere/stratosphere/issues/703
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, user satisfaction,
> Milestone: Release 0.6 (unplanned)
> Created at: Thu Apr 17 23:40:00 CEST 2014
> State: open
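The special-casing described in the comment could be sketched roughly as follows. This is a simplified illustration, not the actual Flink code: `TypeInfo` and `resolveKey` are made-up stand-ins for `TypeInformation` and the key-resolution logic of join/coGroup/grouping.

```java
public class KeyExpressionHelper {
    // Hypothetical, minimal stand-in for Flink's TypeInformation.
    public interface TypeInfo {
        boolean isComposite();
        boolean isKeyType();
    }

    /** Resolves a key expression to flat field positions. For an atomic
     *  (non-composite) type, the only valid expression is the wildcard,
     *  which selects the full type as the key. */
    public static int[] resolveKey(TypeInfo type, String expression) {
        if (!type.isComposite()) {
            if (!"*".equals(expression) && !"_".equals(expression)) {
                throw new IllegalArgumentException(
                    "Only a wildcard expression can select an atomic type as key: " + expression);
            }
            if (!type.isKeyType()) {
                throw new IllegalArgumentException("Type cannot be used as key.");
            }
            return new int[]{0}; // the full value is the key
        }
        // Composite types (Pojo, Tuple, CaseClass) would be flattened
        // recursively here; omitted in this sketch.
        throw new UnsupportedOperationException("composite types not covered in this sketch");
    }

    public static void main(String[] args) {
        TypeInfo atomic = new TypeInfo() {
            public boolean isComposite() { return false; }
            public boolean isKeyType() { return true; }
        };
        int[] key = resolveKey(atomic, "*");
        System.out.println(key.length == 1 && key[0] == 0); // true
    }
}
```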
[jira] [Commented] (FLINK-1796) Local mode TaskManager should have a process reaper
[ https://issues.apache.org/jira/browse/FLINK-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384458#comment-14384458 ]

Stephan Ewen commented on FLINK-1796:
-------------------------------------

I have a fix for this coming up...

> Local mode TaskManager should have a process reaper
> ---------------------------------------------------
>
>                 Key: FLINK-1796
>                 URL: https://issues.apache.org/jira/browse/FLINK-1796
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.9
>
> We use process reaper actors (a typical Akka design pattern) to shut down the JVM processes when the core actors die, as this is currently unrecoverable.
> The local mode uses the process reaper only for the JobManager actor, not for the TaskManager actor. This may lead to dead stale JVMs on critical TaskManager errors and makes debugging harder.
[jira] [Created] (FLINK-1796) Local mode TaskManager should have a process reaper
Stephan Ewen created FLINK-1796:
-----------------------------------

             Summary: Local mode TaskManager should have a process reaper
                 Key: FLINK-1796
                 URL: https://issues.apache.org/jira/browse/FLINK-1796
             Project: Flink
          Issue Type: Bug
          Components: JobManager
    Affects Versions: 0.9
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 0.9

We use process reaper actors (a typical Akka design pattern) to shut down the JVM processes when the core actors die, as this is currently unrecoverable.

The local mode uses the process reaper only for the JobManager actor, not for the TaskManager actor. This may lead to dead stale JVMs on critical TaskManager errors and makes debugging harder.
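The reaper pattern mentioned in the issue is simple in outline: a watcher observes a critical component and terminates the process once that component dies, since the failure is unrecoverable. The sketch below illustrates the idea in plain Java threads rather than Flink's Akka-based implementation; the injectable `onDeath` callback (which in production would call `System.exit`) is an assumption made so the sketch can be exercised without killing the JVM.

```java
public class ProcessReaper {
    private final Runnable onDeath; // in production: () -> System.exit(-1)

    public ProcessReaper(Runnable onDeath) {
        this.onDeath = onDeath;
    }

    /** Starts a daemon watcher that fires the death callback once the
     *  watched thread (standing in for a critical actor) terminates. */
    public Thread watch(Thread critical) {
        Thread watcher = new Thread(() -> {
            try {
                critical.join(); // blocks until the watched component dies
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            onDeath.run();
        });
        watcher.setDaemon(true);
        watcher.start();
        return watcher;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] fired = {false};
        ProcessReaper reaper = new ProcessReaper(() -> fired[0] = true);
        Thread critical = new Thread(() -> {}); // terminates immediately
        critical.start();
        reaper.watch(critical).join();
        System.out.println(fired[0]); // true
    }
}
```

The bug in the issue is then simply that local mode wires such a watcher to the JobManager actor but not to the TaskManager actor.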
[jira] [Commented] (FLINK-1795) Solution set allows duplicates upon construction.
[ https://issues.apache.org/jira/browse/FLINK-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384456#comment-14384456 ]

Stephan Ewen commented on FLINK-1795:
-------------------------------------

I have a fix and test coming up...

> Solution set allows duplicates upon construction.
> -------------------------------------------------
>
>                 Key: FLINK-1795
>                 URL: https://issues.apache.org/jira/browse/FLINK-1795
>             Project: Flink
>          Issue Type: Bug
>          Components: Iterations
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.9
>
> The solution set identifies entries uniquely by key.
> During construction, it does not eliminate duplicates. The duplicates do not get updated during the iterations (since only the first match is considered), but are contained in the final result.
> This contradicts the definition of the solution set. It should not contain duplicates to begin with.
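The fix implied by the issue is to drop duplicate keys while the solution set is built, so that each key exists exactly once before the first iteration. A minimal first-wins construction could look like this (an illustrative sketch with a plain map, not Flink's actual solution-set hash table):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SolutionSetBuild {
    /** Builds the solution set from parallel (key, value) arrays, keeping only
     *  the first value per key. Dropping later duplicates matches the lookup
     *  behavior during the iterations, where only the first match per key is
     *  ever considered. */
    public static Map<Integer, String> build(int[] keys, String[] values) {
        Map<Integer, String> solutionSet = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            solutionSet.putIfAbsent(keys[i], values[i]);
        }
        return solutionSet;
    }

    public static void main(String[] args) {
        Map<Integer, String> set = build(new int[]{1, 2, 1}, new String[]{"a", "b", "c"});
        System.out.println(set.size()); // 2: the duplicate key 1 was dropped
        System.out.println(set.get(1)); // a: the first entry won
    }
}
```

With duplicates eliminated at construction time, the final result can no longer contain stale duplicate entries that were never updated.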
[jira] [Commented] (FLINK-1754) Deadlock in job execution
[ https://issues.apache.org/jira/browse/FLINK-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384450#comment-14384450 ]

Stephan Ewen commented on FLINK-1754:
-------------------------------------

Is this issue resolved?

The issue with deadlocks in 0.8.1 is (I think) that the runtime does not obey the assumptions from the optimizer. The hash table building requires (for some reason) data availability on the probe side as well.

> Deadlock in job execution
> -------------------------
>
>                 Key: FLINK-1754
>                 URL: https://issues.apache.org/jira/browse/FLINK-1754
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>            Reporter: Sebastian Kruse
>
> I have encountered a reproducible deadlock in the execution of one of my jobs. The part of the plan where this happens is the following (generic type parameters were lost in the archive and are elided as <...>):
> {code:java}
> /** Performs the reduction via creating transitive INDs and removing them from the original IND set. */
> private DataSet<...> calculateTransitiveReduction1(DataSet<...> inclusionDependencies) {
>
>     // Concatenate INDs (only one hop).
>     DataSet<...> transitiveInds = inclusionDependencies
>             .flatMap(new SplitInds())
>             .joinWithTiny(inclusionDependencies)
>             .where(1).equalTo(0)
>             .with(new ConcatenateInds());
>
>     // Remove the concatenated INDs to come up with a transitive reduction of the INDs.
>     return inclusionDependencies
>             .coGroup(transitiveInds)
>             .where(0).equalTo(0)
>             .with(new RemoveTransitiveInds());
> }
> {code}
> Seemingly, the flatmap operator waits infinitely for a free buffer to write on.
[GitHub] flink pull request: Fix issue where Windows paths were not recogni...
Github user balidani commented on the pull request:

    https://github.com/apache/flink/pull/491#issuecomment-87062926

    So most of the jobs passed, except for PROFILE="-Dhadoop.version=2.6.0 -Dscala-2.11", where it says "No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself."
[jira] [Resolved] (FLINK-1781) Quickstarts broken due to Scala Version Variables
[ https://issues.apache.org/jira/browse/FLINK-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-1781.
---------------------------------
    Resolution: Fixed

Fixed in 1aba942c1fd7c4dbf1c4d4f30602d69c2cb3540e

> Quickstarts broken due to Scala Version Variables
> -------------------------------------------------
>
>                 Key: FLINK-1781
>                 URL: https://issues.apache.org/jira/browse/FLINK-1781
>             Project: Flink
>          Issue Type: Bug
>          Components: Quickstarts
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Blocker
>             Fix For: 0.9
>
> The quickstart archetype resources refer to the Scala version variables. When creating a Maven project standalone, these variables are not defined, and the pom is invalid.
[jira] [Resolved] (FLINK-1783) Quickstart shading should not create shaded jar and dependency-reduced pom
[ https://issues.apache.org/jira/browse/FLINK-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-1783.
---------------------------------
    Resolution: Fixed
      Assignee: Stephan Ewen

Fixed in d11e0910880a48bbd5c452e4c76ffdca000f5614

> Quickstart shading should not create shaded jar and dependency-reduced pom
> --------------------------------------------------------------------------
>
>                 Key: FLINK-1783
>                 URL: https://issues.apache.org/jira/browse/FLINK-1783
>             Project: Flink
>          Issue Type: Improvement
>          Components: Quickstarts
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.9
[jira] [Created] (FLINK-1795) Solution set allows duplicates upon construction.
Stephan Ewen created FLINK-1795:
-----------------------------------

             Summary: Solution set allows duplicates upon construction.
                 Key: FLINK-1795
                 URL: https://issues.apache.org/jira/browse/FLINK-1795
             Project: Flink
          Issue Type: Bug
          Components: Iterations
    Affects Versions: 0.9
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 0.9

The solution set identifies entries uniquely by key.

During construction, it does not eliminate duplicates. The duplicates do not get updated during the iterations (since only the first match is considered), but are contained in the final result.

This contradicts the definition of the solution set. It should not contain duplicates to begin with.
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384410#comment-14384410 ]

ASF GitHub Bot commented on FLINK-1718:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/539#issuecomment-87060638

    I combined some of the sanity checks of the matrix/vector accessors.

> Add sparse vector and sparse matrix types to machine learning library
> ----------------------------------------------------------------------
>
>                 Key: FLINK-1718
>                 URL: https://issues.apache.org/jira/browse/FLINK-1718
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>              Labels: ML
>
> Currently, the machine learning library only supports dense matrices and dense vectors. For future algorithms it would be beneficial to also support sparse vectors and matrices.
> I'd propose to use the compressed sparse column (CSC) representation, because it allows rather efficient operations compared to a map-backed sparse matrix/vector implementation. Furthermore, this is also the format the Breeze library expects for sparse matrices/vectors. Thus, it is easy to convert to a sparse Breeze data structure, which provides us with many linear algebra operations. BIDMat [1] uses the same data representation.
> Resources:
> [1] [https://github.com/BIDData/BIDMat]
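For reference, the CSC layout proposed in the issue stores a matrix as three arrays: the non-zero values in column-major order, their row indices, and per-column offsets into those arrays. The following Java sketch shows the layout and element lookup; it is an illustration of the format only, not the flink-ml implementation:

```java
public class CscMatrix {
    public final int numRows, numCols;
    public final int[] colPtrs;     // length numCols + 1; column j occupies [colPtrs[j], colPtrs[j+1])
    public final int[] rowIndices;  // row index of each stored non-zero
    public final double[] data;     // non-zero values in column-major order

    public CscMatrix(int numRows, int numCols, int[] colPtrs, int[] rowIndices, double[] data) {
        this.numRows = numRows; this.numCols = numCols;
        this.colPtrs = colPtrs; this.rowIndices = rowIndices; this.data = data;
    }

    /** Returns the entry at (row, col). Entries that are not stored are
     *  structural zeros, so the scan falls through to 0.0. */
    public double get(int row, int col) {
        for (int i = colPtrs[col]; i < colPtrs[col + 1]; i++) {
            if (rowIndices[i] == row) {
                return data[i];
            }
        }
        return 0.0;
    }

    public static void main(String[] args) {
        // The 3x3 matrix [[1,0,2],[0,0,3],[0,4,0]] in CSC form:
        // column 0 holds (row 0, 1.0); column 1 holds (row 2, 4.0);
        // column 2 holds (row 0, 2.0) and (row 1, 3.0).
        CscMatrix m = new CscMatrix(3, 3,
            new int[]{0, 1, 2, 4}, new int[]{0, 2, 0, 1}, new double[]{1, 4, 2, 3});
        System.out.println(m.get(2, 1)); // 4.0
        System.out.println(m.get(1, 1)); // 0.0 (structural zero)
    }
}
```

Keeping the row indices of each column sorted (as the PR's Scala code does with `scala.util.Sorting`) additionally allows a binary search instead of the linear scan above.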
[GitHub] flink pull request: [FLINK-1718] Adds sparse matrix and sparse vec...
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/539#issuecomment-87060638

    I combined some of the sanity checks of the matrix/vector accessors.
[GitHub] flink pull request: [FLINK-1718] Adds sparse matrix and sparse vec...
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/539#discussion_r27321365

--- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseMatrix.scala ---
@@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.math
+
+import scala.util.Sorting
+
+/** Sparse matrix using the compressed sparse column (CSC) representation.
+  *
+  * More details concerning the compressed sparse column (CSC) representation can be found
+  * [http://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_.28CSC_or_CCS.29].
+  *
+  * @param numRows Number of rows
+  * @param numCols Number of columns
+  * @param rowIndices Array containing the row indices of non-zero entries
+  * @param colPtrs Array containing the starting offsets in data for each column
+  * @param data Array containing the non-zero entries in column-major order
+  */
+class SparseMatrix(
+    val numRows: Int,
+    val numCols: Int,
+    val rowIndices: Array[Int],
+    val colPtrs: Array[Int],
+    val data: Array[Double])
+  extends Matrix {
+
+  /** Element wise access function
+    *
+    * @param row row index
+    * @param col column index
+    * @return matrix entry at (row, col)
+    */
+  override def apply(row: Int, col: Int): Double = {
+
+    val index = locate(row, col)
+
+    if(index < 0){
+      0
+    } else {
+      data(index)
+    }
+  }
+
+  def toDenseMatrix: DenseMatrix = {
+    val result = DenseMatrix.zeros(numRows, numCols)
+
+    for(row <- 0 until numRows; col <- 0 until numCols) {
+      result(row, col) = apply(row, col)
+    }
+
+    result
+  }
+
+  /** Element wise update function
+    *
+    * @param row row index
+    * @param col column index
+    * @param value value to set at (row, col)
+    */
+  override def update(row: Int, col: Int, value: Double): Unit = {
+    val index = locate(row, col)
+
+    if(index < 0) {
+      throw new IllegalArgumentException("Cannot update zero value of sparse matrix at index " +
+        s"($row, $col)")
+    } else {
+      data(index) = value
+    }
+  }
+
+  override def toString: String = {
+    val result = StringBuilder.newBuilder
+
+    result.append(s"SparseMatrix($numRows, $numCols)\n")
+
+    var columnIndex = 0
+
+    val fieldWidth = math.max(numRows, numCols).toString.length
+    val valueFieldWidth = data.map(_.toString.length).max + 2
+
+    for(index <- 0 until colPtrs.last) {
+      while(colPtrs(columnIndex + 1) <= index){
+        columnIndex += 1
+      }
+
+      val rowStr = rowIndices(index).toString
+      val columnStr = columnIndex.toString
+      val valueStr = data(index).toString
+
+      result.append("(" + " " * (fieldWidth - rowStr.length) + rowStr + "," +
+        " " * (fieldWidth - columnStr.length) + columnStr + ")")
+      result.append(" " * (valueFieldWidth - valueStr.length) + valueStr)
+      result.append("\n")
+    }
+
+    result.toString
+  }
+
+  private def locate(row: Int, col: Int): Int = {
+    require(0 <= row && row < numRows, s"Row $row is out of bounds [0, $numRows).")
+    require(0 <= col && col < numCols, s"Col $col is out of bounds [0, $numCols).")
--- End diff --

    Well but actually the string interpolations do not cost anything, because the message parameter is called by-name. Meaning that it is only evaluated in case of a false requirement.
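The by-name evaluation discussed in the comment has a direct Java analogue: passing the message as a `Supplier`, so that the (potentially costly) interpolated string is only built when the check actually fails. A small sketch of the idea (the `LazyRequire.require` helper is hypothetical, mirroring the behavior of Scala's `require(cond, => msg)`):

```java
import java.util.function.Supplier;

public class LazyRequire {
    /** The message supplier is only invoked when the condition is false,
     *  just like the by-name message parameter of Scala's require. */
    public static void require(boolean condition, Supplier<String> message) {
        if (!condition) {
            throw new IllegalArgumentException(message.get());
        }
    }

    public static void main(String[] args) {
        int[] evaluations = {0};
        // Passing check: the supplier never runs, so no formatting cost is paid.
        require(true, () -> { evaluations[0]++; return "unused message"; });
        System.out.println(evaluations[0]); // 0
    }
}
```

This is why the interpolated bounds messages in `locate` are free on the fast path: the string concatenation is deferred into the message thunk.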
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384372#comment-14384372 ]

ASF GitHub Bot commented on FLINK-1718:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/539#discussion_r27321365

    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseMatrix.scala ---
    +  private def locate(row: Int, col: Int): Int = {
    +    require(0 <= row && row < numRows, s"Row $row is out of bounds [0, $numRows).")
    +    require(0 <= col && col < numCols, s"Col $col is out of bounds [0, $numCols).")
    --- End diff --

    Well but actually the string interpolations do not cost anything, because the message parameter is called by-name. Meaning that it is only evaluated in case of a false requirement.

> Add sparse vector and sparse matrix types to machine learning library
[GitHub] flink pull request: [FLINK-1718] Adds sparse matrix and sparse vec...
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/539#discussion_r27321156

    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseMatrix.scala ---
    +  private def locate(row: Int, col: Int): Int = {
    +    require(0 <= row && row < numRows, s"Row $row is out of bounds [0, $numRows).")
    +    require(0 <= col && col < numCols, s"Col $col is out of bounds [0, $numCols).")
    --- End diff --

    That's right. I'll probably remove the checks and expect the user to know what he's doing.
[jira] [Commented] (FLINK-1794) Add test base for scalatest and adapt flink-ml test cases
[ https://issues.apache.org/jira/browse/FLINK-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384362#comment-14384362 ]

ASF GitHub Bot commented on FLINK-1794:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/540

    [FLINK-1794] [test-utils] Adds test base for scala tests and adapts existing flink-ml tests

    The test base for scala tests is implemented as a trait. The trait expects ```Suite``` as a self-type. Mixing this trait into a test suite will start a ```ForkableFlinkMiniCluster``` upon class loading and stop the respective cluster after all tests have been run.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixClusterMLITCases

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/540.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #540

----
commit 25542ddc1c88a92f5b50df7d3384ad2820560510
Author: Till Rohrmann
Date:   2015-03-27T18:40:42Z

    [FLINK-1794] [test-utils] Adds test base for scala tests and adapts existing flink-ml tests

> Add test base for scalatest and adapt flink-ml test cases
> ---------------------------------------------------------
>
>                 Key: FLINK-1794
>                 URL: https://issues.apache.org/jira/browse/FLINK-1794
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>
> Currently, the flink-ml test cases use the standard {{ExecutionEnvironment}} which can cause problems in parallel test executions as they happen on Travis. For these tests it would be helpful to have an appropriate Scala test base which instantiates a {{ForkableFlinkMiniCluster}} and sets the {{ExecutionEnvironment}} appropriately.
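The lifecycle the PR describes (start a cluster once per suite, stop it after all tests have run) can be sketched as a plain-Java base class. This only illustrates the pattern: the actual PR implements it as a Scala trait with a `Suite` self-type, and `TestCluster` below is a made-up stand-in for `ForkableFlinkMiniCluster`.

```java
public abstract class ClusterTestBase {
    /** Made-up stand-in for a ForkableFlinkMiniCluster. */
    public static class TestCluster {
        public boolean running = false;
        public void start() { running = true; }
        public void stop() { running = false; }
    }

    // Shared by all tests of a suite; started once, stopped once.
    public static final TestCluster CLUSTER = new TestCluster();

    // With JUnit these would be @BeforeClass / @AfterClass; the Scala trait
    // achieves the same via initialization on class loading and an
    // after-all-tests hook.
    public static void startCluster() { CLUSTER.start(); }
    public static void stopCluster() { CLUSTER.stop(); }

    public static void main(String[] args) {
        startCluster();
        System.out.println(CLUSTER.running); // true
        stopCluster();
        System.out.println(CLUSTER.running); // false
    }
}
```

Routing every test through one shared cluster avoids the parallel-execution clashes on Travis that the issue describes, since tests no longer each grab the default `ExecutionEnvironment`.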
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384365#comment-14384365 ] ASF GitHub Bot commented on FLINK-1718: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/539#discussion_r27321156

--- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseMatrix.scala ---
@@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.math
+
+import scala.util.Sorting
+
+/** Sparse matrix using the compressed sparse column (CSC) representation.
+  *
+  * More details concerning the compressed sparse column (CSC) representation can be found
+  * [http://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_.28CSC_or_CCS.29].
+  *
+  * @param numRows Number of rows
+  * @param numCols Number of columns
+  * @param rowIndices Array containing the row indices of non-zero entries
+  * @param colPtrs Array containing the starting offsets in data for each column
+  * @param data Array containing the non-zero entries in column-major order
+  */
+class SparseMatrix(
+    val numRows: Int,
+    val numCols: Int,
+    val rowIndices: Array[Int],
+    val colPtrs: Array[Int],
+    val data: Array[Double])
+  extends Matrix {
+
+  /** Element wise access function
+    *
+    * @param row row index
+    * @param col column index
+    * @return matrix entry at (row, col)
+    */
+  override def apply(row: Int, col: Int): Double = {
+
+    val index = locate(row, col)
+
+    if(index < 0){
+      0
+    } else {
+      data(index)
+    }
+  }
+
+  def toDenseMatrix: DenseMatrix = {
+    val result = DenseMatrix.zeros(numRows, numCols)
+
+    for(row <- 0 until numRows; col <- 0 until numCols) {
+      result(row, col) = apply(row, col)
+    }
+
+    result
+  }
+
+  /** Element wise update function
+    *
+    * @param row row index
+    * @param col column index
+    * @param value value to set at (row, col)
+    */
+  override def update(row: Int, col: Int, value: Double): Unit = {
+    val index = locate(row, col)
+
+    if(index < 0) {
+      throw new IllegalArgumentException("Cannot update zero value of sparse matrix at index " +
+        s"($row, $col)")
+    } else {
+      data(index) = value
+    }
+  }
+
+  override def toString: String = {
+    val result = StringBuilder.newBuilder
+
+    result.append(s"SparseMatrix($numRows, $numCols)\n")
+
+    var columnIndex = 0
+
+    val fieldWidth = math.max(numRows, numCols).toString.length
+    val valueFieldWidth = data.map(_.toString.length).max + 2
+
+    for(index <- 0 until colPtrs.last) {
+      while(colPtrs(columnIndex + 1) <= index){
+        columnIndex += 1
+      }
+
+      val rowStr = rowIndices(index).toString
+      val columnStr = columnIndex.toString
+      val valueStr = data(index).toString
+
+      result.append("(" + " " * (fieldWidth - rowStr.length) + rowStr + "," +
+        " " * (fieldWidth - columnStr.length) + columnStr + ")")
+      result.append(" " * (valueFieldWidth - valueStr.length) + valueStr)
+      result.append("\n")
+    }
+
+    result.toString
+  }
+
+  private def locate(row: Int, col: Int): Int = {
+    require(0 <= row && row < numRows, s"Row $row is out of bounds [0, $numRows).")
+    require(0 <= col && col < numCols, s"Col $col is out of bounds [0, $numCols).")
--- End diff --

That's right. I'll probably remove the checks and expect the user to know what he's doing.

> Add sparse vector and sparse matrix types to machine learning library
> -
>
>
[jira] [Resolved] (FLINK-1790) Remove the redundant import code
[ https://issues.apache.org/jira/browse/FLINK-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Saputra resolved FLINK-1790. -- Resolution: Fixed Fix Version/s: 0.9 PR merged to master. Thanks! > Remove the redundant import code > > > Key: FLINK-1790 > URL: https://issues.apache.org/jira/browse/FLINK-1790 > Project: Flink > Issue Type: Improvement > Components: TaskManager >Affects Versions: master >Reporter: Sibao Hong >Assignee: Sibao Hong >Priority: Minor > Fix For: 0.9 > > > Remove the redundant import code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1790) Remove the redundant import code
[ https://issues.apache.org/jira/browse/FLINK-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384325#comment-14384325 ] ASF GitHub Bot commented on FLINK-1790: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/538 > Remove the redundant import code > > > Key: FLINK-1790 > URL: https://issues.apache.org/jira/browse/FLINK-1790 > Project: Flink > Issue Type: Improvement > Components: TaskManager >Affects Versions: master >Reporter: Sibao Hong >Assignee: Sibao Hong >Priority: Minor > > Remove the redundant import code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1794) Add test base for scalatest and adapt flink-ml test cases
Till Rohrmann created FLINK-1794: Summary: Add test base for scalatest and adapt flink-ml test cases Key: FLINK-1794 URL: https://issues.apache.org/jira/browse/FLINK-1794 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann Currently, the flink-ml test cases use the standard {{ExecutionEnvironment}} which can cause problems in parallel test executions as they happen on Travis. For these tests it would be helpful to have an appropriate Scala test base which instantiates a {{ForkableFlinkMiniCluster}} and sets the {{ExecutionEnvironment}} appropriately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1782) Change Quickstart Java version to 1.7
[ https://issues.apache.org/jira/browse/FLINK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1782. - Resolution: Won't Fix Fix Version/s: (was: 0.9) > Change Quickstart Java version to 1.7 > - > > Key: FLINK-1782 > URL: https://issues.apache.org/jira/browse/FLINK-1782 > Project: Flink > Issue Type: Improvement > Components: Quickstarts >Affects Versions: 0.9 >Reporter: Stephan Ewen > > The quickstarts refer to the outdated Java 1.6 source and bin version. We > should upgrade this to 1.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1782) Change Quickstart Java version to 1.7
[ https://issues.apache.org/jira/browse/FLINK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384254#comment-14384254 ] Stephan Ewen commented on FLINK-1782: - This is currently not possible, since the builds with Java 6 fail to run the quickstart tests when the quickstart pom specifies compiler version 1.7 (Java 7) > Change Quickstart Java version to 1.7 > - > > Key: FLINK-1782 > URL: https://issues.apache.org/jira/browse/FLINK-1782 > Project: Flink > Issue Type: Improvement > Components: Quickstarts >Affects Versions: 0.9 >Reporter: Stephan Ewen > > The quickstarts refer to the outdated Java 1.6 source and bin version. We > should upgrade this to 1.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method
[ https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384210#comment-14384210 ] ASF GitHub Bot commented on FLINK-1633: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/452#issuecomment-87024583 Ha! This seems like travis is pending but it has actually finished o.O There is the following failure for hadoop1 / openjdk 6: `Failed tests: ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure:266 The program encountered a AssertionError : expected:<55> but was:<6239784976>` Shall I ignore it and merge? Thanks! > Add getTriplets() Gelly method > -- > > Key: FLINK-1633 > URL: https://issues.apache.org/jira/browse/FLINK-1633 > Project: Flink > Issue Type: New Feature > Components: Gelly >Affects Versions: 0.9 >Reporter: Vasia Kalavri >Assignee: Andra Lungu >Priority: Minor > Labels: starter > > In some graph algorithms, it is required to access the graph edges together > with the vertex values of the source and target vertices. For example, > several graph weighting schemes compute some kind of similarity weights for > edges, based on the attributes of the source and target vertices. This issue > proposes adding a convenience Gelly method that generates a DataSet of > triplets from the input graph. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
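The triplet view proposed in FLINK-1633 — each edge paired with the values of its source and target vertices — can be sketched independently of Gelly's actual API. The sketch below uses plain Java collections in place of Gelly's `DataSet` joins; all class and method names are illustrative assumptions, not Gelly's interface:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Conceptual sketch of getTriplets(): join each edge with the values of its
// endpoint vertices so that, e.g., an edge-weighting scheme can read both.
// In Gelly this would be two joins on vertex id; here it is a map lookup.
class Triplets {
    static class Edge {
        final long src, trg;
        Edge(long src, long trg) { this.src = src; this.trg = trg; }
    }

    static class Triplet {
        final long src, trg;
        final double srcValue, trgValue;
        Triplet(long src, double srcValue, long trg, double trgValue) {
            this.src = src; this.srcValue = srcValue;
            this.trg = trg; this.trgValue = trgValue;
        }
    }

    static List<Triplet> getTriplets(Map<Long, Double> vertexValues, List<Edge> edges) {
        List<Triplet> result = new ArrayList<>();
        for (Edge e : edges) {
            result.add(new Triplet(e.src, vertexValues.get(e.src),
                                   e.trg, vertexValues.get(e.trg)));
        }
        return result;
    }
}
```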
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384092#comment-14384092 ] ASF GitHub Bot commented on FLINK-1718: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/539#issuecomment-86996068 Looks good to merge. There are some performance improvements possible. > Add sparse vector and sparse matrix types to machine learning library > - > > Key: FLINK-1718 > URL: https://issues.apache.org/jira/browse/FLINK-1718 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Till Rohrmann > Labels: ML > > Currently, the machine learning library only supports dense matrix and dense > vectors. For future algorithms it would be beneficial to also support sparse > vectors and matrices. > I'd propose to use the compressed sparse column (CSC) representation, because > it allows rather efficient operations compared to a map backed sparse > matrix/vector implementation. Furthermore, this is also the format the Breeze > library expects for sparse matrices/vectors. Thus, it is easy to convert to a > sparse breeze data structure which provides us with many linear algebra > operations. > BIDMat [1] uses the same data representation. > Resources: > [1] [https://github.com/BIDData/BIDMat] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384080#comment-14384080 ] ASF GitHub Bot commented on FLINK-1718: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/539#discussion_r27308257

--- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseMatrix.scala ---
@@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.math
+
+import scala.util.Sorting
+
+/** Sparse matrix using the compressed sparse column (CSC) representation.
+  *
+  * More details concerning the compressed sparse column (CSC) representation can be found
+  * [http://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_.28CSC_or_CCS.29].
+  *
+  * @param numRows Number of rows
+  * @param numCols Number of columns
+  * @param rowIndices Array containing the row indices of non-zero entries
+  * @param colPtrs Array containing the starting offsets in data for each column
+  * @param data Array containing the non-zero entries in column-major order
+  */
+class SparseMatrix(
+    val numRows: Int,
+    val numCols: Int,
+    val rowIndices: Array[Int],
+    val colPtrs: Array[Int],
+    val data: Array[Double])
+  extends Matrix {
+
+  /** Element wise access function
+    *
+    * @param row row index
+    * @param col column index
+    * @return matrix entry at (row, col)
+    */
+  override def apply(row: Int, col: Int): Double = {
+
+    val index = locate(row, col)
+
+    if(index < 0){
+      0
+    } else {
+      data(index)
+    }
+  }
+
+  def toDenseMatrix: DenseMatrix = {
+    val result = DenseMatrix.zeros(numRows, numCols)
+
+    for(row <- 0 until numRows; col <- 0 until numCols) {
+      result(row, col) = apply(row, col)
+    }
+
+    result
+  }
+
+  /** Element wise update function
+    *
+    * @param row row index
+    * @param col column index
+    * @param value value to set at (row, col)
+    */
+  override def update(row: Int, col: Int, value: Double): Unit = {
+    val index = locate(row, col)
+
+    if(index < 0) {
+      throw new IllegalArgumentException("Cannot update zero value of sparse matrix at index " +
+        s"($row, $col)")
+    } else {
+      data(index) = value
+    }
+  }
+
+  override def toString: String = {
+    val result = StringBuilder.newBuilder
+
+    result.append(s"SparseMatrix($numRows, $numCols)\n")
+
+    var columnIndex = 0
+
+    val fieldWidth = math.max(numRows, numCols).toString.length
+    val valueFieldWidth = data.map(_.toString.length).max + 2
+
+    for(index <- 0 until colPtrs.last) {
+      while(colPtrs(columnIndex + 1) <= index){
+        columnIndex += 1
+      }
+
+      val rowStr = rowIndices(index).toString
+      val columnStr = columnIndex.toString
+      val valueStr = data(index).toString
+
+      result.append("(" + " " * (fieldWidth - rowStr.length) + rowStr + "," +
+        " " * (fieldWidth - columnStr.length) + columnStr + ")")
+      result.append(" " * (valueFieldWidth - valueStr.length) + valueStr)
+      result.append("\n")
+    }
+
+    result.toString
+  }
+
+  private def locate(row: Int, col: Int): Int = {
+    require(0 <= row && row < numRows, s"Row $row is out of bounds [0, $numRows).")
+    require(0 <= col && col < numCols, s"Col $col is out of bounds [0, $numCols).")
--- End diff --

Many ops in this class seem to use this method. I'm not sure how expensive the string interpolation is.

> Add sparse vector and sparse matrix types to machine learning library
> -
>
>
[jira] [Commented] (FLINK-1718) Add sparse vector and sparse matrix types to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384070#comment-14384070 ] ASF GitHub Bot commented on FLINK-1718: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/539#discussion_r27307815 --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseVector.scala --- @@ -67,7 +63,25 @@ case class DenseVector(val values: Array[Double]) extends Vector { * @return Copy of the vector instance */ override def copy: Vector = { -DenseVector(values.clone()) +DenseVector(data.clone()) + } + + /** Updates the element at the given index with the provided value +* +* @param index +* @param value +*/ + override def update(index: Int, value: Double): Unit = { +require(0 <= index && index < data.length, s"Index $index is out of bounds " + --- End diff -- Might be inefficient. > Add sparse vector and sparse matrix types to machine learning library > - > > Key: FLINK-1718 > URL: https://issues.apache.org/jira/browse/FLINK-1718 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Till Rohrmann > Labels: ML > > Currently, the machine learning library only supports dense matrix and dense > vectors. For future algorithms it would be beneficial to also support sparse > vectors and matrices. > I'd propose to use the compressed sparse column (CSC) representation, because > it allows rather efficient operations compared to a map backed sparse > matrix/vector implementation. Furthermore, this is also the format the Breeze > library expects for sparse matrices/vectors. Thus, it is easy to convert to a > sparse breeze data structure which provides us with many linear algebra > operations. > BIDMat [1] uses the same data representation. > Resources: > [1] [https://github.com/BIDData/BIDMat] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
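On the efficiency concern raised in this review: Scala's `Predef.require` takes its message as a by-name parameter, so the interpolated string is only built when the check actually fails (the condition test and the closure still add some overhead on hot paths). The same deferral can be written explicitly in Java with a `Supplier` — a hedged illustration of the pattern, not Flink's code:

```java
import java.util.function.Supplier;

// Bounds check whose error message is constructed lazily: the string
// concatenation in the Supplier only runs if the condition is false,
// mirroring the by-name message of Scala's require.
class Checks {
    static void require(boolean condition, Supplier<String> message) {
        if (!condition) {
            throw new IllegalArgumentException(message.get()); // built only on failure
        }
    }

    static double get(double[] data, int index) {
        require(0 <= index && index < data.length,
                () -> "Index " + index + " is out of bounds [0, " + data.length + ")");
        return data[index];
    }
}
```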
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384021#comment-14384021 ] ASF GitHub Bot commented on FLINK-1589: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-86979441 I've updated the PR. It is now ready for review again. > Add option to pass Configuration to LocalExecutor > - > > Key: FLINK-1589 > URL: https://issues.apache.org/jira/browse/FLINK-1589 > Project: Flink > Issue Type: New Feature >Reporter: Robert Metzger >Assignee: Robert Metzger > > Right now its not possible for users to pass custom configuration values to > Flink when running it from within an IDE. > It would be very convenient to be able to create a local execution > environment that allows passing configuration files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1650. --- Resolution: Fixed Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/c9d29f26 > Suppress Akka's Netty Shutdown Errors through the log config > > > Key: FLINK-1650 > URL: https://issues.apache.org/jira/browse/FLINK-1650 > Project: Flink > Issue Type: Bug > Components: other >Affects Versions: 0.9 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 0.9 > > > I suggest to set the logging for > `org.jboss.netty.channel.DefaultChannelPipeline` to error, in order to get > rid of the misleading stack trace caused by an akka/netty hickup on shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
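The fix described in this issue amounts to a single logger-level override. In log4j 1.x properties syntax (Flink shipped a log4j configuration at the time; the exact file and location depend on the setup, so treat this as a sketch) it would look like:

```properties
# Suppress the misleading Akka/Netty shutdown stack trace (FLINK-1650):
# drop everything below ERROR from this pipeline class.
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
```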
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384009#comment-14384009 ] ASF GitHub Bot commented on FLINK-1501: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/421 > Integrate metrics library and report basic metrics to JobManager web interface > -- > > Key: FLINK-1501 > URL: https://issues.apache.org/jira/browse/FLINK-1501 > Project: Flink > Issue Type: Sub-task > Components: JobManager, TaskManager >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > Fix For: 0.9 > > > As per mailing list, the library: https://github.com/dropwizard/metrics > The goal of this task is to get the basic infrastructure in place. > Subsequent issues will integrate more features into the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384008#comment-14384008 ] ASF GitHub Bot commented on FLINK-1650: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/518 > Suppress Akka's Netty Shutdown Errors through the log config > > > Key: FLINK-1650 > URL: https://issues.apache.org/jira/browse/FLINK-1650 > Project: Flink > Issue Type: Bug > Components: other >Affects Versions: 0.9 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 0.9 > > > I suggest to set the logging for > `org.jboss.netty.channel.DefaultChannelPipeline` to error, in order to get > rid of the misleading stack trace caused by an akka/netty hickup on shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1771) Add support for submitting single jobs to a detached YARN session
[ https://issues.apache.org/jira/browse/FLINK-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1771: -- Summary: Add support for submitting single jobs to a detached YARN session (was: Add support for submitting single jobs detached YARN) > Add support for submitting single jobs to a detached YARN session > - > > Key: FLINK-1771 > URL: https://issues.apache.org/jira/browse/FLINK-1771 > Project: Flink > Issue Type: Improvement > Components: YARN Client >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > > We need tests ensuring that the processing slots are set properly when > starting Flink on YARN, in particular with the per job YARN session feature. > Also, the YARN tests for detached YARN sessions / per job yarn clusters are > polluting the local home-directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1771) Add support for submitting single jobs detached YARN
[ https://issues.apache.org/jira/browse/FLINK-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1771: -- Summary: Add support for submitting single jobs detached YARN (was: Add tests for setting the processing slots for the YARN client) > Add support for submitting single jobs detached YARN > > > Key: FLINK-1771 > URL: https://issues.apache.org/jira/browse/FLINK-1771 > Project: Flink > Issue Type: Improvement > Components: YARN Client >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > > We need tests ensuring that the processing slots are set properly when > starting Flink on YARN, in particular with the per job YARN session feature. > Also, the YARN tests for detached YARN sessions / per job yarn clusters are > polluting the local home-directory.
[jira] [Commented] (FLINK-1793) Streaming jobs can't be canceled
[ https://issues.apache.org/jira/browse/FLINK-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383878#comment-14383878 ] Márton Balassi commented on FLINK-1793: --- Just cancelled a streaming job successfully after running it for 10 minutes on 10 taskmanagers. Cancellation took around 8-10 seconds in this case (the use case is the infamous chained mapper though). > Streaming jobs can't be canceled > > > Key: FLINK-1793 > URL: https://issues.apache.org/jira/browse/FLINK-1793 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 0.9 >Reporter: Maximilian Michels > Fix For: 0.9 > > > The streaming WordCount gets stuck in the "CANCELED" state after it has been > canceled using either the web interface or the command line interface.
[jira] [Commented] (FLINK-1793) Streaming jobs can't be canceled
[ https://issues.apache.org/jira/browse/FLINK-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383850#comment-14383850 ] Maximilian Michels commented on FLINK-1793: --- Some minutes. Can't give you an exact time period though. The parallelism was set to 20 in the experiments. 10 task managers with two task slots each. > Streaming jobs can't be canceled > > > Key: FLINK-1793 > URL: https://issues.apache.org/jira/browse/FLINK-1793 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 0.9 >Reporter: Maximilian Michels > Fix For: 0.9 > > > The streaming WordCount gets stuck in the "CANCELED" state after it has been > canceled using either the web interface or the command line interface.
[GitHub] flink pull request: [FLINK-1650] Configure Netty (akka) to use Slf...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/518#issuecomment-86944772 I'll merge the change
[jira] [Updated] (FLINK-1781) Quickstarts broken due to Scala Version Variables
[ https://issues.apache.org/jira/browse/FLINK-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1781: -- Assignee: Stephan Ewen (was: Robert Metzger) > Quickstarts broken due to Scala Version Variables > - > > Key: FLINK-1781 > URL: https://issues.apache.org/jira/browse/FLINK-1781 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 0.9 >Reporter: Stephan Ewen >Assignee: Stephan Ewen >Priority: Blocker > Fix For: 0.9 > > > The quickstart archetype resources refer to the scala version variables. > When creating a maven project standalone, these variables are not defined, > and the pom is invalid.
[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383851#comment-14383851 ] ASF GitHub Bot commented on FLINK-1650: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/518#issuecomment-86944772 I'll merge the change > Suppress Akka's Netty Shutdown Errors through the log config > > > Key: FLINK-1650 > URL: https://issues.apache.org/jira/browse/FLINK-1650 > Project: Flink > Issue Type: Bug > Components: other >Affects Versions: 0.9 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 0.9 > > > I suggest setting the logging for > `org.jboss.netty.channel.DefaultChannelPipeline` to error, in order to get > rid of the misleading stack trace caused by an akka/netty hiccup on shutdown.
[jira] [Commented] (FLINK-1793) Streaming jobs can't be canceled
[ https://issues.apache.org/jira/browse/FLINK-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383829#comment-14383829 ] Robert Metzger commented on FLINK-1793: --- How long did you wait after cancelling the Job? With higher parallelism, this can take minutes :( > Streaming jobs can't be canceled > > > Key: FLINK-1793 > URL: https://issues.apache.org/jira/browse/FLINK-1793 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 0.9 >Reporter: Maximilian Michels > Fix For: 0.9 > > > The streaming WordCount gets stuck in the "CANCELED" state after it has been > canceled using either the web interface or the command line interface.
[jira] [Commented] (FLINK-1790) Remove the redundant import code
[ https://issues.apache.org/jira/browse/FLINK-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383782#comment-14383782 ] ASF GitHub Bot commented on FLINK-1790: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/538#issuecomment-86930773 :+1: > Remove the redundant import code > > > Key: FLINK-1790 > URL: https://issues.apache.org/jira/browse/FLINK-1790 > Project: Flink > Issue Type: Improvement > Components: TaskManager >Affects Versions: master >Reporter: Sibao Hong >Assignee: Sibao Hong >Priority: Minor > > Remove the redundant import code
[GitHub] flink pull request: [FLINK-1790]Remove the redundant import code
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/538#issuecomment-86930773 :+1:
[jira] [Commented] (FLINK-1669) Streaming tests for recovery with distributed process failure
[ https://issues.apache.org/jira/browse/FLINK-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383767#comment-14383767 ] ASF GitHub Bot commented on FLINK-1669: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/496#issuecomment-86927956 When merging this please put the outputs to a TempFolder, right now they are only deleted if the check of the output is reached. > Streaming tests for recovery with distributed process failure > - > > Key: FLINK-1669 > URL: https://issues.apache.org/jira/browse/FLINK-1669 > Project: Flink > Issue Type: Sub-task > Components: Streaming >Affects Versions: 0.9 >Reporter: Márton Balassi >Assignee: Márton Balassi > Fix For: 0.9 > > > Multiple JVM test for streaming recovery from failure.
[GitHub] flink pull request: [FLINK-1669] [wip] Test for streaming recovery...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/496#issuecomment-86927956 When merging this please put the outputs to a TempFolder, right now they are only deleted if the check of the output is reached.
[GitHub] flink pull request:
Github user mbalassi commented on the pull request: https://github.com/apache/flink/commit/83db1db1c93b45943a788a8bd61f023b389d4af2#commitcomment-10430523 When merging this please put the outputs to a `TempFolder`, right now they are only deleted if the check of the output is reached.
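The reviewer's point above — test output should live in a managed temporary folder so it is cleaned up even when the test aborts before its output check — is what JUnit's `TemporaryFolder` rule automates. A plain-JDK sketch of the same pattern (class and file names are illustrative, not taken from the actual pull request):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Write test output into a freshly created temp directory and delete it
// in a finally block, so nothing piles up in the home directory even if
// the verification step throws before it completes.
public class TempOutputSketch {

    // Exposed so a caller can verify the directory was removed.
    static Path lastTempDir;

    public static void main(String[] args) throws IOException {
        Path tempDir = Files.createTempDirectory("streaming-test-output");
        lastTempDir = tempDir;
        try {
            // The test would point its file sink here and read the
            // result back for its output check.
            Path result = tempDir.resolve("result.txt");
            Files.write(result, "hello world".getBytes());
            System.out.println(new String(Files.readAllBytes(result)));
        } finally {
            // Runs even when the check above fails: delete children
            // before parents by walking in reverse order.
            try (Stream<Path> paths = Files.walk(tempDir)) {
                paths.sorted(Comparator.reverseOrder())
                     .forEach(p -> p.toFile().delete());
            }
        }
    }
}
```

With the rule-based JUnit variant, the framework performs the recursive delete for you after each test method, which is why the reviewer asks for it here.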
[jira] [Updated] (FLINK-1792) Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only
[ https://issues.apache.org/jira/browse/FLINK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1792: -- Fix Version/s: (was: pre-apache) > Improve TM Monitoring: CPU utilization, hide graphs by default and show > summary only > > > Key: FLINK-1792 > URL: https://issues.apache.org/jira/browse/FLINK-1792 > Project: Flink > Issue Type: Sub-task > Components: Webfrontend >Affects Versions: 0.9 >Reporter: Robert Metzger > > As per https://github.com/apache/flink/pull/421 from FLINK-1501, there are > some enhancements to the current monitoring required > - Get the CPU utilization in % from each TaskManager process > - Remove the metrics graph from the overview and only show the current stats > as numbers (cpu load, heap utilization) and add a button to enable the > detailed graph.
[jira] [Created] (FLINK-1793) Streaming jobs can't be canceled
Maximilian Michels created FLINK-1793: - Summary: Streaming jobs can't be canceled Key: FLINK-1793 URL: https://issues.apache.org/jira/browse/FLINK-1793 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Maximilian Michels Fix For: 0.9 The streaming WordCount gets stuck in the "CANCELED" state after it has been canceled using either the web interface or the command line interface.
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383646#comment-14383646 ] ASF GitHub Bot commented on FLINK-1501: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-86898969 I've filed a JIRA for the changes requested here: https://issues.apache.org/jira/browse/FLINK-1792 > Integrate metrics library and report basic metrics to JobManager web interface > -- > > Key: FLINK-1501 > URL: https://issues.apache.org/jira/browse/FLINK-1501 > Project: Flink > Issue Type: Sub-task > Components: JobManager, TaskManager >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > Fix For: 0.9 > > > As per mailing list, the library: https://github.com/dropwizard/metrics > The goal of this task is to get the basic infrastructure in place. > Subsequent issues will integrate more features into the system.
[jira] [Created] (FLINK-1792) Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only
Robert Metzger created FLINK-1792: - Summary: Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only Key: FLINK-1792 URL: https://issues.apache.org/jira/browse/FLINK-1792 Project: Flink Issue Type: Sub-task Components: Webfrontend Affects Versions: 0.9 Reporter: Robert Metzger As per https://github.com/apache/flink/pull/421 from FLINK-1501, there are some enhancements to the current monitoring required - Get the CPU utilization in % from each TaskManager process - Remove the metrics graph from the overview and only show the current stats as numbers (cpu load, heap utilization) and add a button to enable the detailed graph.
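For the first item above — CPU utilization in % of the TaskManager process — one commonly used source on HotSpot/OpenJDK is the `com.sun.management.OperatingSystemMXBean` extension of the standard management bean. This is a sketch of the general technique, not necessarily what the eventual Flink patch used:

```java
import java.lang.management.ManagementFactory;

// Read the JVM process's recent CPU usage via the HotSpot-specific
// OperatingSystemMXBean. getProcessCpuLoad() returns a value in
// [0.0, 1.0], or a negative value when the figure is not available
// (e.g. immediately after startup).
public class CpuLoadSketch {

    public static double processCpuPercent() {
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean)
                        ManagementFactory.getOperatingSystemMXBean();
        double load = os.getProcessCpuLoad();
        // Report as a percentage; -1.0 signals "not available".
        return load < 0 ? -1.0 : load * 100.0;
    }

    public static void main(String[] args) {
        System.out.println("process CPU: " + processCpuPercent() + " %");
    }
}
```

Note the cast: the bean returned by `ManagementFactory` is typed as `java.lang.management.OperatingSystemMXBean`, and the `getProcessCpuLoad()` method only exists on the `com.sun.management` subinterface, so this sketch is HotSpot/OpenJDK-specific.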
[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-86898969 I've filed a JIRA for the changes requested here: https://issues.apache.org/jira/browse/FLINK-1792
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383645#comment-14383645 ] ASF GitHub Bot commented on FLINK-1501: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-86898299 Hey @bhatsachin, I've merged the change to master. If you want, we can do a quick hangout or skype call to discuss potential contributions from your side. > Integrate metrics library and report basic metrics to JobManager web interface > -- > > Key: FLINK-1501 > URL: https://issues.apache.org/jira/browse/FLINK-1501 > Project: Flink > Issue Type: Sub-task > Components: JobManager, TaskManager >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > Fix For: 0.9 > > > As per mailing list, the library: https://github.com/dropwizard/metrics > The goal of this task is to get the basic infrastructure in place. > Subsequent issues will integrate more features into the system.
[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-86898299 Hey @bhatsachin, I've merged the change to master. If you want, we can do a quick hangout or skype call to discuss potential contributions from your side.
[jira] [Resolved] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1501. --- Resolution: Fixed Fix Version/s: (was: pre-apache) 0.9 Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/2d1f8b07 > Integrate metrics library and report basic metrics to JobManager web interface > -- > > Key: FLINK-1501 > URL: https://issues.apache.org/jira/browse/FLINK-1501 > Project: Flink > Issue Type: Sub-task > Components: JobManager, TaskManager >Affects Versions: 0.9 >Reporter: Robert Metzger >Assignee: Robert Metzger > Fix For: 0.9 > > > As per mailing list, the library: https://github.com/dropwizard/metrics > The goal of this task is to get the basic infrastructure in place. > Subsequent issues will integrate more features into the system.
[GitHub] flink pull request: Flink dev cluster on Docker with Docker Compos...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/533#issuecomment-86861202 Thank you very much for the contribution. I've merged the pull request to master.
[GitHub] flink pull request: Flink dev cluster on Docker with Docker Compos...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/533