[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-136521155 Merged into master. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8485
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135955245 [Test build #41781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41781/consoleFull) for PR 8485 at commit [`1378c23`](https://github.com/apache/spark/commit/1378c23510360531da216f2b2a275b48aaec7348).
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135959234 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41781/ Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135959193

[Test build #41781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41781/console) for PR 8485 at commit [`1378c23`](https://github.com/apache/spark/commit/1378c23510360531da216f2b2a275b48aaec7348).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class DCT(JavaTransformer, HasInputCol, HasOutputCol):`
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135959233 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135953111 Merged build started.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135953058 Merged build triggered.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135705308 Merged build started.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135705261 Merged build triggered.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135709399 [Test build #41742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41742/consoleFull) for PR 8485 at commit [`d5d5270`](https://github.com/apache/spark/commit/d5d5270630f3e6524d7e5f065b1b5bbf9ed9e78c).
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135718650

[Test build #41742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41742/console) for PR 8485 at commit [`d5d5270`](https://github.com/apache/spark/commit/d5d5270630f3e6524d7e5f065b1b5bbf9ed9e78c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class DCT(JavaTransformer, HasInputCol, HasOutputCol):`
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135718766 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41742/ Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135718765 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135436880 Merged build triggered.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135436933 Merged build started.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135445963

[Test build #41688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41688/console) for PR 8485 at commit [`565a831`](https://github.com/apache/spark/commit/565a83142eae29b23ad1bdae3239df375cc47001).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class DCT(JavaTransformer, HasInputCol, HasOutputCol):`
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
GitHub user yanboliang opened a pull request:

    https://github.com/apache/spark/pull/8485

    [SPARK-8472] [ML] [PySpark] Python API for DCT

    Add Python API for ml.feature.DCT.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yanboliang/spark spark-8472

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8485

commit 565a83142eae29b23ad1bdae3239df375cc47001
Author: Yanbo Liang yblia...@gmail.com
Date:   2015-08-27T13:42:04Z

    Python API for DCT
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135439908 [Test build #41688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41688/consoleFull) for PR 8485 at commit [`565a831`](https://github.com/apache/spark/commit/565a83142eae29b23ad1bdae3239df375cc47001).
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135446096 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135446099 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41688/ Test PASSed.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8485#discussion_r38116030

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -167,6 +167,65 @@ def getSplits(self):
     @inherit_doc
    +class DCT(JavaTransformer, HasInputCol, HasOutputCol):
    +    """
    +    A feature transformer that takes the 1D discrete cosine transform of a real vector. No zero
    +    padding is performed on the input vector.
    +    It returns a real vector of the same length representing the DCT. The return vector is scaled
    +    such that the transform matrix is unitary (aka scaled DCT-II).
    +
    +    More information on `https://en.wikipedia.org/wiki/Discrete_cosine_transform#DCT-II Wikipedia`.
    +
    +    >>> from pyspark.mllib.linalg import Vectors
    +    >>> df = sqlContext.createDataFrame([(Vectors.dense([5.0, 8.0, 6.0]),)], ["vec"])
    +    >>> dct = DCT(inverse=False, inputCol="vec", outputCol="resultVec")
    +    >>> dct.transform(df).head().resultVec
    +    DenseVector([10.969..., -0.707..., -2.041...])
    +    >>> dct.setInverse(True).transform(df).head().resultVec
    --- End diff --

    I would transform `resultVec` back to `origVec` to show that this is the inverse.
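The round trip suggested in the comment above can be checked outside Spark. The following is a minimal sketch in pure Python, assuming the textbook definition of the unitary (orthonormal) DCT-II that the docstring names; it is not the Spark implementation, and the function names `dct2_unitary` and `idct2_unitary` are hypothetical, introduced only for illustration.

```python
import math


def _scale(k, n):
    # Orthonormal scaling: sqrt(1/N) for the DC term, sqrt(2/N) otherwise.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct2_unitary(x):
    """Unitary DCT-II of a sequence of real numbers (no zero padding)."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def idct2_unitary(coeffs):
    """Inverse of dct2_unitary: apply the transpose of the unitary matrix."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
            for k in range(n)
        )
        for i in range(n)
    ]


orig_vec = [5.0, 8.0, 6.0]              # same input as the doctest in the diff
result_vec = dct2_unitary(orig_vec)     # forward transform
round_trip = idct2_unitary(result_vec)  # inverse applied to the forward result

print(result_vec)
print(round_trip)
```

Under this definition the forward result agrees with the `DenseVector([10.969..., -0.707..., -2.041...])` shown in the doctest, and applying the inverse to `result_vec` (rather than to the original input) recovers `[5.0, 8.0, 6.0]` up to floating-point error, which is the property the reviewer asks the doctest to demonstrate.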
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/8485#issuecomment-135485931 LGTM except minor inline comments.
[GitHub] spark pull request: [SPARK-8472] [ML] [PySpark] Python API for DCT
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8485#discussion_r38116027

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -167,6 +167,65 @@ def getSplits(self):
     @inherit_doc
    +class DCT(JavaTransformer, HasInputCol, HasOutputCol):
    +    """
    +    A feature transformer that takes the 1D discrete cosine transform of a real vector. No zero
    --- End diff --

    Keep line width in docstring at 72 (PEP8).
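The 72-character docstring limit mentioned above can be checked mechanically. This is a rough sketch using only the standard-library `ast` module; the helper name `long_docstring_lines` and the sample source are hypothetical and are not part of Spark's lint tooling.

```python
import ast

MAX_DOC_WIDTH = 72  # PEP 8 recommendation for docstring and comment lines


def long_docstring_lines(source, limit=MAX_DOC_WIDTH):
    """Return (owner_name, line) pairs for docstring lines wider than limit."""
    offenders = []
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, documented):
            continue
        # clean=False keeps the source indentation, so len(line) is close to
        # the real column width (the first docstring line is the exception).
        doc = ast.get_docstring(node, clean=False)
        if doc is None:
            continue
        name = getattr(node, "name", "<module>")
        for line in doc.splitlines():
            if len(line) > limit:
                offenders.append((name, line))
    return offenders


# Hypothetical sample that mimics the docstring line flagged in the review.
sample = '''
class DCT(object):
    """
    A feature transformer that takes the 1D discrete cosine transform of a real vector. No zero
    padding is performed on the input vector.
    """
'''

offenders = long_docstring_lines(sample)
for owner, line in offenders:
    print(owner, len(line), line.strip())
```

Only the first docstring line of the sample exceeds the limit, so it is the only one reported; rewrapping the text at 72 columns would leave the list empty.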