[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104854320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33393/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104854318 [Test build #33393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33393/consoleFull) for PR 6342 at commit [`c528eed`](https://github.com/apache/spark/commit/c528eedc2c6553aee335b6994e7a5aed196071f1).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104854319 Merged build finished. Test PASSed.
[GitHub] spark pull request: [WIP][SPARK-7826][CORE] Suppress extra calling...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/6352#issuecomment-104854294 @JoshRosen Thank you for the details. That is exactly what I noticed yesterday. I'm modifying `DAGScheduler` and adding the tests. I'll push the next version as soon as possible.
[GitHub] spark pull request: [SPARK-6806] [SparkR] [Docs] Fill in SparkR ex...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/5442#issuecomment-104854271 @davies Sorry for the delay in looking at this. I think this change looks pretty good -- I found a minor typo that we can fix up during merge. I think it might be better to actually create a new page for SparkR rather than append it to the DataFrames page -- but I'll do this in a follow-up PR. LGTM
[GitHub] spark pull request: [SPARK-6806] [SparkR] [Docs] Fill in SparkR ex...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/5442#discussion_r30942598
--- Diff: docs/sql-programming-guide.md ---
@@ -1430,6 +1633,16 @@ df = sqlContext.load(source="jdbc", url="jdbc:postgresql:dbserver", dbtable="sch
+
+
+{% highlight r %}
+
+df <- laodDF(sqlContext, source="jdbc", url="jdbc:postgresql:dbserver", dbtable="schema.tablename")
--- End diff --
Minor typo: This should be `loadDF`
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104854118 Eh damn. R.
[GitHub] spark pull request: [SPARK-6811] Copy SparkR lib in make-distribut...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6373#issuecomment-104854108 The YARN test worked out fine. I might leave this open in case @davies also has time to test this out. @pwendell Feel free to merge this if it's getting close to another RC.
[GitHub] spark pull request: [SPARK-3674] YARN support in Spark EC2
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6376#issuecomment-104854097 [Test build #33400 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33400/consoleFull) for PR 6376 at commit [`961504a`](https://github.com/apache/spark/commit/961504a6a35b095fc5ffa636000c6e754b086199).
[GitHub] spark pull request: [SPARK-3674] YARN support in Spark EC2
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6376#issuecomment-104854044 @andrewor14 - One question I had was whether we should also open port `8042` on the worker nodes. This might be useful for viewing the NodeManager UI. (See `yarn.nodemanager.webapp.address` in https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml)
[GitHub] spark pull request: [SPARK-3674] YARN support in Spark EC2
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6376#issuecomment-104854024 Merged build started.
[GitHub] spark pull request: [SPARK-3674] YARN support in Spark EC2
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6376#issuecomment-104854020 Merged build triggered.
[GitHub] spark pull request: [SPARK-3674] YARN support in Spark EC2
GitHub user shivaram opened a pull request: https://github.com/apache/spark/pull/6376 [SPARK-3674] YARN support in Spark EC2
This corresponds to https://github.com/mesos/spark-ec2/pull/116 in the spark-ec2 repo. The only change required in the spark_ec2.py script is to open the RM port. cc @andrewor14
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/shivaram/spark-1 spark-ec2-yarn
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6376.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6376
commit 152c94cda053b0c535ea698b54fb09b736a703e3
Author: Shivaram Venkataraman
Date: 2015-05-23T05:34:45Z
    Open 8088 for YARN in EC2
commit 961504a6a35b095fc5ffa636000c6e754b086199
Author: Shivaram Venkataraman
Date: 2015-05-23T05:35:04Z
    Merge branch 'master' of https://github.com/apache/spark into spark-ec2-yarn
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104853733 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33390/ Test FAILed.
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104853730 [Test build #33390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33390/consoleFull) for PR 6366 at commit [`56d2540`](https://github.com/apache/spark/commit/56d25408be22786fd1e93285ba371276a5569c9f).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104853732 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3920
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942519
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
             raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))
+    def __str__(self):
+        spstr = "{0} X {1} ".format(self.numRows, self.numCols)
+        if self.isTransposed:
+            spstr += "CSRMatrix\n"
+        else:
+            spstr += "CSCMatrix\n"
+
+        for i, colPtr in enumerate(self.colPtrs[:-1]):
+            endptr = self.colPtrs[i + 1]
+            values = self.values[colPtr: endptr]
+            rowindices = self.rowIndices[colPtr: endptr]
+            for j, rowInd in enumerate(rowindices):
+                if self.isTransposed:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        i, rowInd, _format_float(values[j]))
+                else:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        rowInd, i, _format_float(values[j]))
--- End diff --
but indices, values and colPtrs are numpy arrays.
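The distinction raised in this reply can be checked directly: slicing a plain Python list builds a new list, while basic slicing of a numpy array returns a view onto the same buffer. The following is a minimal sketch with made-up array contents (the names `values` and `col_ptrs` are illustrative and are not taken from the PR):

```python
import numpy as np

# Hypothetical CSC-style arrays, for illustration only.
values = np.array([1.0, 2.0, 3.0, 4.0])
col_ptrs = np.array([0, 2, 4])

# Basic slicing of a numpy array returns a view, not a copy.
first_col = values[col_ptrs[0]:col_ptrs[1]]
is_view = np.shares_memory(values, first_col)

# Slicing a plain Python list builds a new, independent list.
as_list = values.tolist()
list_slice = as_list[0:2]
list_slice[0] = -1.0
list_unchanged = as_list[0] == 1.0

# Writing through the numpy view is visible in the original array.
first_col[0] = 10.0
original_changed = bool(values[0] == 10.0)
```

Under these assumptions the per-column slices in the diff would not duplicate the underlying data, although each slice still creates a small array object.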
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104853605 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104853602 [Test build #33391 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33391/consoleFull) for PR 3920 at commit [`d2153df`](https://github.com/apache/spark/commit/d2153df5f5472f34b427a39e8fd450f7d37d2fc6).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104853606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33391/ Test PASSed.
[GitHub] spark pull request: [SPARK-7684] [SQL] Invoking HiveContext.newTem...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6359#issuecomment-104853577 [Test build #33399 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33399/consoleFull) for PR 6359 at commit [`95d2eb8`](https://github.com/apache/spark/commit/95d2eb82128904239f7af1e1d4362e117db5f155).
[GitHub] spark pull request: [SPARK-7684] [SQL] Invoking HiveContext.newTem...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6359#issuecomment-104853543 Merged build started.
[GitHub] spark pull request: [SPARK-7684] [SQL] Invoking HiveContext.newTem...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6359#issuecomment-104853537 Merged build triggered.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942496
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
             raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))
+    def __str__(self):
+        spstr = "{0} X {1} ".format(self.numRows, self.numCols)
+        if self.isTransposed:
+            spstr += "CSRMatrix\n"
+        else:
+            spstr += "CSCMatrix\n"
+
+        for i, colPtr in enumerate(self.colPtrs[:-1]):
+            endptr = self.colPtrs[i + 1]
+            values = self.values[colPtr: endptr]
+            rowindices = self.rowIndices[colPtr: endptr]
+            for j, rowInd in enumerate(rowindices):
+                if self.isTransposed:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        i, rowInd, _format_float(values[j]))
+                else:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        rowInd, i, _format_float(values[j]))
--- End diff --
slice in Python does copy the data. izip() will not copy data.
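For readers on Python 3: `itertools.izip` exists only in Python 2, and the built-in `zip` in Python 3 has the same lazy behaviour. A small sketch of what "will not copy data" means in practice, using made-up lists (not the PR's arrays):

```python
# Hypothetical row indices and values, for illustration only.
row_indices = [0, 2, 1]
values = [1.0, 5.0, 3.0]

# In Python 3, zip() returns a lazy iterator rather than a list of tuples.
pairs = zip(row_indices, values)
is_lazy_iterator = iter(pairs) is pairs and not isinstance(pairs, list)

# Tuples are produced one at a time, only when requested.
first_pair = next(pairs)
remaining = list(pairs)
```

Only the tuple currently being consumed is materialised, so no intermediate sequence of size nnz is built unless the caller asks for one, as `list(pairs)` does here.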
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942480
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
             raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))
+    def __str__(self):
+        spstr = "{0} X {1} ".format(self.numRows, self.numCols)
+        if self.isTransposed:
+            spstr += "CSRMatrix\n"
+        else:
+            spstr += "CSCMatrix\n"
+
+        for i, colPtr in enumerate(self.colPtrs[:-1]):
+            endptr = self.colPtrs[i + 1]
+            values = self.values[colPtr: endptr]
+            rowindices = self.rowIndices[colPtr: endptr]
+            for j, rowInd in enumerate(rowindices):
+                if self.isTransposed:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        i, rowInd, _format_float(values[j]))
+                else:
+                    spstr += '({0},{1}) {2}\n'.format(
+                        rowInd, i, _format_float(values[j]))
--- End diff --
Great. But do you have any objection to the current method? I'll change the string `+=` to appending to a list, so that it becomes faster. If you are worried about the temporary array creation in my method, those are just slices. Also, the alternative creates another array (zip) of size nnz.
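The list-append approach mentioned in this comment could look roughly like the sketch below. It is a stand-alone illustration, not the PR's code: `format_csc` and its parameters are hypothetical names, the `_format_float` helper from the diff is omitted, and the result has no trailing newline.

```python
def format_csc(num_rows, num_cols, col_ptrs, row_indices, values,
               is_transposed=False):
    """Hypothetical helper: render compressed-column data as text."""
    kind = "CSRMatrix" if is_transposed else "CSCMatrix"
    lines = ["{0} X {1} {2}".format(num_rows, num_cols, kind)]
    for i in range(len(col_ptrs) - 1):
        # Entries col_ptrs[i] .. col_ptrs[i + 1] - 1 belong to column i.
        for j in range(col_ptrs[i], col_ptrs[i + 1]):
            if is_transposed:
                row, col = i, row_indices[j]
            else:
                row, col = row_indices[j], i
            lines.append("({0},{1}) {2}".format(row, col, values[j]))
    # Join once at the end instead of growing a string with +=.
    return "\n".join(lines)


text = format_csc(3, 2, [0, 2, 3], [0, 2, 1], [1.0, 5.0, 3.0])
```

Indexing by position sidesteps the question of slices or zip entirely, and collecting the pieces in a list before a single `join` avoids repeatedly rebuilding the string.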
[GitHub] spark pull request: [SPARK-7840] add insertInto() to Writer
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6375#issuecomment-104853456 [Test build #33397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33397/consoleFull) for PR 6375 at commit [`826423e`](https://github.com/apache/spark/commit/826423ec891ff3a62e1be162c7f1d16baac5efcc).
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853372 [Test build #33398 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33398/consoleFull) for PR 6374 at commit [`69004c7`](https://github.com/apache/spark/commit/69004c7043d6529286f856d403990d095407ec3d).
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853343 Merged build started.
[GitHub] spark pull request: [SPARK-7840] add insertInto() to Writer
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6375#issuecomment-104853337 Merged build triggered.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853339 Merged build triggered.
[GitHub] spark pull request: [SPARK-7840] add insertInto() to Writer
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6375#issuecomment-104853344 Merged build started.
[GitHub] spark pull request: [SPARK-7840] add insertInto() to Writer
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/6375 [SPARK-7840] add insertInto() to Writer
Add tests later.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/davies/spark insertInto
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6375.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6375
commit 826423ec891ff3a62e1be162c7f1d16baac5efcc
Author: Davies Liu
Date: 2015-05-23T06:22:47Z
    add insertInto() to Writer
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853299 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853191 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104853192 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33396/ Test FAILed.
[GitHub] spark pull request: [HOTFIX] Add tests for SparkListenerApplicatio...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6368
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104850766 Merged build started.
[GitHub] spark pull request: [HOTFIX] Add tests for SparkListenerApplicatio...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/6368#issuecomment-104850776 Merging into master, thanks @harishreedharan.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104850763 Merged build triggered.
[GitHub] spark pull request: [SPARK-7838] [STREAMING] Set scope for kinesis...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6369
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user davies closed the pull request at: https://github.com/apache/spark/pull/6364
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104850731 LGTM
[GitHub] spark pull request: [SPARK-7838] [STREAMING] Set scope for kinesis...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/6369#issuecomment-104850730 Merging into master and 1.4 thanks TD.
[GitHub] spark pull request: [SPARK-7777][Streaming] Handle the case when t...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6372#issuecomment-104850450 @tdas FYI you didn't merge this. Did you intend to?
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849881 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849884 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33395/ Test FAILed.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849874

[Test build #33395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33395/consoleFull) for PR 6374 at commit [`288cea9`](https://github.com/apache/spark/commit/288cea98a08133b4d1078d57be3e9d6a13a5eb59).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class Window(object):`
   * `class WindowSpec(object):`
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104849713 LGTM. Let's merge this once tests pass.
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849645 [Test build #33395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33395/consoleFull) for PR 6374 at commit [`288cea9`](https://github.com/apache/spark/commit/288cea98a08133b4d1078d57be3e9d6a13a5eb59).
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104849614 I submitted a pull request with more documentation changes: https://github.com/apache/spark/pull/6374
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849591 Merged build started.
[GitHub] spark pull request: [SPARK-7322][SQL] Improve DataFrame window fun...
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/6370
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6374#issuecomment-104849587 Merged build triggered.
[GitHub] spark pull request: [SPARK-7322][SQL] Improve DataFrame window fun...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6370#issuecomment-104849602 Closing this in favor of https://github.com/apache/spark/pull/6374
[GitHub] spark pull request: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] Data...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/6374

    [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates

    1. ntile should take an integer as parameter.
    2. Added Python API (based on #6364)
    3. Update documentation of various DataFrame Python functions.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark window-final

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6374.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6374

commit 778e2c0fc506e1995cca5e1808c76574a0866213
Author: Davies Liu
Date: 2015-05-22T22:01:25Z

    SPARK-7836 and SPARK-7822: Python API of window functions

commit 264935816245ed912194ce3bf4cccd3b415146a6
Author: Davies Liu
Date: 2015-05-22T22:17:49Z

    update docs

commit 8936ade324ce0eb44ad498e0dda0196575fc7e1e
Author: Davies Liu
Date: 2015-05-23T00:51:59Z

    fix maxint in python 3

commit ef55132e09a1231db9e0f12b1f14c2460e25090b
Author: Davies Liu
Date: 2015-05-23T01:31:42Z

    Merge branch 'master' of github.com:apache/spark into window4

commit ed73cb4ee72d9032789c9e4d7fa63e31897cace2
Author: Reynold Xin
Date: 2015-05-23T01:00:33Z

    [SPARK-7322][SQL] Improve DataFrame window function documentation.

    Conflicts:
        sql/core/src/main/scala/org/apache/spark/sql/functions.scala

commit 66092b45435bef6dd220419ec842f4da4e4318c6
Author: Davies Liu
Date: 2015-05-23T01:09:57Z

    update docs

commit 7cb8985c7e862f14d690c9d8db5150f1c81889ca
Author: Reynold Xin
Date: 2015-05-23T04:58:29Z

    Merge pull request #6364 from davies/window

    [SPARK-7836] [SPARK-7822] Python API of window functions

commit 288cea98a08133b4d1078d57be3e9d6a13a5eb59
Author: Reynold Xin
Date: 2015-05-23T05:53:40Z

    Update documentaiton.
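For readers unfamiliar with `ntile`, the following is a minimal pure-Python sketch of the usual SQL semantics, in which `n` ordered rows' buckets are numbered from 1 and the earlier buckets absorb any remainder. It is not the Spark implementation; the function name `ntile_buckets` and its arguments are hypothetical and exist only for this illustration. It also shows why an integer parameter is the natural type for the bucket count.

```python
def ntile_buckets(n, num_rows):
    """Return the 1-based bucket id for each of num_rows ordered rows.

    Hypothetical helper for illustration; assumes the common SQL rule that
    the first (num_rows % n) buckets each receive one extra row.
    """
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(num_rows, n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets


# Seven ordered rows split into three buckets.
print(ntile_buckets(3, 7))
```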
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104849538 Yes, @rxin is working on it.
[GitHub] spark pull request: [WIP][SPARK-7826][CORE] Suppress extra calling...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6352#issuecomment-104848483 Also: if my above reasoning is right and this optimization is incorrect, then it's concerning that it didn't cause a test failure. My hunch is that we don't have unit tests for the particular combinations of RDD dependency graphs, caching states, and map output availability that would expose this issue. It would be nice to write a failing regression test which would have caught the problems in the current version of this patch, since that will help us to gain confidence that the new optimizations are safe.
[GitHub] spark pull request: [SPARK-7161][History Server] Provide REST api ...
Github user harishreedharan commented on a diff in the pull request: https://github.com/apache/spark/pull/5792#discussion_r30942242

    --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala ---
    @@ -164,6 +164,18 @@ private[v1] class ApiRootResource extends UIRootFromServletContext {
           }
         }

    +  @Path("applications/{appId}/download")
    +  def getEventLogs(
    +      @PathParam("appId") appId: String): EventLogDownloadResource = {
    +    new EventLogDownloadResource(uiRoot, appId, None)
    +  }
    +
    +  @Path("applications/{appId}/{attemptId}/download")
    --- End diff --

I am following the current API format - for example: `applications/{appId}/{attemptId}/storage/rdd/{rddId: \d+}` etc.
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104847712 I think this still needs sign off from @rxin
[GitHub] spark pull request: [SPARK-6811] Copy SparkR lib in make-distribut...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6373#issuecomment-104847561 Thanks @pwendell. I'm running the YARN test right now and will merge if that looks good. Also cc @davies
[GitHub] spark pull request: [WIP][SPARK-7826][CORE] Suppress extra calling...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6352#issuecomment-104847514

Oh, one other thought: maybe a good exercise would be to attempt to write the Scaladoc comment for `getMissingParentStages` which describes, in prose, the basic high-level algorithm for finding missing parent stages. I can help with this tomorrow. Even if you don't end up modifying `getMissingParentStages`, I'd love to submit a new PR that just comments / explains the existing code in order to make this easier to understand in the future.

To help me build some intuition for understanding your optimization here: It looks like this only saves us from performing `getCacheLocs` lookups in cases where we're traversing backwards through a long chain of narrow dependencies. I don't think that this is necessarily safe. Imagine that we have a lineage graph which looks something like this:

```
+---+  shuffle  +---+    +---+
| A |---------->| B |--->| C |---+
+---+           +---+    +---+   |    +---+
                                 +--->| E |
                         +---+   |    +---+
                         | D |---+
                         +---+
```

Here, `E` has one-to-one dependencies on `C` and `D`. `C` is derived from `A` by performing a shuffle and then a map. If we're trying to determine which ancestor stages need to be computed in order to compute `E`, we need to figure out whether the shuffle `A -> B` should be performed. If the RDD `C`, which has only one ancestor via a narrow dependency, is cached, then we won't need to compute `A`, even if it has some unavailable output partitions. The same goes for `B`: if `B` is 100% cached, then we can avoid the shuffle on `A`.

Based on this, I don't think that we can make a local decision to skip the caching check based on the structure of the RDD graph. However, we _might_ be able to skip / optimize this check based on RDDs' storage levels: in long chains of narrow dependencies, most RDDs probably _aren't_ cached, so adding a simple `if StorageLevel = None return Seq.fill(numPartitions)(Nil)` check to `getCacheLocs` might be safe / sufficient.

Someone more familiar with StorageLevel / caching semantics should double-check this reasoning to make sure that I'm not overlooking any corner-cases when RDDs' storage levels change due to unpersist / cache / persist calls.
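The argument above can be sketched with a toy model. This is a simplified Python illustration of the reasoning, not Spark's `DAGScheduler` code: the classes `ToyRDD` and `Dep`, the fields `storage_level`, `fully_cached` and `map_output_available`, and the helper names are all hypothetical. It walks backwards from `E`, stops at any fully cached RDD, and applies the storage-level short-circuit proposed in the comment before doing the (here simulated) cache lookup.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dep:
    parent: "ToyRDD"
    shuffle: bool = False  # True models a shuffle (wide) dependency


@dataclass
class ToyRDD:
    name: str
    deps: List[Dep] = field(default_factory=list)
    storage_level: str = "NONE"         # stand-in for a real StorageLevel
    fully_cached: bool = False          # every partition is in the cache
    map_output_available: bool = False  # shuffle output already computed


def is_fully_cached(rdd):
    # Short-circuit from the comment: an RDD that was never persisted
    # cannot have cached partitions, so the lookup can be skipped.
    if rdd.storage_level == "NONE":
        return False
    return rdd.fully_cached


def missing_shuffle_parents(final_rdd):
    """Names of shuffle parents whose output must be computed first."""
    missing, visited, stack = set(), set(), [final_rdd]
    while stack:
        rdd = stack.pop()
        if rdd.name in visited:
            continue
        visited.add(rdd.name)
        if is_fully_cached(rdd):
            continue  # nothing upstream of a fully cached RDD is needed
        for dep in rdd.deps:
            if dep.shuffle:
                if not dep.parent.map_output_available:
                    missing.add(dep.parent.name)
            else:
                stack.append(dep.parent)
    return missing


def build_lineage(c_cached):
    """Build the A -> (shuffle) -> B -> C -> E <- D graph from the comment."""
    a = ToyRDD("A")
    b = ToyRDD("B", [Dep(a, shuffle=True)])
    c = ToyRDD("C", [Dep(b)],
               storage_level="MEMORY_ONLY" if c_cached else "NONE",
               fully_cached=c_cached)
    d = ToyRDD("D")
    return ToyRDD("E", [Dep(c), Dep(d)])


print(missing_shuffle_parents(build_lineage(c_cached=False)))
print(missing_shuffle_parents(build_lineage(c_cached=True)))
```

In this model the cache check on `C` cannot be skipped just because `C` sits in a narrow chain: whether the shuffle on `A` is required depends on it. Skipping the lookup only when the storage level is `NONE` leaves the result unchanged, under the assumption (flagged in the comment) that storage levels do not change underneath the scheduler.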
[GitHub] spark pull request: [MINOR] Add SparkR to create-release script
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6371
[GitHub] spark pull request: [SPARK-7838] [STREAMING] Set scope for kinesis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6369#issuecomment-104847480 [Test build #33394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33394/consoleFull) for PR 6369 at commit [`87d1c7f`](https://github.com/apache/spark/commit/87d1c7f5ca5583ddc38806cf52065e14028f041f).
[GitHub] spark pull request: [MINOR] Add SparkR to create-release script
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6371#issuecomment-104847472 Thanks merged
[GitHub] spark pull request: [SPARK-7838] [STREAMING] Set scope for kinesis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6369#issuecomment-104847414 Merged build triggered.
[GitHub] spark pull request: [SPARK-7838] [STREAMING] Set scope for kinesis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6369#issuecomment-104847425 Merged build started.
[GitHub] spark pull request: [SPARK-7777][Streaming] Handle the case when t...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/6372#issuecomment-104847418 LGTM. Merging into master and branch-1.4.
[GitHub] spark pull request: [SPARK-6811] Copy SparkR lib in make-distribut...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6373#issuecomment-104847426 make-distribution changes LGTM
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942146

    --- Diff: python/pyspark/mllib/linalg.py ---
    @@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
                 raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))

    +    def __str__(self):
    --- End diff --

yes, you can bin/spark-submit.py python/pyspark/mllib/linalg.py
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942142

    --- Diff: python/pyspark/mllib/linalg.py ---
    @@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
                 raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))

    +    def __str__(self):
    +        spstr = "{0} X {1} ".format(self.numRows, self.numCols)
    +        if self.isTransposed:
    +            spstr += "CSRMatrix\n"
    +        else:
    +            spstr += "CSCMatrix\n"
    +
    +        for i, colPtr in enumerate(self.colPtrs[:-1]):
    +            endptr = self.colPtrs[i + 1]
    +            values = self.values[colPtr: endptr]
    +            rowindices = self.rowIndices[colPtr: endptr]
    +            for j, rowInd in enumerate(rowindices):
    +                if self.isTransposed:
    +                    spstr += '({0},{1}) {2}\n'.format(
    +                        i, rowInd, _format_float(values[j]))
    +                else:
    +                    spstr += '({0},{1}) {2}\n'.format(
    +                        rowInd, i, _format_float(values[j]))
    --- End diff --

How about this:

    cur_col = 0
    rs = []
    for i, (rowIndice, value) in enumerate(izip(self.rowIndices, self.values)):
        while self.colPtrs[cur_col+1] <= i:
            cur_col += 1
        rs.append("(%s,%s) %s" % (cur_col, rowIndice, value))
    return spstr + '\n'.join(rs)
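The single-pass idea in the suggestion above can be tried outside Spark with plain lists. The sketch below is a standalone approximation, not the code that was merged: the function name `format_csc_entries` and its arguments are hypothetical, `zip` replaces `izip` so it runs on Python 3, and entries are emitted in `(row, col)` order to match the non-transposed branch of the diff under review (the quoted suggestion puts `cur_col` first, which corresponds to the transposed case).

```python
def format_csc_entries(col_ptrs, row_indices, values):
    """Return one "(row,col) value" line per stored entry of a CSC matrix.

    Hypothetical helper: col_ptrs has one more element than the matrix has
    columns, and entry i belongs to column c when
    col_ptrs[c] <= i < col_ptrs[c + 1].
    """
    lines = []
    cur_col = 0
    for i, (row, value) in enumerate(zip(row_indices, values)):
        # Advance to the column that owns entry i; this loop also steps
        # over columns that hold no entries at all.
        while col_ptrs[cur_col + 1] <= i:
            cur_col += 1
        lines.append("(%s,%s) %s" % (row, cur_col, value))
    return "\n".join(lines)


# 3 x 3 matrix whose middle column is empty.
print(format_csc_entries([0, 2, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]))
```

Collecting the lines in a list and joining once avoids the repeated string concatenation of the original loop.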
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942087

    --- Diff: python/pyspark/mllib/linalg.py ---
    @@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
                 raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))

    +    def __str__(self):
    --- End diff --

Is there a way to run the doctests locally?
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942075

    --- Diff: python/pyspark/mllib/linalg.py ---
    @@ -897,6 +914,33 @@ def __init__(self, numRows, numCols, colPtrs, rowIndices, values,
                 raise ValueError("Expected rowIndices of length %d, got %d." % (self.rowIndices.size, self.values.size))

    +    def __str__(self):
    --- End diff --

Could you add a doctest for `__str__`? It's easy to see the result in a doctest.
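As an illustration of what such a doctest might look like, here is a minimal sketch using only the standard `doctest` module. The class `TinyMatrix` is a hypothetical stand-in, not the PySpark matrix class, and its output format only loosely echoes the `"{0} X {1} "` header in the diff.

```python
import doctest


class TinyMatrix(object):
    """Hypothetical stand-in for a matrix class, for illustration only.

    >>> print(TinyMatrix(2, 3))
    2 X 3 TinyMatrix
    """

    def __init__(self, num_rows, num_cols):
        self.num_rows = num_rows
        self.num_cols = num_cols

    def __str__(self):
        return "{0} X {1} TinyMatrix".format(self.num_rows, self.num_cols)


if __name__ == "__main__":
    # Runs every docstring example in this module and reports mismatches.
    doctest.testmod()
```

Because the expected output sits next to the call in the docstring, a reviewer can read the string format directly, and the example fails loudly if `__str__` later changes.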
[GitHub] spark pull request: [SPARK-7788] Made KinesisReceiver.onStart() no...
Github user cfregly commented on the pull request: https://github.com/apache/spark/pull/6348#issuecomment-104845837 looks good
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104845565 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33392/ Test FAILed.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6342#discussion_r30942066
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -821,6 +821,23 @@ def __reduce__(self):
             self.numRows, self.numCols, self.values.tostring(),
             int(self.isTransposed))
+    def __str__(self):
+        mattoarr = self.toArray()
+        matstr = ""
+        for row in mattoarr:
+            for ind, col in enumerate(row):
+                if ind != 0:
+                    matstr += " "
+                matstr += str(col)
+            matstr += '\n'
--- End diff --
`+=` on strings is slow; could you change this to `'\n'.join(' '.join(row) for row in self.toArray())`?
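To make the suggestion concrete, the sketch below puts the `+=` loop from the diff next to a `join`-based version. It works on plain nested lists rather than the array returned by `toArray()`, and both function names are made up for this example. Two details are assumptions worth checking against the real patch: `str.join` only accepts strings, so each element is converted with `str()` first, and the `join` form does not emit the trailing newline that the loop does.

```python
def matrix_to_str_concat(rows):
    """Builds the string with repeated +=, as in the diff under review."""
    out = ""
    for row in rows:
        for ind, col in enumerate(row):
            if ind != 0:
                out += " "
            out += str(col)
        out += "\n"
    return out


def matrix_to_str_join(rows):
    """Builds the same text with str.join; elements are converted to str first."""
    return "\n".join(" ".join(str(col) for col in row) for row in rows)


rows = [[1.0, 2.0], [3.0, 4.0]]
# The two forms differ only in the trailing newline added by the loop version.
same_modulo_newline = matrix_to_str_concat(rows) == matrix_to_str_join(rows) + "\n"
```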
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104845564 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7795] [Core] Speed up task scheduling i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6323
[GitHub] spark pull request: [SPARK-7795] [Core] Speed up task scheduling i...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6323#issuecomment-104845475 LGTM, so I'm going to merge this to master. Although I think this is _probably_ safe for 1.4, I don't want to risk conflicting with any attempts to cut another RC tonight, so I'm not going to pick it there for now. Thanks!
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104845403 [Test build #33393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33393/consoleFull) for PR 6342 at commit [`c528eed`](https://github.com/apache/spark/commit/c528eedc2c6553aee335b6994e7a5aed196071f1).
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104845025 Merged build triggered.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104845067 Merged build started.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user MechCoder commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104842961 @davies fixed! anything else?
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104842944 Merged build triggered.
[GitHub] spark pull request: [SPARK-7785] [MLlib] [PySpark] Add __str__ and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6342#issuecomment-104842948 Merged build started.
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104842759 [Test build #33390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33390/consoleFull) for PR 6366 at commit [`56d2540`](https://github.com/apache/spark/commit/56d25408be22786fd1e93285ba371276a5569c9f).
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104842719 [Test build #33391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33391/consoleFull) for PR 3920 at commit [`d2153df`](https://github.com/apache/spark/commit/d2153df5f5472f34b427a39e8fd450f7d37d2fc6).
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104842620 Merged build triggered.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104842642 Merged build started.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104842587 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104842624 Merged build triggered.
[GitHub] spark pull request: [SPARK-7654][SQL] Move insertInto into reader/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6366#issuecomment-104842641 Merged build started.
[GitHub] spark pull request: [SPARK-5090][examples] The improvement of pyth...
Github user GenTang commented on the pull request: https://github.com/apache/spark/pull/3920#issuecomment-104841817 @davies I just tested it with the assembly of the master branch, and it works. Thanks.
[GitHub] spark pull request: [WIP][SPARK-7826][CORE] Suppress extra calling...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/6352#discussion_r30941932
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -386,13 +386,15 @@ class DAGScheduler(
     def visit(rdd: RDD[_]) {
       if (!visited(rdd)) {
         visited += rdd
-        if (getCacheLocs(rdd).contains(Nil)) {
--- End diff --
As a general aside, I find `getCacheLocs(rdd).contains(Nil)` to be hard to understand to begin with. I think that this condition is meant to be read as "if at least one partition of this RDD is not cached anywhere...". Maybe this code would be easier to review / parse if we extracted this condition into a variable, perhaps a lazy val if we want to short-circuit, named `rddHasUncachedPartitions`, or `!rddIsCached`.
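The naming idea in the comment above can be sketched outside the scheduler. The snippet below is a Python analogue, not the Scala code under review: it assumes cache locations are represented as one list per partition, with an empty list meaning "not cached anywhere", which is how the comment reads `contains(Nil)`. The function and variable names are hypothetical.

```python
def has_uncached_partitions(cache_locs):
    """True if at least one partition has no cached location.

    cache_locs is assumed to hold one list of locations per partition;
    an empty inner list plays the role of Nil in the Scala condition.
    """
    return any(len(locs) == 0 for locs in cache_locs)


# Partition 1 has no cached copy, so the named condition is true.
cache_locs = [["host-a"], [], ["host-b", "host-c"]]
rdd_has_uncached_partitions = has_uncached_partitions(cache_locs)

# Every partition is cached somewhere, so the condition is false.
fully_cached = [["host-a"], ["host-b"]]
rdd_is_cached = not has_uncached_partitions(fully_cached)
```

Giving the condition a name lets the branch read as a sentence, which is the readability gain the comment is after.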
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user Lewuathe commented on a diff in the pull request: https://github.com/apache/spark/pull/5707#discussion_r30941923
--- Diff: python/pyspark/mllib/util.py ---
@@ -169,6 +170,27 @@ def loadLabeledPoints(sc, path, minPartitions=None):
         minPartitions = minPartitions or min(sc.defaultParallelism, 2)
         return callMLlibFunc("loadLabeledPoints", sc, path, minPartitions)
+    @staticmethod
+    def appendBias(data):
+        """
+        Returns a new vector with `1.0` (bias) appended to
+        the end of the input vector.
+        """
+        vec = _convert_to_vector(data)
+        if isinstance(vec, SparseVector):
+            return sp.csc_matrix(np.append(vec.toArray(), 1.0))
--- End diff --
@jkbradley Thank you for the response. If `appendBias` should return a `SparseVector`, there seems to be no need for scipy.sparse, because a `SparseVector` can be constructed from the given data without it. Is that correct?
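The point raised above, that a bias term can be appended to a sparse vector without going through scipy.sparse, can be sketched as follows. `SparseVec` is a hypothetical namedtuple standing in for a (size, indices, values) sparse vector, and `append_bias_sparse` is an illustrative name; neither is the PySpark API, and this is not the code that was eventually merged.

```python
from collections import namedtuple

# Hypothetical stand-in for a sparse vector: total length, the positions
# of the non-zero entries, and the values at those positions.
SparseVec = namedtuple("SparseVec", ["size", "indices", "values"])


def append_bias_sparse(vec):
    """Returns a new sparse vector with 1.0 appended as the last entry.

    The bias lands at index vec.size, so the existing indices and values
    are reused unchanged and no dense array is materialised.
    """
    return SparseVec(
        vec.size + 1,
        list(vec.indices) + [vec.size],
        list(vec.values) + [1.0],
    )


with_bias = append_bias_sparse(SparseVec(3, [0, 2], [1.0, 3.0]))
```

Because only one index and one value are added, the cost depends on the number of non-zero entries rather than on the full length of the vector.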
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104830492 [Test build #853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/853/consoleFull) for PR 6364 at commit [`62bbb6c`](https://github.com/apache/spark/commit/62bbb6c9e1fbe944635c12778e3f459639327f38). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104829680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33386/ Test PASSed.
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104829679 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-7836] [SPARK-7822] Python API of window...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6364#issuecomment-104829678 [Test build #33386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33386/consoleFull) for PR 6364 at commit [`66092b4`](https://github.com/apache/spark/commit/66092b45435bef6dd220419ec842f4da4e4318c6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` "public class " + className + extendsText + " implements java.io.Serializable ` * `class Window(object):` * `class WindowSpec(object):`
[GitHub] spark pull request: [SPARK-6811] Copy SparkR lib in make-distribut...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6373#issuecomment-104827910 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33389/ Test PASSed.