[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14349754#comment-14349754 ] Andrew Palumbo commented on MAHOUT-1603: this can be closed, right? Tweaks for Spark 1.0.x --- Key: MAHOUT-1603 URL: https://issues.apache.org/jira/browse/MAHOUT-1603 Project: Mahout Issue Type: Task Affects Versions: 0.9 Reporter: Dmitriy Lyubimov Assignee: Dmitriy Lyubimov Labels: DSL, scala, spark Fix For: 1.0 Tweaks necessary current codebase on top of spark 1.0.x -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093233#comment-14093233 ] ASF GitHub Bot commented on MAHOUT-1603: Github user asfgit closed the pull request at: https://github.com/apache/mahout/pull/40
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093397#comment-14093397 ] Hudson commented on MAHOUT-1603: FAILURE: Integrated in Mahout-Quality #2739 (See [https://builds.apache.org/job/Mahout-Quality/2739/]) MAHOUT-1603: Tweaks for Spark 1.0.x (dlyubimov pferrel) (dlyubimov: rev ee6359f621b508ab7f21df0316941e68c75eb3e5)
* spark/src/test/scala/org/apache/mahout/sparkbindings/blas/ABtSuite.scala
* spark/src/main/scala/org/apache/mahout/drivers/ItemSimilarityDriver.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/test/DistributedSparkSuite.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/blas/BlasSuite.scala
* spark/src/main/scala/org/apache/mahout/drivers/MahoutOptionParser.scala
* CHANGELOG
* spark/src/main/scala/org/apache/mahout/drivers/MahoutDriver.scala
* math-scala/src/test/scala/org/apache/mahout/test/LoggerConfiguration.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AtASuite.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AewBSuite.scala
* pom.xml
* spark/src/main/scala/org/apache/mahout/sparkbindings/SparkEngine.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/test/LoggerConfiguration.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AtSuite.scala
* spark/src/test/scala/org/apache/mahout/drivers/ItemSimilarityDriverSuite.scala
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091146#comment-14091146 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51642800 excellent. seems to be working for me. let me squash it and merge to apache/mahout spark_1.0.x.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091158#comment-14091158 ] Dmitriy Lyubimov commented on MAHOUT-1603: merged to apache/spark-1.0.x branch (here: https://git-wip-us.apache.org/repos/asf?p=mahout.git;a=shortlog;h=refs/heads/spark-1.0.x)
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089599#comment-14089599 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51514939

made changes to use the test context in the driver and tests seem to complete correctly up to the point they try to read the output file, which does contain the correct results.

```
val indicatorLines = mahoutCtx.textFile(OutPath + "/indicator-matrix/part-0")
```

The part file is created in the driver using `rdd.saveAsTextFile(dest)`. It seems like something was getting done before by shutting down the context, maybe I need to close the output file(s) (not sure how to do that since it's created inside the saveAsTextFile call)?

```
java.lang.NullPointerException
    at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1215)
    at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1222)
    at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:456)
    at org.apache.mahout.drivers.ItemSimilarityDriverSuite$$anonfun$4.apply$mcV$sp(ItemSimilarityDriverSuite.scala:303)
```
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089609#comment-14089609 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51515920 @pferrel where are the changes?
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089638#comment-14089638 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51517989 I can't push them back to you so they are here https://github.com/pferrel/mahout/tree/spark-1.0.x
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089645#comment-14089645 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51518645

assuming you are on your local branch named spark-1.0.x with your commit on top of my current head, please execute

push g...@github.com:dlyubimov/mahout spark-1.0.x

this should go thru
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089649#comment-14089649 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51518705 oh ok, never mind
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089713#comment-14089713 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51523570 ok i guess like you said tests are still failing
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089737#comment-14089737 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51524833 Wait, I found the problem. I'm closing the context at the end of the driver. I'll fix it.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089741#comment-14089741 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51525161 ok, past the failure. Now I have to do some test cleanup.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089745#comment-14089745 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51525329

@pferrel: So one problem with those tests is that they are creating 2 spark sessions. 1 session is created by tests and another session is created by driver. Spark is very strict with this:
(1) Spark session creation is non-reentrant (not just thread-unsafe) -- meaning you can only safely have at most 1 session at a time in a jvm.
(2) Spark session itself is reentrant -- meaning multiple threads may invoke asynchronous computational actions on the same session.
This may not always manifest, but in the end it always will (ask me how i know :)

so the problem with those tests is that they probably must not be featuring DistributedSparkSuite but rather just MahoutSuite. Or alternatively you may pass an already existing mahoutContext to the driver code for reuse. But you must ensure the above constraint. If you don't, the effects vary dramatically (from mislabeled rdd partitions in the block manager to lockups and internal race conditions)
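The "pass an already existing context to the driver" option, together with the earlier finding that the driver was closing the context the test still needed, can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeContext` and `SketchDriver` are hypothetical stand-ins so the snippet runs without Spark, and none of these names come from the Mahout or Spark APIs. The idea is that a driver stops a context only when it created that context itself.

```scala
// Hypothetical stand-in for a Spark/Mahout context so the sketch runs
// without Spark on the classpath. Not a real Mahout or Spark class.
final class FakeContext {
  private var stopped = false
  def isStopped: Boolean = stopped
  def stop(): Unit = { stopped = true }
  // Mimics the symptom in the thread: reading through a stopped context fails.
  def textFile(path: String): String = {
    require(!stopped, s"context already stopped, cannot read $path")
    s"contents of $path"
  }
}

// Hypothetical driver: reuses the caller's context when one is supplied,
// otherwise creates its own. Only a context created here is stopped here.
object SketchDriver {
  def run(existing: Option[FakeContext] = None): FakeContext = {
    val ownsContext = existing.isEmpty
    val ctx = existing.getOrElse(new FakeContext)
    try {
      ctx.textFile("input") // placeholder for the driver's real work
      ctx
    } finally {
      if (ownsContext) ctx.stop()
    }
  }
}
```

With this shape a test suite can hand its single context to every driver call and still read the output through the same context afterwards, which keeps at most one live context in the JVM.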
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089760#comment-14089760 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51526214 Btw your handling of temporary directory is quite to the point, you may quite possibly make it part of MahoutSuite. Then i have one other place that may use it. Also see similar code in MahoutTest java class for JUnit -- we could just use that i guess.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089966#comment-14089966 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51541649 OK, pushed it back to you. The tests pass. All drivers and tests share a single context and man are they fast now. Still using DistributedSparkSuite. BTW the tmp dir really needs to be deleted beforeAll and afterEach for convenience, not afterAll, so I changed that.
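The temporary-directory handling mentioned here is not shown in the thread. A minimal sketch of what such a helper might look like, assuming plain `java.io.File` and a hypothetical object name `TmpDirCleanup` (not a Mahout class), is:

```scala
import java.io.File

// Hypothetical helper, not taken from Mahout: removes a test's temporary
// directory and everything beneath it.
object TmpDirCleanup {
  // Deletes children first, then the entry itself. Does nothing harmful if
  // the path does not exist (listFiles returns null, delete returns false).
  def deleteRecursively(f: File): Unit = {
    val children = f.listFiles()
    if (children != null) children.foreach(deleteRecursively)
    f.delete()
    ()
  }
}
```

In a ScalaTest suite this would presumably be called from `beforeAll` and `afterEach`, as the comment describes, so that each test starts from an empty directory while the output of the last test that ran stays inspectable only until the next cleanup.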
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088160#comment-14088160 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51389335 @pferrel perhaps you could look at ItemSimilaritySuite, it doesn't work on spark 1.0 here? I disabled the tests for now since they are failing.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088433#comment-14088433 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51409540

On Wed, Aug 6, 2014 at 3:55 PM, Pat Ferrel notificati...@github.com wrote:

> Sorry was off the internet during a move (curse you giant nameless cable company!)
>
> Anyway these tests are substantially changed in #36 https://github.com/apache/mahout/pull/36 but I haven't been able to get the new build until now, will check and push 36 first.
>
> As to building and tearing down contexts I'm not helping things. For each driver test DistributedSparkSuite in the beforeEach creates a context so I use that to start the test. Then the driver I am using needs to start a context so for every time I call a driver I precede it with the afterEach call to shut down the context. Then call the driver, then call beforeEach to restore the test context. I also had to tell the driver in a special invisible option not to load Mahout jars with a --dontAddMahoutJars. So the context is being built 3 times for every test. but that hasn't changed, it's always been that way.
>
> We could reuse a single context per test but it would require disabling some stuff in the driver along the lines of what I had to do with --dontAddMahoutJars. Since I've already had to do this I don't think it would be a big deal to disable a little more. I'll look at it once 36 is pushed.
>
> Is there any reason to build the context more than once per suite?

Usually, there's not and that's exactly what this branch is moving towards (note: this PR is not against master but to a side branch called `spark-1.0.x`). Also that's what they seem to have done in Spark 1.0 as well. There is sometimes (in my other projects) a need to create a custom context but not in Mahout codebase.

> Seems like if I disable the context things in the driver we could run all tests in a single context, right?

Right. This branch has already switched to doing that. All algebra tests seem to be fine but these tests are failing now. not sure why. seems functional to me.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088443#comment-14088443 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51410370 OK so DistributedSparkSuite moved the create context into the beforeAll?
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088452#comment-14088452 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51410605

> OK so DistributedSparkSuite moved the create context into the beforeAll?

on this branch, yes.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088486#comment-14088486 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51413783 Do you want to push this with the ignores and I'll fix them to use the new DistributedSparkSuite as it gets into the master? BTW any reason we aren't doing Scala 2.11 since we are upping to Java 7 and Spark 1?
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088506#comment-14088506 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51415419

On Wed, Aug 6, 2014 at 4:56 PM, Pat Ferrel notificati...@github.com wrote:

> Do you want to push this with the ignores and I'll fix them to use the new DistributedSparkSuite as it gets into the master?

No i probably don't want to merge it with non-working tests. As usual, i can add you as collaborator in my account (if i have not yet done so) so you could push directly to my source branch of this (so it reflects in the PR instantaneously) or you can PR against my spark 1.0.x branch, or you can just send me a regular git patch with email, whichever works.

> BTW any reason we aren't doing Scala 2.11 since we are upping to Java 7 and Spark 1?

The reason Scala is fixed where it is fixed is because it is paired to Spark's version of Scala. Migration between major versions of Scala is a big deal, for Spark and otherwise. Stuff will not work. Minor version of Scala should be generally portable.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088513#comment-14088513 ] ASF GitHub Bot commented on MAHOUT-1603: Github user avati commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51415479 Scala 2.11 port of Spark is in progress [https://issues.apache.org/jira/browse/SPARK-1812]
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088517#comment-14088517 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51415624 sure. there're tons of stuff in progress but we can only use released artifact as dependencies.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088520#comment-14088520 ] ASF GitHub Bot commented on MAHOUT-1603: Github user avati commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51415697 Only meant FYI (in case someone is planning anything). Of course we have to wait for release.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088526#comment-14088526 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51416019 alternatively, you can also just give me a verbal hint what i need to fix, and i can try to patch to the best of my ability.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088538#comment-14088538 ] ASF GitHub Bot commented on MAHOUT-1603: Github user pferrel commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51417124 Ok, I just pushed the new tests, maybe they work. Don't laugh it could happen. There are likely to be problems with my calling afterEach and beforeEach since their meaning has changed. Fixing this will require mods to the driver too I expect and it'll probably be easier for me to do it. If you are almost ready with this I'll upgrade to Spark 1.0.1 and grab your branch.
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086975#comment-14086975 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51274951

itemsimilarity driver stuff is failing on this.

ItemSimilarityDriverSuite:
113754 [ScalaTest-main-running-ItemSimilarityDriverSuite] DEBUG org.apache.mahout.sparkbindings.blas.AtA$ - Applying slim A'A.
114171 [ScalaTest-main-running-ItemSimilarityDriverSuite] DEBUG org.apache.mahout.sparkbindings.blas.AtB$ - A and B for A'B are not identically partitioned, performing inner join.
- ItemSimilarityDriver, non-full-spec CSV *** FAILED ***
Set(iphone galaxy:1.7260924347106847,iphone:1.7260924347106847,ipad:0.6795961471815897,nexus:0.6795961471815897, surface surface:4.498681156950466, nexus iphone:1.7260924347106847,ipad:0.6795961471815897,surface:0.6795961471815897,nexus:0.6795961471815897,galaxy:1.7260924347106847, ipad galaxy:1.7260924347106847,iphone:1.7260924347106847,ipad:0.6795961471815897,nexus:0.6795961471815897, galaxy galaxy:1.7260924347106847,iphone:1.7260924347106847,ipad:0.6795961471815897,nexus:0.6795961471815897)
did not equal
Set(nexus nexus:0.6795961471815897,iphone:1.7260924347106847,ipad:0.6795961471815897,surface:0.6795961471815897,galaxy:1.7260924347106847, ipad nexus:0.6795961471815897,iphone:1.7260924347106847,ipad:0.6795961471815897,galaxy:1.7260924347106847, surface surface:4.498681156950466, iphone nexus:0.6795961471815897,iphone:1.7260924347106847,ipad:0.6795961471815897,galaxy:1.7260924347106847, galaxy nexus:0.6795961471815897,iphone:1.7260924347106847,ipad:0.6795961471815897,galaxy:1.7260924347106847)
(ItemSimilarityDriverSuite.scala:142)

the rest seems to pass
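Reading the two sets in the failure above, each row appears to hold the same item:score pairs and only the order of the pairs within a row differs. The thread does not say how the test was eventually fixed; assuming the ordering is the whole mismatch, one possible approach is to compare parsed rows instead of raw strings. The helper below is a hypothetical sketch (the name `IndicatorLines` and the whitespace-separated line layout are assumptions, not Mahout code):

```scala
// Hypothetical helper, not part of Mahout: compares indicator-matrix lines
// while ignoring the order of the item:score pairs within each row.
object IndicatorLines {

  // Parses "rowItem item1:score1,item2:score2" into (rowItem, Map(item -> score)).
  // Assumes the row key and the pair list are separated by whitespace.
  def parse(line: String): (String, Map[String, Double]) = {
    val Array(rowKey, pairs) = line.trim.split("\\s+", 2)
    val scores = pairs.split(",").map { pair =>
      val Array(item, score) = pair.split(":", 2)
      item -> score.toDouble
    }.toMap
    rowKey -> scores
  }

  // Two collections of lines are equivalent when they parse to the same rows.
  def sameRows(actual: Iterable[String], expected: Iterable[String]): Boolean =
    actual.map(parse).toMap == expected.map(parse).toMap
}
```

A test could then assert `IndicatorLines.sameRows(actualLines, expectedLines)` rather than comparing the raw string sets. Note that parsing into a `Map` collapses duplicate items within a row, so this only suits output where each item appears once per row.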
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14086989#comment-14086989 ] ASF GitHub Bot commented on MAHOUT-1603: Github user dlyubimov commented on the pull request: https://github.com/apache/mahout/pull/40#issuecomment-51276205 also, tests run much slower although cpu remains unsaturated. Something about setting up and tearing down local spark context ???
[jira] [Commented] (MAHOUT-1603) Tweaks for Spark 1.0.x
[ https://issues.apache.org/jira/browse/MAHOUT-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085509#comment-14085509 ] ASF GitHub Bot commented on MAHOUT-1603: GitHub user dlyubimov opened a pull request: https://github.com/apache/mahout/pull/40

MAHOUT-1603: Tweaks for Spark 1.0.x

For folks who (like me) got tired of waiting for Mahout data frames support and would like to run Spark SQL expressions directly in the Mahout Spark shell. (you can thank me later)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dlyubimov/mahout spark-1.0.x

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/mahout/pull/40.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #40

commit 13e909b58eaa89e212415318655dbe82ef982323
Author: Dmitriy Lyubimov dlyubi...@apache.org
Date: 2014-08-04T22:00:59Z

Initial migration.