[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14002649#comment-14002649 ]

Dmitriy Lyubimov commented on MAHOUT-1544:
--
I would suggest closing it, given the uncertainty about the progress.

make Mahout DSL shell depend dynamically on Spark
-
Key: MAHOUT-1544
URL: https://issues.apache.org/jira/browse/MAHOUT-1544
Project: Mahout
Issue Type: Improvement
Reporter: Anand Avati
Fix For: 1.0
Attachments: 0001-spark-shell-rename-to-shell.patch, 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch, 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch, 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch

Today, Mahout's Scala shell depends on Spark. Create a cleaner separation between the shell and Spark: for example, the in-core scalabindings and operators do not need Spark, so make Spark a runtime add-on to the shell. Similarly, in the future, new distributed backend engines can transparently (dynamically) be made available through the DSL shell. The new shell works, looks, and feels exactly like the shell before, but has a cleaner modular architecture.

--
This message was sent by Atlassian JIRA (v6.2#6252)
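The "runtime add-on" idea in the description can be sketched roughly as follows. This is a minimal illustration, not Mahout's actual API: the trait and class names (DistributedEngine, SparkEngineStub, ShellBootstrap) are hypothetical stand-ins; the point is only that the shell can code against a small engine interface and bind a concrete engine by class name at runtime, so the shell module carries no compile-time Spark dependency.

```scala
// Hypothetical sketch of a dynamically bound backend engine.
trait DistributedEngine {
  def name: String
}

// Stand-in for a Spark-backed engine that would live in a separate module,
// shipped as a jar the shell merely finds on its classpath at runtime.
class SparkEngineStub extends DistributedEngine {
  def name: String = "spark"
}

object ShellBootstrap {
  // In a real shell the class name would come from configuration; if the
  // engine jar is absent, loading fails and the shell can still offer the
  // in-core scalabindings and operators, which need no Spark.
  def loadEngine(className: String): DistributedEngine =
    Class.forName(className)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[DistributedEngine]
}
```

A future backend would then just be another jar providing an implementation of the same trait, discovered the same way.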
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001029#comment-14001029 ]

Sebastian Schelter commented on MAHOUT-1544:
[~avati] What's the status here?
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001041#comment-14001041 ]

Anand Avati commented on MAHOUT-1544:
-
[~ssc] - I investigated how Spark achieves the task of exposing user-defined data types and closures to the workers. It turns out Spark uses a heavily refactored and modified version of the Scala 2.10 REPL. Those modifications are now part of Scala 2.11 (in the form of two options: -Yrepl-class-based and -Yrepl-outdir). So if Mahout were to use Scala 2.11 (which in turn requires Spark to be available for Scala 2.11), the proposal in this JIRA could be achieved much more simply. (It also turns out that Spark's REPL itself can be made much simpler when moved to Scala 2.11, for the same reasons.) Right now I'm trying to get Spark's dependencies built for Scala 2.11. In any case, this proposal can be achieved at the earliest with Spark 1.1 (the Spark devs are considering Scala 2.11 support in version 1.1). To do it any earlier, we would need to inherit a refactored and modified version of the Scala REPL, just as Spark does - just not worth the effort. What's the right process here now? Close this JIRA, or leave it open with a decreased priority?
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13990043#comment-13990043 ]

Dmitriy Lyubimov commented on MAHOUT-1544:
--
Did you try to run the simple.mscala examples in it in _standalone_ Spark mode? A local master will not do. I think you will find that the workers have trouble deserializing the tasks. I've read the REPL code enough to be fairly confident it is not that simple: most of the complexity stems from the necessity to expose user-defined data types and closures to remote classloaders. I am not sure your code takes care of all that.
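The deserialization trouble described above can be reproduced without a cluster. The stock Scala REPL wraps each input line in a generated enclosing class, so a closure that merely reads a shell-defined value captures that (non-serializable) wrapper. The sketch below models this with a hypothetical ReplLineWrapper class, a stand-in for what the REPL generates, not actual REPL output:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for a REPL-generated line wrapper; deliberately NOT Serializable,
// like the wrapper classes a stock Scala 2.10 REPL produces.
class ReplLineWrapper {
  val data: Int = 42
  // Referencing the field captures the enclosing instance, so serializing
  // the closure drags the whole non-serializable wrapper along with it.
  val closure: () => Int = () => data
}

object SerializationCheck {
  // Returns true if obj survives Java serialization, which is what Spark
  // uses by default to ship tasks to workers.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      val oos = new ObjectOutputStream(new ByteArrayOutputStream())
      oos.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

A local master hides the problem because tasks never cross a JVM boundary; standalone mode serializes them, which is where the failure shows up.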
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13990044#comment-13990044 ]

Dmitriy Lyubimov commented on MAHOUT-1544:
--
Also, we don't want IDEA-specific files to be part of the patch. Thanks.
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13990121#comment-13990121 ]

Anand Avati commented on MAHOUT-1544:
-
Ah, I have been testing with just the matrix-multiply operators, and they work fine. The issue shows up with custom code passed to mapBlock(). Investigating more. The IDE-specific files probably came in because 'diff' represents the rename of the directory from 'spark-shell' to 'spark' as a deletion and creation of all the files inside. My code changes are made only through emacs.
[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark
[ https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13990138#comment-13990138 ]

Dmitriy Lyubimov commented on MAHOUT-1544:
--
Exactly. Or try to define something like class Complex(x, i) and then pass it back and forth. Same effect.
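What fails with a shell-defined Complex is not only serializability: even a Serializable user type cannot be deserialized by a worker JVM whose classloader has never seen the REPL-generated class bytes. A rough simulation of that worker-side failure, with hypothetical names (RemoteWorkerSim) and the "worker" modeled as a classloader that cannot see application classes:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
  ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// A user-defined type like the one in the comment; case classes are
// Serializable, yet workers still need the class BYTES to deserialize one.
case class Complex(re: Double, im: Double)

object RemoteWorkerSim {
  // "Driver" side: serialize the object as a task payload would be.
  def toBytes(obj: AnyRef): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(obj)
    oos.close()
    bos.toByteArray
  }

  // "Worker" side: resolve classes only through the bootstrap classloader,
  // simulating a JVM whose classpath lacks the REPL-generated classes.
  def readOnBareWorker(bytes: Array[Byte]): AnyRef = {
    val bareLoader = new ClassLoader(null) {} // parent = bootstrap only
    val ois = new ObjectInputStream(new ByteArrayInputStream(bytes)) {
      override def resolveClass(desc: ObjectStreamClass): Class[_] =
        Class.forName(desc.getName, false, bareLoader)
    }
    ois.readObject() // throws ClassNotFoundException for Complex
  }
}
```

This is the gap Spark's modified REPL closes: it writes each compiled line's class files to a known output directory and makes them fetchable by executors, the same hook the 2.11 -Yrepl-outdir option exposes.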