[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-19 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002649#comment-14002649
 ] 

Dmitriy Lyubimov commented on MAHOUT-1544:
--

I would suggest closing it, given the uncertainty about progress.

 make Mahout DSL shell depend dynamically on Spark
 -

 Key: MAHOUT-1544
 URL: https://issues.apache.org/jira/browse/MAHOUT-1544
 Project: Mahout
  Issue Type: Improvement
Reporter: Anand Avati
 Fix For: 1.0

 Attachments: 0001-spark-shell-rename-to-shell.patch, 
 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch, 
 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch, 
 0002-shell-make-dependency-on-Spark-optional-and-dynamic.patch


 Today, Mahout's Scala shell depends on Spark.
 Create a cleaner separation between the shell and Spark. For example, the 
 in-core scalabindings and operators do not need Spark, so make Spark a 
 runtime add-on to the shell. Similarly, new distributed backend engines can 
 in the future be made available transparently (dynamically) through the DSL 
 shell.
 The new shell works, looks, and feels exactly like the shell before, but has 
 a cleaner, more modular architecture.
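
The runtime add-on described above can be sketched with plain JVM reflection: probe the classpath for an engine's entry point at startup and fall back to in-core operation if it is absent. The class name `org.example.SparkBackend` below is invented for illustration; this is not Mahout's actual mechanism.

```scala
// Hedged sketch of dynamic backend discovery: instead of a compile-time
// dependency on Spark, the shell asks the classpath at startup.
object BackendLoader {
  // Try to load an engine's entry-point class; None means "backend absent".
  def tryLoad(className: String): Option[Class[_]] =
    try Some(Class.forName(className))
    catch { case _: ClassNotFoundException => None }
}

object Demo extends App {
  // Stand-in for an installed backend: a class that certainly exists.
  println(BackendLoader.tryLoad("java.lang.String").isDefined)      // true
  // Stand-in for a missing Spark backend: load fails, shell stays in-core.
  println(BackendLoader.tryLoad("org.example.SparkBackend").isDefined) // false
}
```

With this shape, adding a new distributed backend is a matter of dropping its jar on the shell's classpath rather than rebuilding the shell.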



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-18 Thread Sebastian Schelter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14001029#comment-14001029
 ] 

Sebastian Schelter commented on MAHOUT-1544:


[~avati] What's the status here?



[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-18 Thread Anand Avati (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14001041#comment-14001041
 ] 

Anand Avati commented on MAHOUT-1544:
-

[~ssc] - I investigated how Spark achieves the task of exposing user-defined 
data types and closures to remote workers. It turns out Spark uses a heavily 
refactored and modified version of the Scala 2.10 REPL. Those modifications 
are now part of Scala 2.11 (in the form of two options: -Yrepl-class-based and 
-Yrepl-outdir). So if Mahout were to use Scala 2.11 (which in turn depends on 
Spark being available for Scala 2.11), then the proposal in this JIRA could be 
achieved much more simply. (It also turns out that Spark's REPL itself can be 
made much simpler when moved to Scala 2.11, for the same reasons.) Right now 
I'm trying to get Spark's dependencies built for Scala 2.11.

So in any case, this proposal can be achieved at the earliest with Spark 1.1 
(the Spark devs are considering Scala 2.11 support in version 1.1). To do it 
any earlier, we would need to inherit a refactored and modified version of the 
Scala REPL, just as Spark does - just not worth the effort.

What's the right process here now? Close this JIRA, or leave it open with a 
decreased priority?



[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-05 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990043#comment-13990043
 ] 

Dmitriy Lyubimov commented on MAHOUT-1544:
--

Did you try to run the simple.mscala examples in _standalone_ Spark mode? A 
local master will not do.

I think you will find that the workers have trouble deserializing the tasks. 
I've read the REPL code enough to be fairly confident it is not that simple. 
Most of the complexity stems from the need to expose user-defined data types 
and closures to remote classloaders. I am not sure your code takes care of all 
that.



[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-05 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990044#comment-13990044
 ] 

Dmitriy Lyubimov commented on MAHOUT-1544:
--

Also, we don't want IDEA-specific files to be part of the patch.

Thanks.



[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-05 Thread Anand Avati (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990121#comment-13990121
 ] 

Anand Avati commented on MAHOUT-1544:
-

Ah, I have been testing with just the matrix-multiply operators, and they work 
fine. The issue shows up with custom code passed to mapBlock(). Investigating 
further.

The IDE-specific files probably crept in because 'diff' represents the rename 
of the 'spark-shell' directory to 'spark' as the deletion and creation of 
every file inside it. My code changes were made only through Emacs.



[jira] [Commented] (MAHOUT-1544) make Mahout DSL shell depend dynamically on Spark

2014-05-05 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990138#comment-13990138
 ] 

Dmitriy Lyubimov commented on MAHOUT-1544:
--

Exactly. Or try to define something like class Complex(x, i) and then pass it 
back and forth. Same effect.




