[jira] [Commented] (MAHOUT-2020) Maven repo structure compatibility with SBT

2017-10-03 Thread Pat Ferrel (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190340#comment-16190340
 ] 

Pat Ferrel commented on MAHOUT-2020:


This may be a non-issue. Trevor said in email:

{quote}The spark is included via maven classifier-

the sbt line should be

libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"
{quote}



> Maven repo structure compatibility with SBT
> ---
>
> Key: MAHOUT-2020
> URL: https://issues.apache.org/jira/browse/MAHOUT-2020
> Project: Mahout
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.13.1
> Environment: Creating a project from maven built Mahout using sbt. 
> Made critical since it seems to block using Mahout with sbt. At least I have 
> found no way to do it.
>Reporter: Pat Ferrel
>Assignee: Trevor Grant
>Priority: Critical
> Fix For: 0.13.1
>
>
> The maven repo should build:
> org/apache/mahout/mahout-spark-2.1/0.13.1-SNAPSHOT/mahout-spark-2.1_2.11-0.13.1-SNAPSHOT.jar
> Substitute the Spark version for -2.1, e.g. -1.6.
> The build.sbt `libraryDependencies` line will then be:
> `"org.apache.mahout" %% "mahout-spark-2.1" % "0.13.1-SNAPSHOT"`
> sbt parses this to yield the path:
> org/apache/mahout/mahout-spark-2.1/0.13.1-SNAPSHOT/mahout-spark-2.1_2.11-0.13.1-SNAPSHOT.jar
> The outcome of `mvn clean install` currently is something like:
> org/apache/mahout/mahout-spark/0.13.1-SNAPSHOT/mahout-spark-0.13.1-SNAPSHOT-spark_2.1.jar
> This has no effect on the package structure, only artifact naming and maven 
> repo structure.
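
To make the two naming schemes concrete, here is a minimal sketch of how Maven coordinates map to a jar path under the standard repository layout. The class and method names (RepoPaths, jarPath, crossVersioned) are hypothetical and exist only for illustration; they are not part of Mahout, Maven, or sbt. One caveat: in the standard layout the directory name is the full artifact ID, so a `%%` dependency would resolve under `mahout-spark-2.1_2.11/`, which differs slightly from the directory shown in the description above.

```java
/** Hypothetical helper for illustration; not part of Mahout, Maven, or sbt. */
final class RepoPaths {

    /**
     * Standard Maven repository layout:
     * group/as/dirs/artifactId/version/artifactId-version[-classifier].jar
     * Pass null or "" as classifier when there is none.
     */
    static String jarPath(String groupId, String artifactId,
                          String version, String classifier) {
        boolean hasClassifier = classifier != null && !classifier.isEmpty();
        String file = artifactId + "-" + version
                + (hasClassifier ? "-" + classifier : "")
                + ".jar";
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version + "/" + file;
    }

    /** Mimics sbt's %% operator, which appends "_<scalaBinaryVersion>" to the artifact ID. */
    static String crossVersioned(String artifactId, String scalaBinaryVersion) {
        return artifactId + "_" + scalaBinaryVersion;
    }

    public static void main(String[] args) {
        // Classifier-based naming, as currently produced by `mvn clean install`.
        System.out.println(jarPath(
                "org.apache.mahout", "mahout-spark", "0.13.1-SNAPSHOT", "spark_2.1"));

        // Naming that a `%%` dependency on "mahout-spark-2.1" would look for.
        System.out.println(jarPath(
                "org.apache.mahout", crossVersioned("mahout-spark-2.1", "2.11"),
                "0.13.1-SNAPSHOT", null));
    }
}
```

The first call reproduces the path that `mvn clean install` currently writes. The second shows where the Spark version would have to move (into the artifact ID) for a plain `%%` dependency to resolve without a classifier.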



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized

2017-10-03 Thread Pat Ferrel (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190338#comment-16190338
 ] 

Pat Ferrel commented on MAHOUT-2019:


This may be a non-issue: 

Trevor said in email:

{quote}The spark is included via maven classifier-

the sbt line should be

libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"


{quote}

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so 
> can be optimized
> ---
>
> Key: MAHOUT-2019
> URL: https://issues.apache.org/jira/browse/MAHOUT-2019
> Project: Mahout
>  Issue Type: Bug
>  Components: Math
>Affects Versions: 0.13.0
>Reporter: Pat Ferrel
>Assignee: Pat Ferrel
> Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But 
> SRM inherits the implementation of methods like "assign" from AbstractMatrix, 
> which uses nested for loops to traverse rows. For multiplying 2 matrices that 
> are extremely sparse, the kind of data you see in collaborative filtering, 
> this is extremely wasteful of execution time. Better to use a sparse vector's 
> iterateNonZero Iterator for some function types.
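
The cost difference can be illustrated with a small self-contained sketch. It does not use Mahout's classes: SparseRow, assignDense, and assignNonZero are hypothetical stand-ins, and the `visits` counter exists only to compare the two traversals. The sketch also assumes that "some function types" means functions that map zero to zero, since only those can safely skip the unstored entries.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-in for a sparse row; this is not Mahout's Vector API. */
final class SparseRow {
    final int size;
    final TreeMap<Integer, Double> nonZeros = new TreeMap<>();
    int visits = 0; // number of elements touched, for comparison only

    SparseRow(int size) { this.size = size; }

    double get(int i) { return nonZeros.getOrDefault(i, 0.0); }

    void set(int i, double v) {
        if (v == 0.0) nonZeros.remove(i); else nonZeros.put(i, v);
    }

    /** Dense traversal: applies f at every index, stored or not. */
    void assignDense(DoubleUnaryOperator f) {
        for (int i = 0; i < size; i++) {
            visits++;
            set(i, f.applyAsDouble(get(i)));
        }
    }

    /** Sparse traversal: applies f to stored entries only; needs f(0) == 0. */
    void assignNonZero(DoubleUnaryOperator f) {
        if (f.applyAsDouble(0.0) != 0.0) {
            assignDense(f); // zeros would change, so every index must be visited
            return;
        }
        Iterator<Map.Entry<Integer, Double>> it = nonZeros.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Double> e = it.next();
            visits++;
            double r = f.applyAsDouble(e.getValue());
            if (r == 0.0) it.remove(); else e.setValue(r);
        }
    }

    public static void main(String[] args) {
        SparseRow dense = new SparseRow(1_000_000);
        SparseRow sparse = new SparseRow(1_000_000);
        for (int i : new int[] {7, 4_242, 999_999}) {
            dense.set(i, 1.0);
            sparse.set(i, 1.0);
        }
        dense.assignDense(x -> x * 2);
        sparse.assignNonZero(x -> x * 2);
        System.out.println("dense visits:  " + dense.visits);
        System.out.println("sparse visits: " + sparse.visits);
    }
}
```

For a function such as `x -> x + 1`, which does not preserve zero, the sketch falls back to the dense traversal, which is why the optimization only applies to some functions.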





[jira] [Updated] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized

2017-10-03 Thread Pat Ferrel (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pat Ferrel updated MAHOUT-2019:
---
Priority: Major  (was: Minor)

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so 
> can be optimized
> ---
>
> Key: MAHOUT-2019
> URL: https://issues.apache.org/jira/browse/MAHOUT-2019
> Project: Mahout
>  Issue Type: Bug
>  Components: Math
>Affects Versions: 0.13.0
>Reporter: Pat Ferrel
>Assignee: Pat Ferrel
> Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But 
> SRM inherits the implementation of methods like "assign" from AbstractMatrix, 
> which uses nested for loops to traverse rows. For multiplying 2 matrices that 
> are extremely sparse, the kind of data you see in collaborative filtering, 
> this is extremely wasteful of execution time. Better to use a sparse vector's 
> iterateNonZero Iterator for some function types.





[jira] [Updated] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized

2017-10-03 Thread Pat Ferrel (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pat Ferrel updated MAHOUT-2019:
---
Priority: Minor  (was: Major)

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so 
> can be optimized
> ---
>
> Key: MAHOUT-2019
> URL: https://issues.apache.org/jira/browse/MAHOUT-2019
> Project: Mahout
>  Issue Type: Bug
>  Components: Math
>Affects Versions: 0.13.0
>Reporter: Pat Ferrel
>Assignee: Pat Ferrel
>Priority: Minor
> Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But 
> SRM inherits the implementation of methods like "assign" from AbstractMatrix, 
> which uses nested for loops to traverse rows. For multiplying 2 matrices that 
> are extremely sparse, the kind of data you see in collaborative filtering, 
> this is extremely wasteful of execution time. Better to use a sparse vector's 
> iterateNonZero Iterator for some function types.


