[jira] [Commented] (MAHOUT-2020) Maven repo structure compatibility with SBT
    [ https://issues.apache.org/jira/browse/MAHOUT-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190340#comment-16190340 ]

Pat Ferrel commented on MAHOUT-2020:

This may be a non-issue. Trevor said in email:

{quote}The spark is included via maven classifier. The sbt line should be:
libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"
{quote}

> Maven repo structure compatibility with SBT
> -------------------------------------------
>
>                 Key: MAHOUT-2020
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2020
>             Project: Mahout
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.13.1
>         Environment: Creating a project from Maven-built Mahout using sbt.
> Made critical since it seems to block using Mahout with sbt. At least I have
> found no way to do it.
>            Reporter: Pat Ferrel
>            Assignee: Trevor Grant
>            Priority: Critical
>             Fix For: 0.13.1
>
>
> The Maven build should produce:
> org/apache/mahout/mahout-spark-2.1/0.13.1-SNAPSHOT/mahout-spark-2.1_2.11-0.13.1-SNAPSHOT.jar
> Substitute the Spark version for -2.1 (for example, -1.6).
> The build.sbt `libraryDependencies` line will then be:
> `"org.apache.mahout" %% "mahout-spark-2.1" % "0.13.1-SNAPSHOT"`
> sbt parses this to yield the path:
> org/apache/mahout/mahout-spark-2.1/0.13.1-SNAPSHOT/mahout-spark-2.1_2.11-0.13.1-SNAPSHOT.jar
> The outcome of `mvn clean install` is currently something like:
> org/apache/mahout/mahout-spark/0.13.1-SNAPSHOT/mahout-spark-0.13.1-SNAPSHOT-spark_2.1.jar
> This has no effect on the package structure, only on artifact naming and the
> Maven repo structure.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
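For reference, the path reported for `mvn clean install` in the description above is consistent with the standard Maven repository layout: group directories, then the artifact directory, then the version directory, then a file named artifactId-version[-classifier].jar. The Java sketch below reproduces that mapping. It is only an illustration: the class and method names (MavenPathSketch, jarPath) are hypothetical and are not part of Maven, sbt, or Mahout, and it ignores details such as timestamped snapshot names in remote repositories.

```java
// Hypothetical helper, for illustration only; not a Maven or sbt API.
final class MavenPathSketch {
    private MavenPathSketch() {}

    /**
     * Builds the repository-relative jar path for the given coordinates,
     * following the standard Maven layout. A null or empty classifier is omitted.
     */
    static String jarPath(String groupId, String artifactId, String version, String classifier) {
        String dir = groupId.replace('.', '/') + "/" + artifactId + "/" + version;
        String suffix = (classifier == null || classifier.isEmpty()) ? "" : "-" + classifier;
        return dir + "/" + artifactId + "-" + version + suffix + ".jar";
    }

    public static void main(String[] args) {
        // Coordinates taken from the `mvn clean install` outcome quoted in the issue.
        System.out.println(jarPath("org.apache.mahout", "mahout-spark", "0.13.1-SNAPSHOT", "spark_2.1"));
        // prints org/apache/mahout/mahout-spark/0.13.1-SNAPSHOT/mahout-spark-0.13.1-SNAPSHOT-spark_2.1.jar
    }
}
```

Under this layout the Spark version lives in the classifier, not in the artifact name, which is why the sbt line quoted from Trevor uses `classifier "spark_2.1"` instead of encoding the Spark version in the artifact ID.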
[jira] [Commented] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized
    [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190338#comment-16190338 ]

Pat Ferrel commented on MAHOUT-2019:

This may be a non-issue. Trevor said in email:

{quote}The spark is included via maven classifier. The sbt line should be:
libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"
{quote}

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so
> can be optimized
> -------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But
> SRM inherits the implementation of methods like "assign" from AbstractMatrix,
> which uses nested for loops to traverse rows. For multiplying 2 matrices that
> are extremely sparse, the kind of data you see in collaborative filtering,
> this is extremely wasteful of execution time. Better to use a sparse vector's
> iterateNonZero Iterator for some function types.
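The cost difference described in the issue can be sketched in a few lines of plain Java. This is not Mahout code: SparseRowSketch and its methods are hypothetical stand-ins for a sparse row vector, and the "function types" that qualify are assumed here to be those where f(x, 0) == x (addition, for example), so that positions where the other operand is zero can be skipped safely.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in for a sparse row vector; not a Mahout class.
final class SparseRowSketch {
    final int size;
    final TreeMap<Integer, Double> nonZeros = new TreeMap<>();

    SparseRowSketch(int size) { this.size = size; }

    double get(int i) { return nonZeros.getOrDefault(i, 0.0); }

    void set(int i, double v) {
        // Store only non-zero values so the map stays sparse.
        if (v == 0.0) nonZeros.remove(i); else nonZeros.put(i, v);
    }

    /** Dense traversal: applies f at every index. Returns the number of positions visited. */
    int assignDense(SparseRowSketch other, DoubleBinaryOperator f) {
        int visited = 0;
        for (int i = 0; i < size; i++) {
            set(i, f.applyAsDouble(get(i), other.get(i)));
            visited++;
        }
        return visited;
    }

    /**
     * Sparse traversal: applies f only where `other` is non-zero.
     * Correct only when f(x, 0) == x, and assumes `other` is a different object.
     */
    int assignNonZero(SparseRowSketch other, DoubleBinaryOperator f) {
        int visited = 0;
        for (Map.Entry<Integer, Double> e : other.nonZeros.entrySet()) {
            int i = e.getKey();
            set(i, f.applyAsDouble(get(i), e.getValue()));
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        SparseRowSketch a = new SparseRowSketch(1000);
        a.set(3, 1.0);
        a.set(10, 2.0);
        SparseRowSketch b = new SparseRowSketch(1000);
        b.set(10, 5.0);
        b.set(500, 7.0);
        // Addition satisfies f(x, 0) == x, so only b's two non-zeros are visited.
        System.out.println(a.assignNonZero(b, Double::sum)); // prints 2
    }
}
```

With the same inputs, assignDense visits all 1000 positions to produce the same result, which is the waste the issue describes for very sparse rows. A function that does not satisfy f(x, 0) == x (for instance one that adds a constant) would still need the dense traversal.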
[jira] [Updated] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized
     [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated MAHOUT-2019:
-------------------------------
    Priority: Major  (was: Minor)

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so
> can be optimized
> -------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But
> SRM inherits the implementation of methods like "assign" from AbstractMatrix,
> which uses nested for loops to traverse rows. For multiplying 2 matrices that
> are extremely sparse, the kind of data you see in collaborative filtering,
> this is extremely wasteful of execution time. Better to use a sparse vector's
> iterateNonZero Iterator for some function types.
[jira] [Updated] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized
     [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated MAHOUT-2019:
-------------------------------
    Priority: Minor  (was: Major)

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so
> can be optimized
> -------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>            Priority: Minor
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But
> SRM inherits the implementation of methods like "assign" from AbstractMatrix,
> which uses nested for loops to traverse rows. For multiplying 2 matrices that
> are extremely sparse, the kind of data you see in collaborative filtering,
> this is extremely wasteful of execution time. Better to use a sparse vector's
> iterateNonZero Iterator for some function types.