[jira] [Commented] (MAHOUT-1604) Create a RowSimilarity for Spark

Dmitriy Lyubimov (JIRA) Wed, 17 Dec 2014 08:36:39 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250086#comment-14250086
 ]


Dmitriy Lyubimov commented on MAHOUT-1604:
------------------------------------------

The relevant sequential history of this file follows. History doesn't lie. My 
commit removed the job jar assembly plugin from spark/pom.xml, and your commit 
brought it back right after, with the comment "do not remove albeit useless" 
added. 

If you don't recognize the change, then it is a failure to correctly merge in 
the head, which resulted in overwritten changes. 
In any build system that is a "mortal sin" and is subject to immediate 
revert. It means there are potentially more overwriting changes in this commit. 

In light of this, the proposed remedy is as follows: 
- a revert is issued for commit 149c98592
- the committer of PR #47 reworks the commit as follows: 
-- pull apache/master HEAD into the PR branch and resolve the merge 
-- run git diff apache/master
--- if there are unrecognized changes in that diff, the committer works on 
the PR until it contains no unrecognized/master-overwriting changes. 
--- we do one more round of PR review after that. 
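
The PR-side steps above can be sketched as a small shell helper. This is only 
an illustration under stated assumptions: the function name audit_pr_branch 
is hypothetical, and it assumes the Apache repository is configured as a git 
remote named "apache" with a "master" branch. The revert itself is done 
separately on master and appears here only as a comment.

```shell
# Step 1 (done on apache/master by a committer, not by this helper):
#   git revert 149c98592

# Hypothetical helper for step 2; run it from inside the PR branch checkout.
#   $1 = remote name      (assumed default: apache)
#   $2 = upstream branch  (assumed default: master)
audit_pr_branch() {
    audit_remote="${1:-apache}"
    audit_branch="${2:-master}"

    # Bring the remote-tracking ref up to date with upstream HEAD.
    git fetch "$audit_remote" || return 1

    # Merge upstream HEAD into the PR branch. If this stops on conflicts,
    # resolve them by hand, commit, and run the helper again.
    git merge --no-edit "$audit_remote/$audit_branch" || return 1

    # Print everything the PR branch changes relative to upstream.
    # Any hunk the PR author does not recognize is a candidate
    # master-overwriting change and should be fixed before re-review.
    git diff "$audit_remote/$audit_branch"
}
```

Calling audit_pr_branch apache master on the PR branch prints the full diff 
against upstream; an empty diff means the branch adds nothing on top of 
master. 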

The relevant sequential file history with diffs follows.

{panel:title=log}

dmitriy@Intel-KUBU:~/projects/github/mahout-commits$ git log 149c98592fe -p -2 -- spark/pom.xml
commit 149c98592fe447c98dfb5afc67b5809725cc3056
Author: pferrel <[email protected]>
Date:   Thu Aug 28 10:45:13 2014 -0700

    MAHOUT-1604 add a CLI and associated code for spark-rowsimilarity, also cleans up some things in MAHOUT-1568 and MAHOUT-1569, closes apache/mahout#47

diff --git a/spark/pom.xml b/spark/pom.xml
index 71d3944..2f79377 100644
--- a/spark/pom.xml
+++ b/spark/pom.xml
@@ -157,6 +157,27 @@
         </executions>
       </plugin>
 
+      <!-- create job jar to include CLI driver deps-->
+      <!-- leave this in even though there are no hadoop mapreduce jobs in this module -->
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-assembly-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>job</id>
+            <phase>package</phase>
+            <goals>
+              <goal>single</goal>
+            </goals>
+            <configuration>
+              <descriptors>
+                <descriptor>src/main/assembly/job.xml</descriptor>
+              </descriptors>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
     </plugins>
   </build>
 

commit c6ee8cbcdb6ae205624b908bc16ae462515c98e6
Author: Dmitriy Lyubimov <[email protected]>
Date:   Fri Aug 15 16:31:10 2014 -0700

    (NOJIRA) disabling -job.jar assembly in spark module (we don't use it, do we?)

diff --git a/spark/pom.xml b/spark/pom.xml
index 0946cee..71d3944 100644
--- a/spark/pom.xml
+++ b/spark/pom.xml
@@ -83,27 +83,6 @@
         </executions>
       </plugin>
 
-      <!-- create core job dependencies jar -->
-
-      <plugin>
-        <groupId>org.apache.maven.plugins</groupId>
-        <artifactId>maven-assembly-plugin</artifactId>
-          <executions>
-            <execution>
-              <id>job</id>
-              <phase>package</phase>
-              <goals>
-                <goal>single</goal>
-              </goals>
-              <configuration>
-                <descriptors>
-                  <descriptor>src/main/assembly/job.xml</descriptor>
-                </descriptors>
-              </configuration>
-            </execution>
-          </executions>
-      </plugin>
-
       <!-- create test jar so other modules can reuse the math test utility classes. -->
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
dmitriy@Intel-KUBU:~/projects/github/mahout-commits$ 
{panel}


> Create a RowSimilarity for Spark
> --------------------------------
>
>                 Key: MAHOUT-1604
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1604
>             Project: Mahout
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 1.0
>         Environment: Spark
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>
> Using CooccurrenceAnalysis.cooccurrence create a driver that reads a text DRM 
> or two and produces LLR similarity/cross-similarity matrices.
> This will produce the same results as ItemSimilarity but take a Drm as input 
> instead of individual cells.
> The first version will only support LLR, other similarity measures will need 
> to be in separate Jiras



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
