Cool. I was just going through it to get familiar with the DSL (and really Scala in general at this point) and the read/write traits you were talking about. Just looking at the code, really; I don't have any need to build it right now. Wanted to make sure I wasn't totally off...
Thanks

> Subject: Re: [jira] [Created] (MAHOUT-1568) Build an I/O model that can
> replace sequence files for import/export
> From: [email protected]
> Date: Sun, 1 Jun 2014 17:57:47 -0700
> To: [email protected]
>
> Sorry, wasn't expecting someone to build it. Don't know if the packaging is
> right yet, and it's about a month behind the trunk.
>
> You pull the repo at the same level as the major pieces like math-scala,
> into MAHOUT_HOME, and apply the MAHOUT-1464 patch, but all you need from
> the patches is org.apache.mahout.cf.CooccurrenceAnalysis. Your version
> should work. Then build the snapshot Mahout, go into harness, and run
> 'mvn install -DskipTests'. Since the packaging may not be right, I haven't
> integrated it with the Mahout poms.
>
> I'll merge it with the trunk tomorrow.
>
> On Jun 1, 2014, at 1:57 PM, Andrew Palumbo <[email protected]> wrote:
>
> Hi Pat,
>
> Does Harness compile against the Mahout trunk + MAHOUT-1464.patch
> (cooccurrence)? I have a patched-up branch of the Mahout trunk with
> basically a gutted MAHOUT-1464.patch: just something that defines
> org.apache.mahout.cf.CooccurrenceAnalysis and compiles (so I wouldn't
> be able to run Harness right now anyway).
> I think the changes from MAHOUT-1529 are causing my problems; the errors
> are from the DrmLike stuff:
>
> [ERROR] /home/andy/sandbox/harness/src/main/scala/org/apache/mahout/drivers/IndexedDataset.scala:40: error: not found: type DrmLike
> [INFO] case class IndexedDataset(matrix: DrmLike[Int], rowIDs: BiMap[String,Int], columnIDs: BiMap[String,Int]) {
> [INFO]                                   ^
> [ERROR] /home/andy/sandbox/harness/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala:105: error: not found: type DrmRdd
> [INFO] }).asInstanceOf[DrmRdd[Int]]
> [INFO]                 ^
> [ERROR] /home/andy/sandbox/harness/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala:107: error: not found: type CheckpointedDrmBase
> [INFO] val drmInteractions = new CheckpointedDrmBase[Int](indexedInteractions, numRows, numColumns)
> [INFO]                           ^
> [ERROR] /home/andy/sandbox/harness/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala:145: error: not found: type DrmLike
> [INFO] val matrix: DrmLike[Int] = indexedDataset.matrix
>
> Thanks,
>
> Andy
>
> > Date: Sun, 1 Jun 2014 17:27:01 +0000
> > From: [email protected]
> > To: [email protected]
> > Subject: [jira] [Created] (MAHOUT-1568) Build an I/O model that can
> > replace sequence files for import/export
> >
> > Pat Ferrel created MAHOUT-1568:
> > ----------------------------------
> >
> > Summary: Build an I/O model that can replace sequence files for
> > import/export
> > Key: MAHOUT-1568
> > URL: https://issues.apache.org/jira/browse/MAHOUT-1568
> > Project: Mahout
> > Issue Type: New Feature
> > Components: CLI
> > Environment: Scala, Spark
> > Reporter: Pat Ferrel
> > Assignee: Pat Ferrel
> >
> > Implement mechanisms to read and write data from/to flexible stores.
> > These will support tuple streams and DRMs, with extensions that allow
> > keeping user-defined values for IDs. The mechanism can in some sense
> > replace sequence files for import/export and will make the operation
> > much easier for the user, in many cases directly consuming their input
> > files.
> >
> > Start with text-delimited files for input/output in the Spark version
> > of ItemSimilarity.
> >
> > A proposal is running with ItemSimilarity on Spark and is documented
> > on the github wiki here: https://github.com/pferrel/harness/wiki
> >
> > Comments are appreciated.
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.2#6252)
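In case it helps anyone hitting the same "not found: type DrmLike" errors above: a guess at an import fix, assuming MAHOUT-1529 moved the logical DRM abstractions out of the Spark bindings into the math-scala module. The package names below are my assumption from the refactored trunk layout, not verified against the harness branch:

```scala
// Assumed post-MAHOUT-1529 locations -- not verified against this branch.
// DrmLike and the other logical DRM types appear to live in math-scala now:
import org.apache.mahout.math.drm._
// while Spark-specific aliases such as DrmRdd stay in the Spark bindings:
import org.apache.mahout.sparkbindings._

// CheckpointedDrmBase may have been renamed or removed in that refactor,
// so that particular error might need a code change, not just an import.
```

If the packages did move, adding these to IndexedDataset.scala and ReaderWriter.scala would be the first thing I'd try before touching any code.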
