Pat Ferrel created MAHOUT-1568:
----------------------------------
Summary: Build an I/O model that can replace sequence files for
import/export
Key: MAHOUT-1568
URL: https://issues.apache.org/jira/browse/MAHOUT-1568
Project: Mahout
Issue Type: New Feature
Components: CLI
Environment: Scala, Spark
Reporter: Pat Ferrel
Assignee: Pat Ferrel
Implement mechanisms to read and write data from/to flexible stores. These will
support tuple streams and DRMs (distributed row matrices), with extensions that
allow keeping user-defined values for IDs. The mechanism can, in some sense,
replace Sequence Files for import/export and will make the operation much easier
for the user, in many cases by directly consuming their input files.
Start with text-delimited files for input/output in the Spark version of
ItemSimilarity.
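
To make the idea of keeping user-defined IDs concrete, here is a minimal sketch in plain Scala (no Spark dependency) of reading delimited (row ID, column ID) tuples, mapping the external string IDs to dense integer indices, and writing the tuples back out with the original IDs. Every name in it (DelimitedIdSketch, IdDictionary, readTuples, writeTuples) is hypothetical and chosen for illustration only; it is not the proposed Mahout API, and the two-field "rowID,columnID" layout is an assumption.

```scala
import java.util.regex.Pattern
import scala.collection.mutable

// Hypothetical sketch: none of these names come from Mahout.
object DelimitedIdSketch {

  // Two-way dictionary between user-defined string IDs and dense Int indices.
  final class IdDictionary {
    private val toIndex = mutable.HashMap.empty[String, Int]
    private val toId = mutable.ArrayBuffer.empty[String]

    // Returns the existing index for an ID, or assigns the next free one.
    def indexOf(id: String): Int = toIndex.get(id) match {
      case Some(i) => i
      case None =>
        val i = toId.length
        toIndex(id) = i
        toId += id
        i
    }

    // Looks up the original ID for an index assigned earlier.
    def idOf(index: Int): String = toId(index)

    def size: Int = toId.length
  }

  // Parses lines of the assumed form "rowID<delimiter>columnID" into index pairs.
  // Blank lines and lines with fewer than two fields are skipped.
  def readTuples(lines: Iterator[String], delimiter: String,
                 rows: IdDictionary, cols: IdDictionary): Vector[(Int, Int)] =
    lines.map(_.trim).filter(_.nonEmpty).flatMap { line =>
      val fields = line.split(Pattern.quote(delimiter)).map(_.trim)
      if (fields.length >= 2) Some((rows.indexOf(fields(0)), cols.indexOf(fields(1))))
      else None
    }.toVector

  // Converts index pairs back to delimited text using the original IDs.
  def writeTuples(tuples: Seq[(Int, Int)], delimiter: String,
                  rows: IdDictionary, cols: IdDictionary): Seq[String] =
    tuples.map { case (r, c) => rows.idOf(r) + delimiter + cols.idOf(c) }
}
```

Under these assumptions the dictionaries would travel with the matrix, so that results computed on integer indices can be exported with the IDs the user supplied. In a Spark job the lines would presumably come from a text file read into an RDD rather than from an in-memory iterator.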
A proposal is running with ItemSimilarity on Spark and is documented on
the GitHub wiki here: https://github.com/pferrel/harness/wiki
Comments are appreciated.
--
This message was sent by Atlassian JIRA
(v6.2#6252)