[ https://issues.apache.org/jira/browse/SPARK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tor Myklebust updated SPARK-1672:
---------------------------------
    Component/s: MLlib

> Support separate partitioners (and numbers of partitions) for users and products
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-1672
>                 URL: https://issues.apache.org/jira/browse/SPARK-1672
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Tor Myklebust
>            Priority: Minor
>
> The user ought to be able to specify a partitioning of his data if he knows a good one. It's convenient to have separate partitioners for users and products so that no strange mapping step needs to happen.
> It may also be reasonable to partition the users and products into different numbers of partitions (for instance, to balance memory requirements) if the dataset is tall, thin, and very sparse.
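
To make the request concrete, here is a minimal sketch in plain Python (not Spark code) of what separate partitioners for users and products could look like. The names ModPartitioner and block_ratings are hypothetical and exist only for this illustration; they are not part of the MLlib API, and the modulo rule is just one assumed way a partitioner might map ids to partitions. Each rating is assigned to a (user block, product block) pair, and the two sides use different partition counts.

```python
from collections import defaultdict


class ModPartitioner:
    """Hypothetical partitioner: maps an integer id to a partition index."""

    def __init__(self, num_partitions):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions

    def get_partition(self, key):
        # Python's % always returns a non-negative result for a positive modulus.
        return key % self.num_partitions


def block_ratings(ratings, user_partitioner, product_partitioner):
    """Group (user, product, rating) triples by (user block, product block)."""
    blocks = defaultdict(list)
    for user, product, rating in ratings:
        key = (
            user_partitioner.get_partition(user),
            product_partitioner.get_partition(product),
        )
        blocks[key].append((user, product, rating))
    return dict(blocks)


# Many users, few products: give the user side more partitions.
ratings = [(0, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0), (3, 12, 1.0), (4, 11, 2.0)]
user_part = ModPartitioner(4)      # assumed partition count for users
product_part = ModPartitioner(2)   # assumed partition count for products
blocks = block_ratings(ratings, user_part, product_part)
```

Because each side has its own partitioner, neither users nor products need to be remapped into a shared id space first, and the two partition counts can be tuned independently.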