[
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005279#comment-14005279
]
Ted Dunning commented on MAHOUT-1490:
-------------------------------------
{quote}
> (5) or compress whenever there is danger of memory pressure.
In the new architecture, this is implied by (3). The cache manager makes such
decisions for us; by hooking these techniques into serialization, we
automatically participate in this.
{quote}
I disagree here. Users will often not realize that they are under memory
pressure, and if they are even close to it, a 5-20x decrease in memory use
across the board could really help them out.
It might be better to compress by default and then leave the data in
uncompressed form if we see a pattern of use that indicates excessive cost.
Or should we leave it to the engine to decide?
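
To make the compress-by-default idea concrete, here is a rough sketch only. It is not Mahout or Spark code: the class name AdaptiveBlock, the max_inflations threshold, and the use of zlib are all assumptions chosen for illustration. "Excessive cost" is approximated here as nothing more than a count of decompressions.

```python
import zlib


class AdaptiveBlock:
    """Hypothetical holder that keeps a byte payload compressed by default.

    Once the payload has been inflated more than `max_inflations` times,
    the uncompressed bytes are kept and the compressed copy is dropped,
    on the assumption that repeated decompression now costs more than
    the memory it saves.
    """

    def __init__(self, payload: bytes, max_inflations: int = 3):
        self._compressed = zlib.compress(payload)
        self._raw = None
        self._inflations = 0
        self._max_inflations = max_inflations  # assumed tuning knob

    @property
    def is_compressed(self) -> bool:
        return self._raw is None

    def stored_size(self) -> int:
        """Bytes currently held in memory for this block."""
        held = self._compressed if self._raw is None else self._raw
        return len(held)

    def read(self) -> bytes:
        if self._raw is not None:
            return self._raw
        data = zlib.decompress(self._compressed)
        self._inflations += 1
        if self._inflations > self._max_inflations:
            # Usage pattern suggests decompression is too frequent:
            # stay uncompressed from now on.
            self._raw = data
            self._compressed = None
        return data


payload = b"0,0,0,1,0,0,0,2," * 1000  # highly repetitive sample data
block = AdaptiveBlock(payload, max_inflations=2)
```

With max_inflations=2, the first two calls to block.read() leave the block compressed and the third switches it to uncompressed storage. A real engine would presumably weigh CPU time against cache pressure rather than a bare read count.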
> Data frame R-like bindings
> --------------------------
>
> Key: MAHOUT-1490
> URL: https://issues.apache.org/jira/browse/MAHOUT-1490
> Project: Mahout
> Issue Type: New Feature
> Reporter: Saikat Kanjilal
> Assignee: Dmitriy Lyubimov
> Fix For: 1.0
>
> Original Estimate: 20h
> Remaining Estimate: 20h
>
> Create Data frame R-like bindings for Spark