[ https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000600#comment-14000600 ]
Anand Avati commented on MAHOUT-1490:
-------------------------------------
[~dlyubimov], unsafe access is surely a factor in the performance. Compression is
implemented in
https://github.com/0xdata/h2o/blob/master/src/main/java/water/fvec/NewChunk.java#L379.
The performance boost really comes from coding the accessors (the various
at8_impl and atd_impl methods) so that inflation happens entirely in registers
and only compressed data is transferred over the memory bus. This seems to work
very effectively in practice.
AFAIK all the code is public in that GitHub repository.
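To make the idea concrete, here is a minimal sketch of a compressed chunk whose
accessors inflate one element at a time. It is not H2O's code: the class
ByteOffsetChunk, its compress method, and the one-byte-offset-plus-bias encoding
are assumptions made for illustration, and only the accessor names at8 and atd
echo the at8_impl/atd_impl methods mentioned above. Each read loads a single
compressed byte and widens it with a mask and an add, so no inflated copy of the
column is ever written back to memory. Whether the JIT actually keeps that
arithmetic in registers is expected but not shown here.

```java
// Hypothetical sketch, NOT H2O's NewChunk or its at8_impl/atd_impl code.
// Stores a column of longs as one unsigned byte per element plus a shared bias.
final class ByteOffsetChunk {
    private final byte[] mem;  // compressed payload: one byte per element
    private final long bias;   // minimum of the original values

    private ByteOffsetChunk(byte[] mem, long bias) {
        this.mem = mem;
        this.bias = bias;
    }

    // Compresses the values if their range fits in 8 bits, otherwise throws.
    static ByteOffsetChunk compress(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty column");
        }
        long min = values[0];
        long max = values[0];
        for (long v : values) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        // Unsigned comparison so that a wrapped-around (max - min) is rejected too.
        if (Long.compareUnsigned(max - min, 255L) > 0) {
            throw new IllegalArgumentException("range does not fit in one byte");
        }
        byte[] mem = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            mem[i] = (byte) (values[i] - min);  // offset in 0..255, low 8 bits kept
        }
        return new ByteOffsetChunk(mem, min);
    }

    // Inflates element i to a long: one byte load, one mask, one add.
    long at8(int i) {
        return bias + (mem[i] & 0xFF);
    }

    // Inflates element i to a double by widening the long result.
    double atd(int i) {
        return (double) at8(i);
    }

    // Bytes actually held in memory for the column data.
    int compressedBytes() {
        return mem.length;
    }
}
```

With this encoding, a column such as {1000, 1003, 1255} occupies 3 bytes of
payload instead of 24, and a scan over it pulls only those 3 bytes through the
memory bus.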
> Data frame R-like bindings
> --------------------------
>
> Key: MAHOUT-1490
> URL: https://issues.apache.org/jira/browse/MAHOUT-1490
> Project: Mahout
> Issue Type: New Feature
> Reporter: Saikat Kanjilal
> Assignee: Dmitriy Lyubimov
> Fix For: 1.0
>
> Original Estimate: 20h
> Remaining Estimate: 20h
>
> Create Data frame R-like bindings for Spark
--
This message was sent by Atlassian JIRA
(v6.2#6252)