[ 
https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000600#comment-14000600
 ] 

Anand Avati commented on MAHOUT-1490:
-------------------------------------

[~dlyubimov], unsafe access is surely a factor in the performance. Compression
is implemented in
https://github.com/0xdata/h2o/blob/master/src/main/java/water/fvec/NewChunk.java#L379.
The real performance boost comes from coding the accessors (the various
at8_impl and atd_impl methods) so that inflation happens entirely in registers
and only compressed data is transferred over the memory bus. This seems to work
very effectively in practice.
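
As a rough illustration of the idea (not H2O's actual code), a minimal Java
sketch might look like the following. It assumes one simple scheme: each long
is stored as a single unsigned byte offset from the chunk minimum, and the
accessor rebuilds the value with a mask and an add. The class name
ByteOffsetChunk and the methods compress and compressedBytes are hypothetical;
at8 and atd only echo the accessor names mentioned above, and their signatures
here are assumptions.

```java
/** Hypothetical sketch of a byte-compressed column chunk; not H2O's classes. */
final class ByteOffsetChunk {
    private final long bias;   // minimum value in the chunk (assumed scheme)
    private final byte[] mem;  // one unsigned byte per element: value - bias

    private ByteOffsetChunk(long bias, byte[] mem) {
        this.bias = bias;
        this.mem = mem;
    }

    /** Compresses values whose range (max - min) fits in one unsigned byte. */
    static ByteOffsetChunk compress(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        long min = values[0];
        long max = values[0];
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        // subtractExact throws ArithmeticException if the range overflows a long.
        long range = Math.subtractExact(max, min);
        if (range > 255) {
            throw new IllegalArgumentException("range does not fit in one byte");
        }
        byte[] mem = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            mem[i] = (byte) (values[i] - min);
        }
        return new ByteOffsetChunk(min, mem);
    }

    /** Inflates element i to a long: one byte load, then a mask and an add. */
    long at8(int i) {
        return (mem[i] & 0xFF) + bias;
    }

    /** Same element as a double. */
    double atd(int i) {
        return (double) at8(i);
    }

    /** Bytes of payload that actually have to travel over the memory bus. */
    int compressedBytes() {
        return mem.length;
    }
}
```

In this sketch four longs (32 bytes uncompressed) occupy 4 bytes of payload,
and at8 never materializes an inflated array: only the compressed byte is read
from memory and the rest is plain arithmetic.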

AFAIK all the code is public in that GitHub repository.

> Data frame R-like bindings
> --------------------------
>
>                 Key: MAHOUT-1490
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1490
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Saikat Kanjilal
>            Assignee: Dmitriy Lyubimov
>             Fix For: 1.0
>
>   Original Estimate: 20h
>  Remaining Estimate: 20h
>
> Create Data frame R-like bindings for spark



--
This message was sent by Atlassian JIRA
(v6.2#6252)
