[ https://issues.apache.org/jira/browse/HADOOP-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518577 ]
Konstantin Shvachko commented on HADOOP-1649:
---------------------------------------------
+1
On 3: I mean that the performance gains are in the single digits percentage-wise,
so it is important to minimize memory costs.
That is, I agree that you should file a new issue to deal with the redundant data.
> Performance regression with Block CRCs
> --------------------------------------
>
> Key: HADOOP-1649
> URL: https://issues.apache.org/jira/browse/HADOOP-1649
> Project: Hadoop
> Issue Type: Bug
> Affects Versions: 0.14.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: HADOOP-1649.patch, HADOOP-1649.patch, HADOOP-1649.patch
>
>
> Performance is noticeably affected by the Block Level CRCs patch (HADOOP-1134).
> This is more noticeable on writes (randomwriter test, etc.).
> With random writer, it takes 20-25% longer on a small cluster (20 nodes) and maybe
> 10% longer on a larger cluster.
> There are a few differences in how data is written with 1134. As soon as I
> can reproduce this, I think it will be easier to fix.
--
This message is automatically generated by JIRA.