[ 
https://issues.apache.org/jira/browse/HADOOP-10694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620604#comment-14620604
 ] 

Tsuyoshi Ozawa commented on HADOOP-10694:
-----------------------------------------

[~gopalv] thank you for the contribution. +1 for the change.

However, removing synchronization from DataInputBuffer itself is a bit risky, 
because DataInputBuffer has many callers. How about adding a 
NonSyncByteArrayInputStream and switching MapTask and ReduceTask to use it, 
instead of removing the lock?

Additionally, I think it would be better for NonSyncByteArrayInputStream to 
extend DataInputBuffer, because then no other code needs to change except the 
"new" statements, which would construct NonSyncByteArrayInputStream instead of 
DataInputBuffer.
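
To make the idea concrete, here is a minimal sketch of what a non-synchronized 
byte array stream could look like. It is not the Hadoop or Hive implementation: 
the class names UnsyncByteArrayInput and UnsyncDemo are hypothetical, and for 
simplicity the sketch subclasses java.io.ByteArrayInputStream rather than 
DataInputBuffer. It only overrides the synchronized read paths with unlocked 
equivalents, so it must not be shared between threads.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical class for illustration only; not thread-safe by design.
class UnsyncByteArrayInput extends ByteArrayInputStream {

    UnsyncByteArrayInput(byte[] data) {
        super(data);
    }

    // Same behavior as ByteArrayInputStream.read(), without the lock.
    @Override
    public int read() {
        return (pos < count) ? (buf[pos++] & 0xff) : -1;
    }

    // Bulk read without the lock; returns -1 at end of stream.
    @Override
    public int read(byte[] dst, int off, int len) {
        if (dst == null) {
            throw new NullPointerException();
        }
        if (off < 0 || len < 0 || len > dst.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (pos >= count) {
            return -1;
        }
        int n = Math.min(len, count - pos);
        if (n <= 0) {
            return 0;
        }
        System.arraycopy(buf, pos, dst, off, n);
        pos += n;
        return n;
    }

    // Skips at most the number of bytes remaining; never moves backwards.
    @Override
    public long skip(long n) {
        long k = Math.min(n, (long) (count - pos));
        if (k < 0) {
            k = 0;
        }
        pos += (int) k;
        return k;
    }

    @Override
    public int available() {
        return count - pos;
    }
}

// Hypothetical helper showing the stream behind a DataInputStream,
// where each readInt() issues four single-byte read() calls.
class UnsyncDemo {
    static int readFirstInt(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new UnsyncByteArrayInput(data))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers that already hold a single-threaded buffer (as assumed here for the 
MapTask and ReduceTask paths) would only need to change the constructor call; 
every other caller keeps the synchronized behavior it has today.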

> Remove synchronized input streams from Writable deserialization
> ---------------------------------------------------------------
>
>                 Key: HADOOP-10694
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10694
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>            Reporter: Gopal V
>            Assignee: Gopal V
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-10694.1.patch, writable-read-sync.png
>
>
> Writable deserialization is slowing down due to a synchronized block within 
> DataInputBuffer$Buffer.
> ByteArrayInputStream::read() is synchronized and this shows up as a slow 
> uncontested lock.
> Hive ships with its own faster thread-unsafe version with 
> hive.common.io.NonSyncByteArrayInputStream.
> !writable-read-sync.png!
> The DataInputBuffer and Writable deserialization should not require a lock 
> per readInt()/read().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
