[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-11-09 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997520#comment-14997520
 ] 

Zhe Zhang commented on HDFS-8425:
-

bq. Per local test, it's 2.5x slower than repl. We need a faster codec
HADOOP-11887 has just shipped. It's much faster than the current coder.

I think it's very important to isolate different factors impacting performance. 
For this purpose, [~lirui] is working on HDFS-9345. With that we should be able 
to figure out any potential performance issue with the output logic (non-codec).

Rui is also working on HDFS-8968, which is a more comprehensive benchmark for 
EC I/O. 
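
To make the idea of isolating factors concrete, here is a minimal sketch that times the codec stage and the output stage separately. It is not the HDFS-9345 or HDFS-8968 code. The XOR parity is only a stand-in for the real coder, the in-memory sink is a stand-in for the output path, and the names StageTimer, write_stripes and the 6-cell stripe are assumptions made for illustration.

```python
import io
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock seconds per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start


def xor_parity(cells):
    """Stand-in codec: one XOR parity cell over equal-length data cells."""
    acc = 0
    for cell in cells:
        acc ^= int.from_bytes(cell, "big")
    return acc.to_bytes(len(cells[0]), "big")


def write_stripes(num_stripes, cells_per_stripe=6, cell_size=64 * 1024):
    """Writes stripes to an in-memory sink, timing encode and output apart."""
    timer = StageTimer()
    sink = io.BytesIO()  # hypothetical stand-in for the real output path
    cells = [bytes([i + 1]) * cell_size for i in range(cells_per_stripe)]
    for _ in range(num_stripes):
        with timer.stage("encode"):
            parity = xor_parity(cells)
        with timer.stage("output"):
            for cell in cells:
                sink.write(cell)
            sink.write(parity)
    return dict(timer.totals), sink.tell()
```

Comparing totals["encode"] with totals["output"] shows which stage dominates in a given setup; swapping in a different coder or sink changes only one of the two numbers.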

> [umbrella] Performance tuning, investigation and optimization for erasure 
> coding
> 
>
> Key: HDFS-8425
> URL: https://issues.apache.org/jira/browse/HDFS-8425
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: GAO Rui
> Attachments: testClientWriteReadFile_v1.pdf, 
> testdfsio-read-mbsec.png, testdfsio-write-mbsec.png
>
>
> This {{umbrella}} jira aims to track performance tuning, investigation and 
> optimization for erasure coding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-11-02 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984928#comment-14984928
 ] 

Walter Su commented on HDFS-8425:
-

Thanks [~tfukudom]! The results look good.

I agree we should test read with some DNs killed. But I'm afraid it won't be 
much different in TestDFSIO.

I've only tested writing. When I ran TestDFSIO, I found the throughput of EC is 
slightly better than repl, which matches [~tfukudom]'s tests. I opened a disk 
monitor and a network monitor. The disk monitor showed that disk utilization 
often hits 100%. I think that's because we can use all the CPUs of the 
NodeManagers, so the bottleneck is disk/network I/O. This is useful because we 
can write EC files in batch, for example when converting multiple repl files to 
EC files.

The speed of a single client writing is constrained by coding speed. Per local 
test, it's 2.5x slower than repl. We need a faster codec. I think it's also 
important, right? But I'm not sure there is a use case that is bounded by the 
speed of single-client writing. Usually we write files using repl, and convert 
them to EC files later.

What do you think?
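
One way to see why EC can keep up with repl when disk/network I/O is the bottleneck is to count bytes written per byte of user data. The sketch below is only back-of-the-envelope arithmetic under assumptions that are not stated in this thread: 3x replication, an RS(6,3) schema, and a made-up aggregate I/O budget of 900 MB/s. It gives an idealized upper bound, not a prediction of the TestDFSIO numbers.

```python
def write_amplification_replication(replicas):
    """Bytes hitting disk/network per byte of user data with replication."""
    return float(replicas)


def write_amplification_ec(data_units, parity_units):
    """Bytes hitting disk/network per byte of user data with striped EC."""
    return (data_units + parity_units) / data_units


def io_bound_user_throughput(aggregate_io_mb_s, amplification):
    """User-visible MB/s when the cluster is purely I/O bound (idealized)."""
    return aggregate_io_mb_s / amplification


# Assumed values, for illustration only.
REPLICAS = 3
DATA_UNITS, PARITY_UNITS = 6, 3
AGGREGATE_IO_MB_S = 900.0  # hypothetical cluster-wide I/O budget

repl_mb_s = io_bound_user_throughput(
    AGGREGATE_IO_MB_S, write_amplification_replication(REPLICAS))
ec_mb_s = io_bound_user_throughput(
    AGGREGATE_IO_MB_S, write_amplification_ec(DATA_UNITS, PARITY_UNITS))
```

Under these assumptions EC moves half as many bytes as repl for the same user data, so it has I/O headroom in a many-writer run. A single client is a different case: there the coder sits on the critical path, which is where the 2.5x gap shows up.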



[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-11-01 Thread Takuya Fukudome (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984631#comment-14984631
 ] 

Takuya Fukudome commented on HDFS-8425:
---

Thanks for the comment, [~zhz]! The X-axis shows the number of 
files (TestDFSIO's nrFiles parameter).
bq.  And I guess you didn't kill any DN in read tests?
Yes, you are right. I will run the read tests under failure conditions later. 
Thank you!



[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-10-29 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981572#comment-14981572
 ] 

Zhe Zhang commented on HDFS-8425:
-

Thanks [~tfukudom]! What does the X-axis show (10, 20, ..., 200)? And I guess 
you didn't kill any DN in read tests?



[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-07-24 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641195#comment-14641195
 ] 

Zhe Zhang commented on HDFS-8425:
-

Moving system test JIRAs as follow-ons. Let me know if you have other opinions. 
Thanks!



[jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding

2015-06-03 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572086#comment-14572086
 ] 

Li Bo commented on HDFS-8425:
-

I think encoding is one factor. More work can be done to improve 
efficiency. When the data buffers are full, we suspend data writing and 
switch to generating parity buffers. I think the two can run concurrently. I 
have created HDFS-8528, which tries to improve efficiency in this direction.
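
The overlap described above can be sketched roughly as follows. This is not the HDFS-8528 patch: the XOR parity is a stand-in for the real coder, the list used as a sink is a stand-in for the output streams, and write_with_background_encoding is a name invented for this example. It only shows the structure, where full buffers are handed to an encoder thread while the writer keeps going.

```python
import queue
import threading


def xor_parity(cells):
    """Stand-in codec: one XOR parity cell over equal-length data cells."""
    acc = 0
    for cell in cells:
        acc ^= int.from_bytes(cell, "big")
    return acc.to_bytes(len(cells[0]), "big")


def write_with_background_encoding(stripes, sink):
    """Writes data cells while a second thread computes parity cells."""
    pending = queue.Queue(maxsize=2)  # bounded, so encoding cannot lag far
    parities = []

    def encoder():
        while True:
            stripe = pending.get()
            if stripe is None:  # sentinel: no more stripes
                break
            parities.append(xor_parity(stripe))

    worker = threading.Thread(target=encoder)
    worker.start()
    for stripe in stripes:
        pending.put(stripe)      # hand the full buffers to the encoder
        for cell in stripe:      # keep writing data without waiting
            sink.append(("data", cell))
    pending.put(None)
    worker.join()
    # A single encoder thread and a FIFO queue keep parity in stripe order.
    for parity in parities:
        sink.append(("parity", parity))
    return sink
```

In Python the interpreter lock limits the actual speedup, so treat this as a picture of the control flow rather than a benchmark. The bounded queue is the design choice worth noting: it applies back-pressure so the writer blocks only when the encoder falls more than two stripes behind.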
