[ 
https://issues.apache.org/jira/browse/ACCUMULO-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Brassard updated ACCUMULO-1083:
------------------------------------

    Description: 
When running tablet servers on beefy nodes (lots of disks), the write-ahead log 
can be a serious bottleneck. Today we ran a continuous ingest test of 
1.5-SNAPSHOT on an 8-node (plus a master node) cluster in which the nodes had 
32 cores and 15 drives each. Running with write-ahead log off resulted in a >4x 
performance improvement sustained over a long period.

I believe the culprit is that the WAL is only using one file at a time per 
tablet server, which means HDFS is only appending to one drive (plus replicas). 
If we increase the number of concurrent WAL files supported on a tablet server 
we could probably drastically improve the performance on systems with many 
disks. As it stands, I believe Accumulo is significantly more optimized for a 
larger number of smaller nodes (3-4 drives).

  was:
When running tablet servers on beefy nodes (lots of disks), the write-ahead log 
can be a serious bottleneck. Today we ran a test of 1.5-SNAPSHOT on an 8-node 
(plus a master node) cluster in which the nodes had 32 cores and 15 drives 
each. Running with write-ahead log off resulted in a >4x performance 
improvement sustained over a long period.

I believe the culprit is that the WAL is only using one file at a time per 
tablet server, which means HDFS is only appending to one drive (plus replicas). 
If we increase the number of concurrent WAL files supported on a tablet server 
we could probably drastically improve the performance on systems with many 
disks. As it stands, I believe Accumulo is significantly more optimized for a 
larger number of smaller nodes (3-4 drives).

    
> add concurrency to HDFS write-ahead log
> ---------------------------------------
>
>                 Key: ACCUMULO-1083
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1083
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>             Fix For: 1.6.0
>
>         Attachments: walog-performance.jpg
>
>
> When running tablet servers on beefy nodes (lots of disks), the write-ahead 
> log can be a serious bottleneck. Today we ran a continuous ingest test of 
> 1.5-SNAPSHOT on an 8-node (plus a master node) cluster in which the nodes had 
> 32 cores and 15 drives each. Running with write-ahead log off resulted in a 
> >4x performance improvement sustained over a long period.
> I believe the culprit is that the WAL is only using one file at a time per 
> tablet server, which means HDFS is only appending to one drive (plus 
> replicas). If we increase the number of concurrent WAL files supported on a 
> tablet server we could probably drastically improve the performance on 
> systems with many disks. As it stands, I believe Accumulo is significantly 
> more optimized for a larger number of smaller nodes (3-4 drives).
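
A rough back-of-the-envelope check of the reasoning above, as a minimal
sketch rather than a measurement. It assumes an HDFS replication factor of 3
(not stated in the issue) and the best case where every replica of every open
WAL file lands on a different drive; the function and parameter names are
hypothetical.

```python
def wal_drive_fraction(nodes, drives_per_node, wals_per_tserver, replication=3):
    """Upper bound on the fraction of cluster drives receiving WAL appends.

    Assumptions (not from the issue): one tablet server per node,
    replication factor 3, and no two replicas sharing a drive.
    """
    total_drives = nodes * drives_per_node
    # Each open WAL file is appended on `replication` drives at a time.
    busy_drives = min(nodes * wals_per_tserver * replication, total_drives)
    return busy_drives / total_drives


# Cluster from the issue: 8 tablet server nodes, 15 drives each.
one_wal = wal_drive_fraction(8, 15, wals_per_tserver=1)
five_wals = wal_drive_fraction(8, 15, wals_per_tserver=5)
```

Under these assumptions, one WAL file per tablet server keeps at most 24 of
the 120 drives (about 20%) busy with log appends, while five concurrent WAL
files per tablet server could in principle reach all of them.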

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
