[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-08-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436059#comment-13436059
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

Your instincts were better than mine: combining compaction and flush i/o into a 
single executor was a mistake.  We could band-aid it by adding some kind of 
semaphore mechanism to make sure we always leave at least one thread free for 
flushing but this still won't let us max out on flushing temporarily at the 
expense of compaction, without introducing extremely complicated preemption 
logic.

So, color me convinced that we need to keep separate executors for flush and 
compaction.

Additionally, the more I think about it the less I think the DBT abstraction is 
what we want here.  Or at a higher level: I don't think we want to be that 
strict about one thread per disk.  Which was my fault in the first place, sorry!

If we instead just follow the above disk prioritization logic, we'll still get 
effectively thread-per-disk until disks start to run out of space.  But having 
a (standard) flexible pool of threads means that we generalize much better to 
SSDs, where having substantially more threads than disks makes sense (since 
compaction becomes CPU bound).

So I think we can simplify our approach a lot, perhaps by having a global 
Directory state that tracks space remaining and how many i/o tasks are running 
on each, that we can use when handing out flush and compaction targets.  The 
executor architecture won't need to change.  (May want to introduce a 
DirectoryBoundRunnable abstraction, whose run method encapsulates updating i/o 
task count and space free after running the flush/compaction, but without 
trying it I'm not sure if that actually works as imagined.)
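
A minimal sketch of how the DirectoryBoundRunnable idea above might look. 
Only the name DirectoryBoundRunnable comes from the comment; DirectoryState, 
its fields, and runWith are hypothetical names for illustration, and this is 
not code from any attached patch.  The wrapper bumps the i/o task count and 
reserved space before the work starts and restores both when it finishes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-directory bookkeeping; names are illustrative only.
final class DirectoryState
{
    final String path;
    final AtomicInteger activeTasks = new AtomicInteger();
    final AtomicLong reservedBytes = new AtomicLong();

    DirectoryState(String path)
    {
        this.path = path;
    }
}

// Wraps a flush or compaction so that the directory's task count and
// reserved space are updated around the actual work.
abstract class DirectoryBoundRunnable implements Runnable
{
    private final DirectoryState directory;
    private final long expectedWriteSize;

    DirectoryBoundRunnable(DirectoryState directory, long expectedWriteSize)
    {
        this.directory = directory;
        this.expectedWriteSize = expectedWriteSize;
    }

    @Override
    public final void run()
    {
        directory.activeTasks.incrementAndGet();
        directory.reservedBytes.addAndGet(expectedWriteSize);
        try
        {
            runWith(directory);
        }
        finally
        {
            // Always undo the bookkeeping, even if the task throws.
            directory.reservedBytes.addAndGet(-expectedWriteSize);
            directory.activeTasks.decrementAndGet();
        }
    }

    // The actual flush or compaction work goes here.
    protected abstract void runWith(DirectoryState directory);
}
```

One caveat with this shape: the counters only change once run() begins, so a 
task still waiting in an executor queue is invisible to whatever hands out 
targets.  Updating the counters at submission time would be an alternative.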

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt


 As noted in CASSANDRA-809, we have a certain amount of flush (and compaction) 
 threads, which mix and match disk volumes indiscriminately.  It may be worth 
 creating a tight thread-to-disk affinity, to prevent unnecessary conflict at 
 that level.
 OTOH as SSDs become more prevalent this becomes a non-issue.  Unclear how 
 much pain this actually causes in practice in the meantime.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-08-15 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435532#comment-13435532
 ] 

Yuki Morishita commented on CASSANDRA-4292:
---

I ran tests against patched and trunk builds with a modified stress tool that 
writes to 3 CFs with leveled compaction.
The node has 6 spinning disks, and C* uses all of them as data directories.
Although I see a difference in disk usage (the patched version distributes load 
evenly among disks), there is still no difference in performance for either 
writes or compaction.
It seems that memtable flushing is sometimes blocked when a long-running 
compaction has already started, which causes GC pressure on the patched node.
Looks like I need to find a way to avoid queuing up memtable flush tasks.

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292.txt, 4292-v2.txt, 4292-v3.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427392#comment-13427392
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

v3 looks good enough to do some performance testing to see if it's worth 
polishing more. :)

bq. Can we use CopyOnWriteArrayList 

Nit: Looking at this again it should probably actually be an ImmutableList.

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292-v3.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425036#comment-13425036
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

- We need to use a single DiskWriter for both compaction and flushing, or we 
lose most of the benefits here.  One solution: rename CompactionManager to 
IOManager, and use that.  Another could be to move it into StorageService.
- CompactionExecutor needs to be cleaned up since it's no longer serving the 
executor role.  Again, cleanup could be straightforward if we morph CM into 
IOManager (and merge CompactionExecutor + DiskWriter).  Could be nice to get 
the kind of progress reporting on flushes that we now have on compaction.
- DiskWriter: Can we use CopyOnWriteArrayList instead of a synchronized block?


 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-27 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423951#comment-13423951
 ] 

Yuki Morishita commented on CASSANDRA-4292:
---

Here's the code for choosing a disk from the attached patch.

{code}
// DiskWriter.java
private ExecutorService selectExecutor(DiskBoundTask task)
{
    // sort by available disk space
    SortedSet<DiskBoundTaskExecutor> executors;
    synchronized (perDiskTaskExecutors)
    {
        executors = ImmutableSortedSet.copyOf(perDiskTaskExecutors);
    }

    // if there is disk with sufficient space and no activity running on it, then use it
    for (DiskBoundTaskExecutor executor : executors)
    {
        long spaceAvailable = executor.getEstimatedAvailableSpace();
        if (task.getExpectedWriteSize() < spaceAvailable && executor.getActiveCount() == 0)
            return executor;
    }

    // if not, use the one that has largest free space
    if (task.getExpectedWriteSize() < executors.first().getEstimatedAvailableSpace())
        return executors.first();
    else
        return task.recalculateWriteSize() ? selectExecutor(task) : null; // retry if needed
}
{code}

Before choosing a disk, we sort by available disk space, then choose the first 
one that 1) fits the new sstable and 2) has zero tasks.
If we cannot find one, then 3) we choose the one with the largest free space.
So I think the above code works as you described.

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423953#comment-13423953
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

Hmm, may have been looking at the wrong patch.  Will reinspect.

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424149#comment-13424149
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

Can you rebase post-CASSANDRA-2116?

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422574#comment-13422574
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

bq. Directory is chosen based on available space in both queue and disk.

We still want to prioritize disks that have no tasks yet, since iops are a 
bigger bottleneck than space, in general.

So specifically, we want to prioritize in order of:

# enough space for the new sstable (boolean)
# zero tasks (boolean)
# total free space (long)

We may want to test changing #2 to ordering by task count...  both have pros 
and cons.
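
The three-level ordering above can be sketched as a comparator.  This is an 
illustration under stated assumptions, not the patch's code: DiskCandidate, 
DiskPrioritizer, and their fields are hypothetical names, and "enough space" 
is taken to mean free bytes strictly greater than the expected write size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical snapshot of one data directory; field names are illustrative.
final class DiskCandidate
{
    final String path;
    final long freeBytes;
    final int activeTasks;

    DiskCandidate(String path, long freeBytes, int activeTasks)
    {
        this.path = path;
        this.freeBytes = freeBytes;
        this.activeTasks = activeTasks;
    }
}

final class DiskPrioritizer
{
    // Orders candidates so that the most preferred disk sorts first.
    static Comparator<DiskCandidate> forWriteSize(final long expectedWriteSize)
    {
        return (a, b) -> {
            // 1. enough space for the new sstable (true before false)
            int c = Boolean.compare(b.freeBytes > expectedWriteSize,
                                    a.freeBytes > expectedWriteSize);
            if (c != 0)
                return c;
            // 2. zero tasks (true before false)
            c = Boolean.compare(b.activeTasks == 0, a.activeTasks == 0);
            if (c != 0)
                return c;
            // 3. total free space, largest first
            return Long.compare(b.freeBytes, a.freeBytes);
        };
    }

    // Returns a sorted copy, leaving the input list untouched.
    static List<DiskCandidate> rank(List<DiskCandidate> disks, long expectedWriteSize)
    {
        List<DiskCandidate> sorted = new ArrayList<>(disks);
        sorted.sort(forWriteSize(expectedWriteSize));
        return sorted;
    }
}
```

To try the task-count variant mentioned above, step 2 could be swapped for 
Integer.compare(a.activeTasks, b.activeTasks), which prefers the least busy 
disk rather than only distinguishing idle from busy.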

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292-v2.txt, 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420904#comment-13420904
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

Looks reasonable to me so far.

A couple points:

- we'll want to prefer (1) disks that have no current writes, then (2) disks 
with the least projected data (including the estimated size of currently active 
writes)
- compaction should use this executor as well

Nit: probably cleaner to use a Map for the new getLocationForDisk method

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2

 Attachments: 4292.txt






[jira] [Commented] (CASSANDRA-4292) Per-disk I/O queues

2012-06-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294697#comment-13294697
 ] 

Jonathan Ellis commented on CASSANDRA-4292:
---

We'll also want to reserve space for in-progress writes; currently we just 
use the raw free space as reported by the OS, which means that when disks are 
close to evenly matched we're highly likely to stack multiple new sstables on 
the same one instead of spreading them out.
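
A minimal sketch of the reservation idea, assuming a hypothetical 
ReservingDirectory class (not from Cassandra).  The raw free space is 
supplied by a LongSupplier so the example is deterministic; in practice it 
could be something like () -> new java.io.File(path).getUsableSpace().  The 
estimate subtracts bytes reserved by in-progress writes, so closely matched 
disks take turns instead of stacking new sstables on one.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical tracker for one data directory; names are illustrative.
final class ReservingDirectory
{
    final String path;
    private final LongSupplier rawFreeBytes;   // what the OS would report
    private final AtomicLong reservedBytes = new AtomicLong();

    ReservingDirectory(String path, LongSupplier rawFreeBytes)
    {
        this.path = path;
        this.rawFreeBytes = rawFreeBytes;
    }

    // Raw free space minus space promised to writes still in progress.
    long estimatedAvailable()
    {
        return rawFreeBytes.getAsLong() - reservedBytes.get();
    }

    void reserve(long bytes)
    {
        reservedBytes.addAndGet(bytes);
    }

    // Call when the write completes (or fails) to drop the reservation.
    void release(long bytes)
    {
        reservedBytes.addAndGet(-bytes);
    }

    // Picks the directory with the largest estimate and reserves space on it.
    // Synchronized so that choosing and reserving happen as one step.
    // Returns null if no directory can hold the write.
    static synchronized ReservingDirectory pickAndReserve(List<ReservingDirectory> dirs,
                                                          long expectedWriteSize)
    {
        ReservingDirectory best = null;
        for (ReservingDirectory dir : dirs)
            if (best == null || dir.estimatedAvailable() > best.estimatedAvailable())
                best = dir;
        if (best == null || best.estimatedAvailable() < expectedWriteSize)
            return null;
        best.reserve(expectedWriteSize);
        return best;
    }
}
```

With two disks reporting 1000 and 990 free bytes, three consecutive 100-byte 
writes land on the first, the second, then the first again, whereas comparing 
raw free space alone would send all three to the first disk.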

 Per-disk I/O queues
 ---

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
