[jira] [Comment Edited] (CASSANDRA-19671) Nodetool tablestats: add keyspace space used and table r/w ratio

2024-06-02 Thread Arun Ganesh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851519#comment-17851519
 ] 

Arun Ganesh edited comment on CASSANDRA-19671 at 6/3/24 4:58 AM:
-

[~bschoeni],

For the keyspace space-used metric, is it just a sum of the space-used values 
of all the tables under it?

And, I already see a {{localReadWriteRatio}} for the tables in 
{{StatsTable.java}}. Do you mean something else?

{code:java}
...
Speculative retries: 0
Local read count: 2
Local read latency: 8.774 ms
Local write count: 1
Local write latency: 2.759 ms
Local read/write ratio: 2.000
Pending flushes: 0
Percent repaired: 0.0
Bytes repaired: 0
Bytes unrepaired: 37
...
{code}
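
For illustration only, here is a minimal Python sketch of the two figures under discussion, assuming the keyspace value really is a plain sum of the per-table values (which is exactly the open question above). The {{TableStats}} class, the function names, the table names and the byte counts are hypothetical and are not taken from {{StatsTable.java}}; returning 0.0 when there are no writes is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table figures, named after the tablestats output fields."""
    name: str
    space_used_live: int    # bytes
    space_used_total: int   # bytes
    local_read_count: int
    local_write_count: int


def keyspace_space_used(tables):
    """Sum the live and total space used over all tables of one keyspace."""
    live = sum(t.space_used_live for t in tables)
    total = sum(t.space_used_total for t in tables)
    return live, total


def read_write_ratio(table):
    """Reads divided by writes; 0.0 when there are no writes (an assumption)."""
    if table.local_write_count == 0:
        return 0.0
    return table.local_read_count / table.local_write_count


# Made-up sample data; the first table mirrors the counts in the output above.
tables = [
    TableStats("users", 37, 37, 2, 1),
    TableStats("events", 1024, 2048, 0, 0),
]

live, total = keyspace_space_used(tables)
print(f"Space used (live): {live}")
print(f"Space used (total): {total}")
print(f"Local read/write ratio: {read_write_ratio(tables[0]):.3f}")
```

With 2 local reads and 1 local write, the ratio comes out as 2.000, matching the sample output quoted above.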



was (Author: JIRAUSER303038):
[~bschoeni],

For the keyspace space-used metric, is it just a sum of the space-used values 
of all the tables under it?

And, I already see a {{localReadWriteRatio}} for the tables in 
{{StatsTable.java}}. Do you mean something else?

{code:java}
...
Speculative retries: 0
Local read count: 2
Local read latency: 8.774 ms
Local write count: 1
Local write latency: 2.759 ms
{color:#FF8B00}Local read/write ratio: 2.000{color}
Pending flushes: 0
Percent repaired: 0.0
Bytes repaired: 0
Bytes unrepaired: 37
...
{code}


> Nodetool tablestats: add keyspace space used and table r/w ratio
> ---
>
> Key: CASSANDRA-19671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19671
> Project: Cassandra
> Issue Type: Improvement
> Components: Tool/nodetool
> Reporter: Brad Schoening
> Priority: Normal
>
> Nodetool tablestats reports the live and total space used per table, but not 
> for the entire keyspace. This would be useful information.
> Also, it would be useful to have the read/write ratio in the table-level 
> stats. This metric is important when choosing compaction strategies such as 
> LCS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19150) Align values in rows in CQLSH right for numbers, left for text

2024-06-02 Thread Arun Ganesh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851515#comment-17851515
 ] 

Arun Ganesh commented on CASSANDRA-19150:
-

[~bschoeni],

Thanks for the review! I'm sorry I was not available for a while because of my 
finals.

#2 and #3 sound good. For #4, I don't have the exact count because I don't 
have a paid CircleCI account, and I can see some dtests using the cqlsh output 
(like this 
[one|https://github.com/apache/cassandra-dtest/blob/trunk/json_test.py]).

Regarding #1, everything except "This could be generalized to a multi-value map 
for types" sounds good, because I see a lot of inline uses of the color map, 
like
{code:python}
coloredval = colormap['text'] + bits_to_turn_red_re.sub(tbr, bval) + colormap['reset']
{code}

Changing this map to include the alignment too would require changes in a lot 
of places.

Let me update my PR.
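
As a rough illustration of the alignment rule itself (right for numbers, left for text), here is a minimal Python sketch that decides alignment from the value's type and leaves any color map untouched. It is not the cqlshlib implementation: {{align_cell}}, {{format_rows}} and the sample rows are hypothetical, and treating booleans as text is an assumption.

```python
from decimal import Decimal
from numbers import Number


def align_cell(value, width):
    """Right-align numbers, left-align everything else."""
    text = str(value)
    # bool is a Number subclass in Python; treat it as text here (an assumption).
    if isinstance(value, Number) and not isinstance(value, bool):
        return text.rjust(width)
    return text.ljust(width)


def format_rows(rows):
    """Pad every column to its widest value and join the cells with ' | '."""
    column_count = len(rows[0])
    widths = [max(len(str(row[i])) for row in rows) for i in range(column_count)]
    return [
        " | ".join(align_cell(value, width) for value, width in zip(row, widths))
        for row in rows
    ]


# Made-up sample rows: one text column and two numeric columns.
rows = [
    ("alice", 7, Decimal("12.50")),
    ("bob", 1234, Decimal("3.1")),
]

for line in format_rows(rows):
    print(line)
```

Because the alignment decision lives in its own function, inline uses of a color map like the one quoted above would not need to change in this sketch.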

> Align values in rows in CQLSH right for numbers, left for text
> --
>
> Key: CASSANDRA-19150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19150
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: Stefan Miklosovic
> Assignee: Arun Ganesh
> Priority: Low
> Fix For: 5.x
>
> Attachments: Screenshot 2023-12-04 at 00.38.16.png, Screenshot 
> 2023-12-09 at 16.58.25.png, signature.asc, test_output.txt, 
> test_output_old.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Updated* Jan 17 2024 after dev discussion
> Change CQLSH to left-align text while continuing to right-align numbers. This 
> will match how the Postgres shell and Excel align text and numbers.
> -
> *Original*
> We need to make this
> [https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/cqlshmain.py#L1101]
> configurable so values in columns are either all on the left or all on the 
> right side of the column (basically change col.rjust to col.ljust).
> By default, it would be like it is now, but there would be a configuration 
> property in cqlsh for that as well as a corresponding CQLSH command 
> (optional), something like
> {code:java}
> ALIGNMENT LEFT|RIGHT
> {code}
> cc [~bschoeni]






[jira] [Updated] (CASSANDRA-19676) Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric

2024-06-02 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19676:
--
Status: Review In Progress  (was: Patch Available)

> Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric
> 
>
> Key: CASSANDRA-19676
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19676
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Core
> Reporter: Sam Lightfoot
> Assignee: Sam Lightfoot
> Priority: Normal
> Fix For: 5.x
>
> Attachments: image-2024-06-02-17-25-25-071.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> On profiling a write-heavy workload (90% writes) using easy-cass-stress, it 
> became very clear that 
> StorageProxy::updateCoordinatorWriteLatencyTableMetric is a hot path: it 
> takes up ~15% of the CPU cycles of 
> ModificationStatement::executeWithoutCondition (see attached async-profiler 
> image).
> We should convert this stream to a simple for loop, as has been discussed 
> recently on the mailing list.






[jira] [Commented] (CASSANDRA-19676) Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric

2024-06-02 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851453#comment-17851453
 ] 

Stefan Miklosovic commented on CASSANDRA-19676:
---

It is very tempting to include this in 5.0.0 too, but trunk only is also fine 
... 

 




[jira] [Updated] (CASSANDRA-19676) Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric

2024-06-02 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19676:
--
Test and Documentation Plan: ci
 Status: Patch Available  (was: In Progress)




[jira] [Updated] (CASSANDRA-19676) Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric

2024-06-02 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19676:
--
Change Category: Performance
 Complexity: Low Hanging Fruit
  Fix Version/s: 5.x
  Reviewers: Stefan Miklosovic
 Status: Open  (was: Triage Needed)




[jira] [Commented] (CASSANDRA-15452) Improve disk access patterns during compaction and streaming

2024-06-02 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851448#comment-17851448
 ] 

Jordan West commented on CASSANDRA-15452:
-

I backported the branch to 4.1 
[here|https://github.com/jrwest/cassandra/tree/jwest/15452-4.1-readahead] and 
ran tests on it:
 - 
[Java8|https://app.circleci.com/pipelines/github/jrwest/cassandra/186/workflows/b46f0e55-db2e-43e4-be9d-c88447034d45]
 - 
[Java11|https://app.circleci.com/pipelines/github/jrwest/cassandra/186/workflows/534fe8b7-cabb-461d-9614-2fcd70d3b156]

The only failures look like (unrelated, known) flakes and an issue with a 
container downloading its configuration, which I am not familiar with but will 
take a closer look at.

For the 5.0 branch, besides running tests, the only thing still outstanding is 
to make it work with the BTI format. BTI opens scanners differently than the 
BIG format, so we lose the read-ahead buffer mode when it's enabled, as 
currently implemented.

Right now the patch also applies to range scans. We should do some 
benchmarking to determine the impact of that, or limit it to compaction only.

> Improve disk access patterns during compaction and streaming
> 
>
> Key: CASSANDRA-15452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15452
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Local Write-Read Paths, Local/Compaction
> Reporter: Jon Haddad
> Assignee: Jordan West
> Priority: Normal
> Attachments: everyfs.txt, iostat-5.0-head.output, 
> iostat-5.0-patched.output, iostat-ebs-15452.png, iostat-ebs-head.png, 
> iostat-instance-15452.png, iostat-instance-head.png, results.txt, 
> sequential.fio, throughput-1.png, throughput.png
>
>
> On read heavy workloads Cassandra performs much better when using a low read 
> ahead setting.   In my tests I've seen a 5x improvement in throughput and 
> more than a 50% reduction in latency.  However, I've also observed that it 
> can have a negative impact on compaction and streaming throughput. It 
> especially negatively impacts cloud environments where small reads incur high 
> costs in IOPS due to tiny requests.
>  # We should investigate using POSIX_FADV_DONTNEED on files we're compacting 
> to see if we can improve performance and reduce page faults. 
>  # This should be combined with an internal read ahead style buffer that 
> Cassandra manages, similar to a BufferedInputStream but with our own 
> machinery.  This buffer should read fairly large blocks of data off disk at 
> a time.  EBS, for example, allows 1 IOP to be up to 256KB.  A considerable 
> amount of time is spent in blocking I/O during compaction and streaming. 
> Reducing the frequency we read from disk should speed up all sequential I/O 
> operations.
>  # We can reduce system calls by buffering writes as well, but I think it 
> will have less of an impact than the reads
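
The idea in points 1 and 2 can be sketched outside Cassandra in a few lines of Python. This is only an illustration of reading sequentially in large blocks and then issuing POSIX_FADV_DONTNEED, not the patch on the branch: {{sequential_read}}, {{CHUNK_SIZE}} and the temporary file are hypothetical, and the fadvise call is skipped on platforms where Python does not expose it.

```python
import os
import tempfile

CHUNK_SIZE = 256 * 1024  # 256 KiB, the EBS request size quoted above


def sequential_read(path, chunk_size=CHUNK_SIZE):
    """Read a file front to back in fixed-size chunks.

    Returns (bytes_read, non_empty_read_calls).
    """
    total = 0
    calls = 0
    # buffering=0 turns off Python's own buffer, so each read() is one request.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
        # Hint that the cached pages are no longer needed (length 0 means
        # "to end of file"); only where the platform exposes posix_fadvise.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    return total, calls


# Compare the number of read requests for large and small chunks on 1 MiB of data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1024 * 1024))
    path = tmp.name
try:
    big = sequential_read(path)
    small = sequential_read(path, chunk_size=4096)
    print("256 KiB chunks (bytes, reads):", big)
    print("4 KiB chunks (bytes, reads):", small)
finally:
    os.remove(path)
```

Both runs read the same number of bytes; the larger chunk size simply needs far fewer read requests, which is the effect the description is after for compaction and streaming.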






[jira] [Created] (CASSANDRA-19676) Stream processing for StorageProxy::updateCoordinatorWriteLatencyTableMetric

2024-06-02 Thread Sam Lightfoot (Jira)
Sam Lightfoot created CASSANDRA-19676:
-

 Summary: Stream processing for 
StorageProxy::updateCoordinatorWriteLatencyTableMetric
 Key: CASSANDRA-19676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19676
 Project: Cassandra
  Issue Type: Improvement
  Components: Legacy/Core
Reporter: Sam Lightfoot
Assignee: Sam Lightfoot
 Attachments: image-2024-06-02-17-25-25-071.png

On profiling a write-heavy workload (90% writes) using easy-cass-stress, it 
became very clear that StorageProxy::updateCoordinatorWriteLatencyTableMetric 
is a hot path: it takes up ~15% of the CPU cycles of 
ModificationStatement::executeWithoutCondition (see attached async-profiler 
image).

We should convert this stream to a simple for loop, as has been discussed 
recently on the mailing list.


