[ https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joshua McKenzie updated CASSANDRA-11542:
----------------------------------------
    Reviewer: T Jake Luciani

> Create a benchmark to compare HDFS and Cassandra bulk read times
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-11542
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11542
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>         Attachments: jfr_recordings.zip, spark-load-perf-results-001.zip,
> spark-load-perf-results-002.zip, spark-load-perf-results-003.zip
>
>
> I propose creating a benchmark for comparing Cassandra and HDFS bulk reading
> performance. Simple Spark queries will be performed on data stored in HDFS or
> Cassandra, and the entire duration will be measured. An example query would
> be the max or min of a column or a count(*).
> This benchmark should allow determining the impact of:
> * partition size
> * number of clustering columns
> * number of value columns (cells)
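
The proposal above could be organised roughly as follows. This is only a minimal sketch of a timing harness, not the benchmark attached to the ticket: the names `time_query`, `benchmark_grid` and `run_benchmark` are hypothetical, the parameter values are placeholders (the ticket names the dimensions but not their values), and the Spark query itself is replaced by a stand-in callable so the sketch runs without a cluster.

```python
import itertools
import time

# Placeholder values: the ticket lists these dimensions but gives no numbers.
SOURCES = ["hdfs", "cassandra"]
PARTITION_SIZES = [100, 1_000, 10_000]
CLUSTERING_COLUMNS = [1, 2, 3]
VALUE_COLUMNS = [1, 5, 10]


def time_query(run_query):
    """Run one query end to end and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = run_query()
    return result, time.perf_counter() - start


def benchmark_grid():
    """Yield one configuration per combination of source and schema shape."""
    for source, size, clustering, values in itertools.product(
        SOURCES, PARTITION_SIZES, CLUSTERING_COLUMNS, VALUE_COLUMNS
    ):
        yield {
            "source": source,
            "partition_size": size,
            "clustering_columns": clustering,
            "value_columns": values,
        }


def run_benchmark(make_query):
    """Time the query built by make_query(config) for every configuration."""
    results = []
    for config in benchmark_grid():
        _, elapsed = time_query(make_query(config))
        results.append({**config, "seconds": elapsed})
    return results


if __name__ == "__main__":
    # Stand-in workload: a real run would submit a Spark max/min/count query
    # against the data set described by `config` instead.
    def make_query(config):
        rows = range(config["partition_size"])
        return lambda: max(rows)

    for row in run_benchmark(make_query):
        print(row)
```

Keeping the timing in one place (`time_query`) means the HDFS and Cassandra runs are measured the same way, so differences in the recorded durations come from the data source and schema shape rather than from the harness.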