Thank you, I will look into that option.

On Mon, Apr 17, 2023 at 3:29 AM Bowen Song via user <user@cassandra.apache.org> wrote:
> Using a frozen UDT for all the non-key columns is a good starting point.
> You can go a step further and use frozen UDTs for the partition keys and
> clustering keys too if appropriate. This alone will dramatically reduce the
> number of cells per row from 13 to 3, and save 77% of the deserialisation
> work for Cassandra.
>
> If the performance is still suboptimal after you've done the above, you
> should consider creating a batch process that reads the smaller rows from
> this table, combines them into bigger rows, and then stores each new row
> in another table which has the same partition key, but where each row is a
> frozen list containing many original rows. If you combine all rows from
> each partition of the old table into a single row in the new table, the
> read speed should be much faster. Keep in mind that this may not work if
> the partition size of the original table is too large (approximately
> >16MB), as the mutation size is limited to half of the commitlog
> segment size.
>
> On 12/04/2023 06:14, Gil Ganz wrote:
>
> Is there something I can do to speed up the deserialisation?
> In this example I did a count query, but in reality I need the actual data.
> The write pattern in this table is such that all data for a given row is
> written at the same time, so I know I can use a frozen UDT instead of this,
> making it faster, but I wonder if there is another way.
>
> On Tue, Apr 11, 2023 at 9:06 PM Bowen Song via user <user@cassandra.apache.org> wrote:
>
>> Reading 4MB from 70k rows and 13 columns (0.91 million cells) from disk
>> in 120ms doesn't sound bad. That's a lot of deserialisation to do. If you
>> want it to be faster, you can store the number of rows elsewhere, if that's
>> the only thing you need.
>>
>> On 11/04/2023 07:13, Gil Ganz wrote:
>>
>> Hey,
>> I have a 4.0.4 cluster, with reads of partitions that are a bit on the
>> bigger side taking longer than I would expect.
>> Reading an entire partition that has ~70000 rows, with a total partition
>> size of 4MB, takes 120ms. I would expect it to take less.
>>
>> This is after major compaction, so there is only one sstable. local_one
>> consistency level, no tombstones, and reading the entire partition in one
>> fetch. The cluster is not doing much else at the time, and the disk is nvme.
>> I can see most of the time is spent on getting the data from the sstable.
>>
>> Is there any specific optimization one can do to speed up cases like this?
>> I would expect fetching 4MB to take less; I assume if this was one blob
>> of 4MB, that would be the case.
>>
>> Table definition:
>>
>> CREATE TABLE ks1.item_to_item (
>>     x1 bigint,
>>     x2 bigint,
>>     x3 int,
>>     x4 int,
>>     y1 bigint,
>>     y2 bigint,
>>     y3 bigint,
>>     y4 bigint,
>>     y5 bigint,
>>     metadata text,
>>     m_metadata_created_at timestamp,
>>     m_metadata_created_by bigint,
>>     m_metadata_updated_at timestamp,
>>     PRIMARY KEY ((x1, x2, x3, x4), y1, y2, y3, y4, y5)
>> ) WITH CLUSTERING ORDER BY (y1 ASC, y2 ASC, y3 ASC, y4 ASC, y5 ASC)
>>
>> cqlsh> select count(0) from ks1.item_to_item where x1=4 and x2=70000 and x4=0 and x3=1;
>>
>>  count
>> -------
>>  70000
>>
>> (1 rows)
>>
>> Tracing session: 6356d290-d785-11ed-aba5-ab86979f2f58
>>
>>  activity                                                                                 | timestamp                  | source     | source_elapsed | client
>> ------------------------------------------------------------------------------------------+----------------------------+------------+----------------+-----------
>>  Execute CQL3 query                                                                       | 2023-04-10 09:52:21.561000 | 172.25.0.4 |              0 | 127.0.0.1
>>  Parsing XXXX [Native-Transport-Requests-1]                                               | 2023-04-10 09:52:21.561000 | 172.25.0.4 |            428 | 127.0.0.1
>>  Preparing statement [Native-Transport-Requests-1]                                        | 2023-04-10 09:52:21.562000 | 172.25.0.4 |            973 | 127.0.0.1
>>  Acquiring sstable references [ReadStage-2]                                               | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2255 | 127.0.0.1
>>  Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2]  | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2524 | 127.0.0.1
>>  Key cache hit for sstable 9 [ReadStage-2]                                                | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2692 | 127.0.0.1
>>  Merged data from memtables and 1 sstables [ReadStage-2]                                  | 2023-04-10 09:52:21.651000 | 172.25.0.4 |          90405 | 127.0.0.1
>>  Read 70000 live rows and 0 tombstone cells [ReadStage-2]                                 | 2023-04-10 09:52:21.651000 | 172.25.0.4 |          90726 | 127.0.0.1
>>  Request complete                                                                         | 2023-04-10 09:52:21.682603 | 172.25.0.4 |         121603 | 127.0.0.1
>>
>> gil
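
[Editor's note] For concreteness, the two suggestions in Bowen's reply could look roughly like the following CQL sketch. All type and table names here (item_key, item_ref, item_meta, item_row, item_to_item_v2, item_to_item_combined) are hypothetical illustrations, not from the thread:

```sql
-- Suggestion 1: frozen UDTs for the non-key columns, and optionally
-- for the partition and clustering keys as well, so each row is
-- stored as a few cells instead of many.
CREATE TYPE ks1.item_key  (x1 bigint, x2 bigint, x3 int, x4 int);
CREATE TYPE ks1.item_ref  (y1 bigint, y2 bigint, y3 bigint, y4 bigint, y5 bigint);
CREATE TYPE ks1.item_meta (
    metadata text,
    created_at timestamp,
    created_by bigint,
    updated_at timestamp
);

CREATE TABLE ks1.item_to_item_v2 (
    key  frozen<item_key>,   -- UDTs in a primary key must be frozen
    ref  frozen<item_ref>,
    meta frozen<item_meta>,
    PRIMARY KEY (key, ref)
);

-- Suggestion 2: a batch process combines all rows of a partition
-- into a single row holding a frozen list (subject to the mutation
-- size limit of half the commitlog segment size mentioned above).
CREATE TYPE ks1.item_row (
    ref  frozen<item_ref>,   -- UDTs nested in a UDT must be frozen
    meta frozen<item_meta>
);

CREATE TABLE ks1.item_to_item_combined (
    key  frozen<item_key> PRIMARY KEY,
    rows frozen<list<frozen<item_row>>>
);
```

Note that a frozen UDT or collection is serialised as a single cell, which is what cuts the deserialisation cost, but it can only be read and replaced as a whole; individual fields or list elements cannot be updated in place.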