It is interesting, the press c* got about supporting 2 billion columns in a row. You *can* do it, but it brings to light some realities of what that means.
On Sun, Feb 3, 2013 at 8:09 AM, Takenori Sato <ts...@cloudian.com> wrote:
> Hi Aaron,
>
> Thanks for your answers. That helped me get the big picture.
>
> Yes, it contains a big row that grows up to 2GB, with more than a million
> columns.
>
> Let me confirm that I understand correctly.
>
> - The stack trace is from a Slice By Names query, and the deserialization
> happens at step 3, "Read the row level Bloom Filter", on your blog.
>
> - BloomFilterSerializer#deserialize calls readLong iteratively for each 4K
> page of a given row, which could mean 500,000 calls (2GB / 4KB) for a 2GB
> row (from the 1.0.7 source).
>
> Correct?
>
> That makes sense: Slice By Names queries against such a wide row could be a
> CPU bottleneck. In fact, in our test environment, a single
> BloomFilterSerializer#deserialize of such a row takes more than 10ms, and
> up to 100ms.
>
>> Get a single named column.
>> Get the first 10 columns using the natural column order.
>> Get the last 10 columns using the reversed order.
>
> Interesting. So the query pattern could make a difference?
>
> We thought the only solution was to change the data structure (that is,
> avoid such a wide row when it is retrieved by Slice By Names queries).
>
> Anyway, we will give it a try!
>
> Best,
> Takenori
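For reference, the loop Takenori is describing has roughly this shape in the 1.0.x line. This is a sketch for illustration rather than the verbatim source: the exact header layout (hash count, then word count, then the raw bitset words) is an assumption, chosen to be consistent with the readLong-dominated profile further down the thread.

    import java.io.DataInput;
    import java.io.IOException;

    public final class BloomFilterShape
    {
        // Approximate shape of BloomFilterSerializer#deserialize in 1.0.x.
        // Assumed layout: hash count (int), word count (int), then the raw
        // 64-bit words of the bitset.
        public static long[] deserialize(DataInput in) throws IOException
        {
            int hashCount = in.readInt();  // number of hash functions
            int wordCount = in.readInt();  // number of 64-bit words in the bitset
            long[] bits = new long[wordCount];
            for (int i = 0; i < wordCount; i++)
                bits[i] = in.readLong();   // one readLong per word: this loop is
                                           // where java.io.DataInputStream.readLong
                                           // shows up at 66.7% in the profile below
            return bits;
        }
    }

A row-level filter sized for a million-plus column names means this loop runs many thousands of times per row read, which is why it becomes visible to a sampling profiler at all.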
> On Sat, Feb 2, 2013 at 2:55 AM, aaron morton <aa...@thelastpickle.com>
> wrote:
>>
>> 5. the problematic Data file contains only 5 to 10 keys' data, but it is
>> large (2.4G)
>>
>> So, very large rows?
>> What does nodetool cfstats or cfhistograms say about the row sizes?
>>
>> 1. what is happening?
>>
>> I think this is partially the large rows and partially the query pattern.
>> This is only roughly correct, but see
>> http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and my talk here
>> http://www.datastax.com/events/cassandrasummit2012/presentations
>>
>> 3. any more info required to proceed?
>>
>> Do some tests with different query techniques…
>>
>> Get a single named column.
>> Get the first 10 columns using the natural column order.
>> Get the last 10 columns using the reversed order.
>>
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 31/01/2013, at 7:20 PM, Takenori Sato <ts...@cloudian.com> wrote:
>>
>> Hi all,
>>
>> We have a situation where the CPU load on some of our nodes in a cluster
>> has spiked occasionally since last November, triggered by requests for
>> rows that reside on two specific sstables.
>>
>> We confirmed the following (when the load spiked):
>>
>> version: 1.0.7 (current) <- 0.8.6 <- 0.8.5 <- 0.7.8
>> jdk: Oracle 1.6.0
>>
>> 1. profiling showed that BloomFilterSerializer#deserialize was the
>> hotspot (70% of the total load across running threads)
>>
>> * the stack trace looked like this (simplified)
>> 90.4% - org.apache.cassandra.db.ReadVerbHandler.doVerb
>> 90.4% - org.apache.cassandra.db.SliceByNamesReadCommand.getRow
>> ...
>> 90.4% - org.apache.cassandra.db.CollationController.collectTimeOrderedData
>> ...
>> 89.5% - org.apache.cassandra.db.columniterator.SSTableNamesIterator.read
>> ...
>> 79.9% - org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter
>> 68.9% - org.apache.cassandra.io.sstable.BloomFilterSerializer.deserialize
>> 66.7% - java.io.DataInputStream.readLong
>>
>> 2. usually this step is so fast that a sampling profiler cannot detect it
>>
>> 3. no pressure on Cassandra's VM heap, nor on the machine overall
>>
>> 4. little I/O traffic across our 8 disks/node (up to 100 tps/disk by
>> "iostat 1 1000")
>>
>> 5. the problematic Data file contains only 5 to 10 keys' data, but it is
>> large (2.4G)
>>
>> 6. the problematic Filter file size is only 256B (could be normal)
>>
>> So now, I am trying to read the Filter file in the same way
>> BloomFilterSerializer#deserialize does, as closely as I can, in order to
>> see if something is wrong with the file.
>>
>> Could you give me some advice on:
>>
>> 1. what is happening?
>> 2. the best way to simulate BloomFilterSerializer#deserialize?
>> 3. any more info required to proceed?
>>
>> Thanks,
>> Takenori
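On Takenori's closing question 2, a minimal way to simulate the deserialize is to read a Filter file with a plain DataInputStream and time the readLong loop. A sketch, assuming the same two-int header as in the snippet above; the file name is a hypothetical placeholder, and the layout should be checked against what the running version's serializer actually writes.

    import java.io.BufferedInputStream;
    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public final class FilterReadTimer
    {
        public static void main(String[] args) throws IOException
        {
            // Hypothetical file name; pass the real -Filter.db path as an argument.
            String path = args.length > 0 ? args[0] : "MyCF-hc-1234-Filter.db";
            long start = System.nanoTime();
            try (DataInputStream in = new DataInputStream(
                     new BufferedInputStream(new FileInputStream(path))))
            {
                int hashCount = in.readInt();  // assumed header: hash count ...
                int wordCount = in.readInt();  // ... then number of 64-bit words
                for (int i = 0; i < wordCount; i++)
                    in.readLong();             // the call that dominates the profile
                double ms = (System.nanoTime() - start) / 1_000_000.0;
                System.out.printf("hashes=%d, words=%d, elapsed=%.2f ms%n",
                                  hashCount, wordCount, ms);
            }
        }
    }

Comparing the elapsed time against the 10-100ms figures reported above should show whether the cost is in the file itself or in per-call overhead on the read path.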
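Aaron's three suggested test queries map onto the Thrift API that ships with 1.0.x roughly as follows. The keyspace, column family, row key, and column names here are hypothetical placeholders.

    import java.nio.ByteBuffer;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public final class SliceTests
    {
        public static void main(String[] args) throws Exception
        {
            TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            client.set_keyspace("MyKeyspace");                  // hypothetical keyspace

            ByteBuffer key = ByteBuffer.wrap("wide-row-key".getBytes("UTF-8"));
            ColumnParent parent = new ColumnParent("MyWideCF"); // hypothetical CF

            // 1. Get a single named column.
            SlicePredicate byName = new SlicePredicate();
            byName.setColumn_names(Arrays.asList(ByteBuffer.wrap("col00001".getBytes("UTF-8"))));
            List<ColumnOrSuperColumn> one = client.get_slice(key, parent, byName, ConsistencyLevel.ONE);

            // 2. Get the first 10 columns using the natural column order.
            SlicePredicate first10 = new SlicePredicate();
            first10.setSlice_range(new SliceRange(ByteBuffer.allocate(0), ByteBuffer.allocate(0), false, 10));
            List<ColumnOrSuperColumn> head = client.get_slice(key, parent, first10, ConsistencyLevel.ONE);

            // 3. Get the last 10 columns using the reversed order.
            SlicePredicate last10 = new SlicePredicate();
            last10.setSlice_range(new SliceRange(ByteBuffer.allocate(0), ByteBuffer.allocate(0), true, 10));
            List<ColumnOrSuperColumn> tail = client.get_slice(key, parent, last10, ConsistencyLevel.ONE);

            System.out.printf("named=%d, first10=%d, last10=%d%n", one.size(), head.size(), tail.size());
            transport.close();
        }
    }

Timing each of the three against the wide row should show whether the access pattern, rather than the row size alone, drives the deserialize cost, which is the point of Aaron's query-plans post linked above.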