Re: Adding large text blob causes read timeout...
oh.. the difference between the ONE field and the remaining 29 is massive.

It's about 200ms for just the 29 columns. Adding the extra one causes it to time out at 5000ms.

On Mon, Jun 23, 2014 at 10:30 PM, DuyHai Doan doanduy...@gmail.com wrote:

> Don't forget that when you do the SELECT with the limit set to 1000, Cassandra is actually fetching 1000 * 29 physical columns (29 fields per logical row). Adding one extra big html column may be too much and cause the timeout.
>
> Try to:
> 1. Select only the big html column
> 2. Or reduce the limit incrementally until there is no timeout
>
> On 24 June 2014 at 06:22, Kevin Burton bur...@spinn3r.com wrote:
>
>> I have a table with a schema mostly of small fields. About 30 of them.
>>
>> The primary key is:
>>
>> primary key( bucket, sequence )
>>
>> … I have 100 buckets and the idea is that sequence is ever increasing. This way I can read from bucket zero, and everything after sequence N, and get all the writes ordered by time.
>>
>> I'm running
>>
>> SELECT ... FROM content WHERE bucket=0 AND sequence > 0 ORDER BY sequence ASC LIMIT 1000;
>>
>> … using the Java driver.
>>
>> If I add ALL the fields except one, so 29 fields, the query is fast. Only 129ms…
>>
>> However, if I add the 'html' field, which is obviously a snapshot of the HTML, the query times out…
>>
>> I'm going to add tracing and try to track it down further, but I suspect I'm doing something stupid.
>>
>> Is it going to burn me that the data is UTF8 encoded? I can't imagine decoding UTF8 is going to be THAT slow, but perhaps Cassandra is doing something silly under the covers?
>>
>> cqlsh doesn't time out … it actually works fine, but it uses 100% CPU while writing out the data, so unfortunately it's not a good comparison.
>>
>> Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ...:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
>>     at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
>>     at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
>>     at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:172)
>>     at com.datastax.driver.core.SessionManager.execute(SessionManager.java:92)
>>     at com.spinn3r.artemis.robot.console.BenchmarkContentStream.main(BenchmarkContentStream.java:100)
>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: dev4.wdc.sl.spinn3r.com/10.24.23.94:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
>>     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
>>     at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:724)

--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
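
One way to act on the "reduce the limit" suggestion quoted in this message while still reading a whole bucket is to fetch small pages and resume each page after the last sequence already seen. The sketch below is only an illustration: a `TreeMap` stands in for one bucket of the `content` table, and `BucketPager`, `fetchPage`, and `readAll` are hypothetical names, not part of the driver or of the code in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: an in-memory sorted map stands in for one bucket of the table.
final class BucketPager {

    /**
     * Stand-in for a query of the shape
     * SELECT ... WHERE bucket = ? AND sequence > ? ORDER BY sequence ASC LIMIT ?
     */
    static List<Map.Entry<Long, String>> fetchPage(
            NavigableMap<Long, String> bucket, long afterSequence, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        List<Map.Entry<Long, String>> page = new ArrayList<>();
        // tailMap(key, false) yields only entries strictly after afterSequence.
        for (Map.Entry<Long, String> entry : bucket.tailMap(afterSequence, false).entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(entry);
        }
        return page;
    }

    /** Reads everything after startSequence, one small page at a time, in sequence order. */
    static List<Long> readAll(NavigableMap<Long, String> bucket, long startSequence, int pageSize) {
        List<Long> seen = new ArrayList<>();
        long last = startSequence;
        while (true) {
            List<Map.Entry<Long, String>> page = fetchPage(bucket, last, pageSize);
            if (page.isEmpty()) {
                break; // nothing left after the last sequence we saw
            }
            for (Map.Entry<Long, String> entry : page) {
                seen.add(entry.getKey());
            }
            // Resume the next page after the highest sequence in this page.
            last = page.get(page.size() - 1).getKey();
        }
        return seen;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> bucket = new TreeMap<>();
        for (long seq = 1; seq <= 10; seq++) {
            bucket.put(seq, "<html>snapshot " + seq + "</html>");
        }
        System.out.println(readAll(bucket, 3, 4));
    }
}
```

Each page then carries only `pageSize` html snapshots, so the per-request payload stays bounded regardless of how many rows the bucket holds.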
Re: Adding large text blob causes read timeout...
Yes, but the extra column also ends up multiplied by 1000. The LIMIT in CQL3 specifies the number of logical rows, not the number of physical columns in the storage engine.

On 24 June 2014 at 08:30, Kevin Burton bur...@spinn3r.com wrote:

> oh.. the difference between the ONE field and the remaining 29 is massive.
>
> It's about 200ms for just the 29 columns. Adding the extra one causes it to time out at 5000ms.
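
The arithmetic behind this point can be sketched as follows. The LIMIT of 1000 and the 29 small fields come from the thread; the byte sizes are assumptions chosen purely for illustration, not measurements of the actual table, and `ResponseSizeEstimate` is a hypothetical helper name.

```java
// Back-of-the-envelope estimate. All byte sizes are assumed, not measured.
final class ResponseSizeEstimate {

    /** Physical columns read for a query that returns `rows` logical rows. */
    static long physicalColumns(long rows, long columnsPerRow) {
        return rows * columnsPerRow;
    }

    /** Approximate payload in bytes: every selected column is paid for once per logical row. */
    static long payloadBytes(long rows, long smallColumns, long avgSmallBytes, long extraColumnBytes) {
        return rows * (smallColumns * avgSmallBytes + extraColumnBytes);
    }

    public static void main(String[] args) {
        long limit = 1000;         // LIMIT used in the thread
        long smallColumns = 29;    // the 29 small fields
        long avgSmallBytes = 50;   // ASSUMPTION: average size of one small field
        long htmlBytes = 100_000;  // ASSUMPTION: average size of one html snapshot

        System.out.println("physical columns without html: " + physicalColumns(limit, smallColumns));
        System.out.println("physical columns with html:    " + physicalColumns(limit, smallColumns + 1));
        System.out.println("payload bytes without html:    "
                + payloadBytes(limit, smallColumns, avgSmallBytes, 0));
        System.out.println("payload bytes with html:       "
                + payloadBytes(limit, smallColumns, avgSmallBytes, htmlBytes));
    }
}
```

Under these assumed sizes the column count grows only from 29,000 to 30,000, but the payload grows from roughly 1.45 MB to roughly 101 MB, which is why one additional column can dominate the cost of the read.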
Re: Adding large text blob causes read timeout...
Can you run your query in the CLI after setting tracing on?

On Mon, Jun 23, 2014 at 11:32 PM, DuyHai Doan doanduy...@gmail.com wrote:

> Yes, but the extra column also ends up multiplied by 1000. The LIMIT in CQL3 specifies the number of logical rows, not the number of physical columns in the storage engine.

--
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade
Re: Adding large text blob causes read timeout...
Don't forget that when you do the SELECT with the limit set to 1000, Cassandra is actually fetching 1000 * 29 physical columns (29 fields per logical row). Adding one extra big html column may be too much and cause the timeout.

Try to:
1. Select only the big html column
2. Or reduce the limit incrementally until there is no timeout

On 24 June 2014 at 06:22, Kevin Burton bur...@spinn3r.com wrote:

> I'm running
>
> SELECT ... FROM content WHERE bucket=0 AND sequence > 0 ORDER BY sequence ASC LIMIT 1000;
>
> … using the Java driver.
>
> If I add ALL the fields except one, so 29 fields, the query is fast. Only 129ms…
>
> However, if I add the 'html' field, which is obviously a snapshot of the HTML, the query times out…
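
The second suggestion (reduce the limit until the timeout disappears) could be automated roughly as below. This is a minimal sketch, not the driver's API: `LimitProbe` and `firstWorkingLimit` are hypothetical names, and `querySucceeds` is an assumed hook that real code would implement by running the query with the given LIMIT and returning false when the driver reports a read timeout. Halving is one arbitrary choice of step; the original advice only says to reduce incrementally.

```java
import java.util.function.IntPredicate;

// Sketch: halve the LIMIT until the query stops timing out.
final class LimitProbe {

    /**
     * Tries startLimit, then startLimit / 2, and so on.
     * Returns the first limit for which querySucceeds returns true,
     * or 0 if even LIMIT 1 fails.
     */
    static int firstWorkingLimit(int startLimit, IntPredicate querySucceeds) {
        for (int limit = startLimit; limit >= 1; limit /= 2) {
            if (querySucceeds.test(limit)) {
                return limit;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Fake predicate for demonstration: pretend anything above 100 rows times out.
        IntPredicate fakeQuery = limit -> limit <= 100;
        System.out.println("first working limit: " + firstWorkingLimit(1000, fakeQuery));
    }
}
```

With the fake predicate above, the probe tries 1000, 500, 250, 125 and then succeeds at 62. The result is a working limit, not necessarily the largest one; a finer search between 62 and 125 would be needed to find the true threshold.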