Re: Adding large text blob causes read timeout...

2014-06-24 Thread Kevin Burton
Oh, the difference between the ONE field and the remaining 29 is
massive.

It's about 200 ms for just the 29 columns. Adding the extra one causes it to
time out at 5000 ms.


On Mon, Jun 23, 2014 at 10:30 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Don't forget that when you do the Select with limit set to 1000, Cassandra
 is actually fetching 1000 * 29 physical columns (29 fields per logical
 row).

 Adding one extra big html column may be too much and cause timeout. Try to:

 1. Select only the big html column
 2. Or reduce the limit incrementally until the query no longer times out

 On 24 June 2014 at 06:22, Kevin Burton bur...@spinn3r.com wrote:

 I have a table with a schema mostly of small fields.  About 30 of them.

 The primary key is:

 primary key( bucket, sequence )

 … I have 100 buckets and the idea is that sequence is ever increasing.
  This way I can read from bucket zero, and everything after sequence N and
 get all the writes ordered by time.

 I'm running

 SELECT ... FROM content WHERE bucket=0 AND sequence>0 ORDER BY sequence
 ASC LIMIT 1000;

 … using the Java driver.

 If I add ALL the fields, except one, so 29 fields, the query is fast.
  Only 129ms….

 However, if I add the 'html' field, which is obviously a snapshot of the
 HTML, the query times out…

 I'm going to add tracing and try to track it down further, but I suspect
 I'm doing something stupid.

 Is it going to burn me that the data is UTF8 encoded? I can't imagine
 decoding UTF8 is going to be THAT slow but perhaps Cassandra is doing
 something silly under the covers?

 cqlsh doesn't time out … it actually works fine but it uses 100% CPU
 while writing out the data so it's not a good comparison unfortunately


 Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ...:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
     at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
     at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
     at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:172)
     at com.datastax.driver.core.SessionManager.execute(SessionManager.java:92)
     at com.spinn3r.artemis.robot.console.BenchmarkContentStream.main(BenchmarkContentStream.java:100)
 Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: dev4.wdc.sl.spinn3r.com/10.24.23.94:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
     at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:724)


 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are
 people.






Re: Adding large text blob causes read timeout...

2014-06-24 Thread DuyHai Doan
Yes, but the extra column also gets multiplied by 1000. The LIMIT in CQL3
specifies the number of logical rows, not the number of physical columns in
the storage engine.
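
To make the multiplication concrete, here is a rough Java sketch of the arithmetic. The class and method names are made up for illustration, and the byte sizes (50 bytes per small field, 100 KB per html snapshot) are pure assumptions, not measurements from this table; plug in your own averages.

```java
final class FetchEstimate {

    // LIMIT counts logical rows, so each selected column is read once per row.
    public static long cellsFetched(int limit, int selectedColumns) {
        return (long) limit * selectedColumns;
    }

    // Very rough payload estimate in bytes. Both average sizes are assumptions
    // supplied by the caller; pass 0 for avgHtmlBytes when html is not selected.
    public static long estimatedBytes(int limit, int smallColumns,
                                      long avgSmallBytes, long avgHtmlBytes) {
        return limit * (smallColumns * avgSmallBytes + avgHtmlBytes);
    }

    public static void main(String[] args) {
        int limit = 1000;

        System.out.println(cellsFetched(limit, 29)); // 29000
        System.out.println(cellsFetched(limit, 30)); // 30000

        // Assumed sizes, not measured: 50 bytes per small field, 100 KB per html value.
        System.out.println(estimatedBytes(limit, 29, 50, 0));          // 1450000
        System.out.println(estimatedBytes(limit, 29, 50, 100 * 1024)); // 103850000
    }
}
```

Under those assumed sizes the cell count barely moves (29000 to 30000), but the bytes to read and ship grow by roughly 70x, which is the kind of jump that could turn a 200 ms query into a timeout.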




Re: Adding large text blob causes read timeout...

2014-06-24 Thread Jonathan Haddad
Can you run your query in the CLI after setting tracing on?






-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Re: Adding large text blob causes read timeout...

2014-06-23 Thread DuyHai Doan
Don't forget that when you do the Select with limit set to 1000, Cassandra
is actually fetching 1000 * 29 physical columns (29 fields per logical
row).

Adding one extra big html column may be too much and cause timeout. Try to:

1. Select only the big html column
2. Or reduce the limit incrementally until the query no longer times out
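
The second suggestion can be sketched in Java as a simple halving loop. This is only an illustration: `LimitBackoff`, `findWorkingLimit` and the `completesInTime` probe are hypothetical names, and the probe is a stand-in for "run the SELECT with this LIMIT through the driver and report whether it came back before the timeout". No real driver calls are shown.

```java
import java.util.function.IntPredicate;

final class LimitBackoff {

    // Halves the limit until the probe reports success.
    // Returns the first limit that worked, or -1 if none down to minLimit did.
    public static int findWorkingLimit(int startLimit, int minLimit,
                                       IntPredicate completesInTime) {
        if (minLimit < 1) {
            throw new IllegalArgumentException("minLimit must be at least 1");
        }
        for (int limit = startLimit; limit >= minLimit; limit /= 2) {
            if (completesInTime.test(limit)) {
                return limit;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Fake probe for demonstration only: pretend anything above 300 rows times out.
        IntPredicate fakeQuery = limit -> limit <= 300;

        // Tries 1000, then 500, then 250.
        System.out.println(findWorkingLimit(1000, 1, fakeQuery)); // 250
    }
}
```

In real use the probe would execute the query and catch the timeout exception. The limit that finally works gives a rough idea of how many html values fit in one request.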