Daniel Klessing created STORM-2242:
--------------------------------------

             Summary: Trident state persisting does not honor batch.size.rows configuration
                 Key: STORM-2242
                 URL: https://issues.apache.org/jira/browse/STORM-2242
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-cassandra
    Affects Versions: 1.0.2
            Reporter: Daniel Klessing


Persisting the Trident state in 
{{org.apache.storm.cassandra.trident.state.CassandraState}} with batching 
enabled does not honor the configuration for {{cassandra.batch.size.rows}}.

At a minimum, this produces a warning:
{code}
10:33:33.720 [SharedPool-Worker-16] WARN  o.a.c.cql3.statements.BatchStatement - Batch of prepared statements for [gin.ngram_count] is of size 5200, exceeding specified threshold of 5120 by 80.
{code}

An exception like this is also possible:
{code}
10:30:54.287 [SharedPool-Worker-1] ERROR o.a.c.cql3.statements.BatchStatement - Batch of prepared statements for [gin.df] is of size 103428, exceeding specified threshold of 51200 by 52228. (see batch_size_fail_threshold_in_kb)
10:30:54.295 [Thread-29-b-1-executor[7 7]] WARN  o.a.s.c.trident.state.CassandraState - Batch write operation is failed.
10:30:54.297 [Thread-29-b-1-executor[7 7]] ERROR org.apache.storm.daemon.executor -
com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large
    at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50) ~[cassandra-driver-core-3.1.0.jar:na]
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) ~[cassandra-driver-core-3.1.0.jar:na]
    at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) ~[cassandra-driver-core-3.1.0.jar:na]
    at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64) ~[cassandra-driver-core-3.1.0.jar:na]
    at org.apache.storm.cassandra.trident.state.CassandraState.updateState(CassandraState.java:159) ~[storm-cassandra-1.0.2.IQSER_20161212.jar:1.0.2.IQSER_20161212]
    at org.apache.storm.cassandra.trident.state.CassandraStateUpdater.updateState(CassandraStateUpdater.java:34) [storm-cassandra-1.0.2.IQSER_20161212.jar:1.0.2.IQSER_20161212]
    at org.apache.storm.cassandra.trident.state.CassandraStateUpdater.updateState(CassandraStateUpdater.java:30) [storm-cassandra-1.0.2.IQSER_20161212.jar:1.0.2.IQSER_20161212]
    at org.apache.storm.trident.planner.processor.PartitionPersistProcessor.finishBatch(PartitionPersistProcessor.java:98) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.trident.planner.SubtopologyBolt.finishBatch(SubtopologyBolt.java:151) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.trident.topology.TridentBoltExecutor.finishBatch(TridentBoltExecutor.java:266) [storm-core-1.0.2.jar:1.0.2]
{code}

This effectively makes batching unusable.
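
For illustration, the behavior I would expect from a row limit can be sketched as below: the statements collected for one Trident batch are split into chunks of at most {{cassandra.batch.size.rows}} entries, and each chunk would then be sent as its own Cassandra batch. This is only a sketch under assumptions, not the actual {{CassandraState}} code: the class {{BatchChunker}} and the method {{partition}} are hypothetical names, and the statements are represented by a generic list so that the example has no driver dependency. Note also that a row limit alone does not guarantee that a chunk stays below the byte-based thresholds shown in the logs above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of storm-cassandra.
class BatchChunker {

    /**
     * Splits the given items into consecutive chunks of at most maxRows
     * elements each. The last chunk may be smaller than maxRows.
     */
    static <T> List<List<T>> partition(List<T> items, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxRows) {
            int end = Math.min(start + maxRows, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

With 250 statements and a limit of 100 rows, {{partition}} yields three chunks of 100, 100 and 50 statements, each of which could be executed as a separate batch.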



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)