Hi,
Not sure this is the cause of your bad performance, but you are measuring data 
creation and insertion together. Your data creation involves lots of class 
casts, which are probably quite slow.

Try timing only the b.send part and see how long that takes. 
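
As a rough sketch of what I mean (FakeBatch here is just a hypothetical stand-in for the pycassa batch object, so the timing pattern can be shown without a running cluster): build the row data first, then time the insert/send step on its own.

```python
import time
import random

# Hypothetical stand-in for cf.batch(...): records inserts, "sends" on demand.
# With pycassa you would use your real batch object here instead.
class FakeBatch:
    def __init__(self):
        self.rows = []
    def insert(self, key, columns):
        self.rows.append((key, columns))
    def send(self):
        return len(self.rows)

b = FakeBatch()

# Phase 1: data creation only.
t0 = time.time()
rows = []
for i in range(1000):
    columns = {str(j): str(random.randint(0, 100)) for j in range(10)}
    rows.append((str(i), columns))
t1 = time.time()

# Phase 2: insertion only.
for key, columns in rows:
    b.insert(key, columns)
sent = b.send()
t2 = time.time()

print("data creation: %.3f s" % (t1 - t0))
print("insert + send: %.3f s" % (t2 - t1))
```

If phase 1 dominates, the bottleneck is your Python loop, not Cassandra.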

Roland

On 03.05.2011 at 12:30, "charles THIBAULT" <charl.thiba...@gmail.com> wrote:

> Hello everybody, 
> 
> first: sorry for my English in advance!!
> 
> I'm getting started with Cassandra on a 5-node cluster, inserting data
> with the pycassa API.
> 
> I've read everywhere on the internet that Cassandra's write performance is 
> better than MySQL's,
> because writes only append to the commit log files.
> 
> When I try to insert 100,000 rows with 10 columns per row using a batch 
> insert, it takes 27 seconds.
> But with MySQL (LOAD DATA INFILE) it takes only 2 seconds (using indexes).
> 
> Here my configuration
> 
> cassandra version: 0.7.5
> nodes : 192.168.1.210, 192.168.1.211, 192.168.1.212, 192.168.1.213, 
> 192.168.1.214
> seed: 192.168.1.210
> 
> My script
> *************************************************************************************************************
> #!/usr/bin/env python
> 
> import pycassa
> import time
> import random
> from cassandra import ttypes
> 
> pool = pycassa.connect('test', ['192.168.1.210:9160'])
> cf = pycassa.ColumnFamily(pool, 'test')
> b = cf.batch(queue_size=50, 
> write_consistency_level=ttypes.ConsistencyLevel.ANY)
> 
> tps1 = time.time()
> for i in range(100000):
>     columns = dict()
>     for j in range(10):
>         columns[str(j)] = str(random.randint(0,100))
>     b.insert(str(i), columns)
> b.send()
> tps2 = time.time()
> 
> 
> print("execution time: " + str(tps2 - tps1) + " seconds")
> *************************************************************************************************************
> 
> What am I doing wrong?
