On 09/10/13 00:26, Leena Gupta wrote:

I do have an additional question related to Cassandra & Python. As part
of data processing, I need to fetch slices of data from Cassandra and
run computations like sum and percentile calculation on it.

Sorry, I've never even heard of Cassandra before.

So for calculating the sum & percentile in Python, some of the data
slices on Cassandra could fetch a lot of rows (e.g. 750,000 to 1 million
rows) … And since I need to compute a sum and percentile, I need to
consider all the rows.

But not all at the same time. For the sum you can keep a running total and a count as you read the rows, assuming the API supports an iterative read (I've no experience there).
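
A minimal sketch of the running-total idea, assuming the rows can be consumed one at a time from some iterator. The fake_rows generator below is a hypothetical stand-in for whatever iterative read the Cassandra client actually offers; it is not a real Cassandra call.

```python
def running_sum_and_count(rows):
    """Consume an iterable of numbers once, keeping only a total and a count."""
    total = 0.0
    count = 0
    for value in rows:
        total += value
        count += 1
    return total, count


def fake_rows(n):
    """Hypothetical stand-in for an iterative read from the database."""
    for i in range(n):
        yield float(i % 100)


# Only one row is held in memory at any moment.
total, count = running_sum_and_count(fake_rows(1000000))
print(total, count)
```

This covers the sum (and the mean, as total / count). An exact percentile is a different matter, since it depends on the ordering of all the values.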

But if the rows are short, even a million rows shouldn't be a big problem given your RAM. And assuming you are using 64-bit, of
course...
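
As a rough illustration of the in-memory route, here is a sketch that holds a million floats in a plain list and takes a percentile by sorting. The nearest-rank definition used here is just one common convention and is my assumption, as is the made-up sample data; the size estimate is only approximate and varies by Python build.

```python
import math
import sys


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence, for 0 < pct <= 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least pct% of the data at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up data standing in for one column of a large slice.
values = [float(i % 1000) for i in range(1000000)]

# Rough memory estimate: the list's pointer array plus one float object per entry.
approx_bytes = sys.getsizeof(values) + len(values) * sys.getsizeof(0.0)
print("approx. MB:", approx_bytes / 1e6)

p95 = percentile(values, 95)
print("sum:", sum(values), "95th percentile:", p95)
```

On a typical 64-bit CPython this should come to a few tens of megabytes, so a single numeric column of that size fits comfortably. Wider rows will cost proportionally more.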

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
