On 8/10/2013 16:46, Leena Gupta wrote:
> Hello,
>
> Looking for some input on Python's csv processing features.
>
> I need to process a large csv file every 5-10 minutes. The file could
> contain 3 million to 10 million rows, and its size could be 6MB to
> 10MB(+). As part of the processing, I need to sum a numeric value,
> grouping on certain attributes, and store the output in a datastore. I
> wanted to know whether Python is recommended and whether it can be used
> for processing csv files of this size. Are there any issues we need to
> be aware of? I believe Python has a csv library as well.
>
> Thanks!
Please use plain-text messages here, not HTML. HTML not only wastes space, it frequently messes up formatting.

Python's csv module should have no problem dealing with a file of 10 million rows. As long as you're not trying to keep all 10 million of them in some internal data structure, the csv reader will hand you one row at a time, in an entirely incremental fashion. Just make sure the particular datastore you require is supported in Python.

-- 
DaveA

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
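As a minimal sketch of the row-at-a-time approach described above: the standard-library csv module can stream the file while a dict accumulates the grouped sums, so memory use depends on the number of distinct groups rather than the number of rows. The column names (`region`, `product`, `amount`) and the `aggregate` helper are hypothetical, not from the original thread.

```python
import csv
from collections import defaultdict

def aggregate(path, group_cols, value_col):
    """Stream a CSV file row by row, summing value_col grouped by group_cols.

    Only the running totals are kept in memory, never the whole file.
    """
    totals = defaultdict(float)
    with open(path, newline="") as f:
        reader = csv.DictReader(f)  # yields one row (as a dict) at a time
        for row in reader:
            key = tuple(row[c] for c in group_cols)
            totals[key] += float(row[value_col])
    return totals

# Hypothetical usage with made-up column names:
# totals = aggregate("sales.csv", ["region", "product"], "amount")
```

The resulting `totals` dict (group key tuple -> sum) can then be written to whatever datastore is in use.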