In article <0044bfd0-f07f-4f7b-b976-5df034b6f...@googlegroups.com>, Harsh Jha <harshjha2...@gmail.com> wrote:
> I've a huge csv file and I want to read stuff from it again and again. Is it
> useful to pickle it and keep and then unpickle it whenever I need to use that
> data? Is it faster that accessing that file simply by opening it again and
> again? Please explain, why?
>
> Thank you.

It can be. I did a project a bunch of years ago which involved reading (and parsing) SNMP MIBs before you could do any work. Startup took something like 10-20 seconds. If I pre-parsed the MIBs and wrote out the data structures as pickles, I could cut startup time to a couple of seconds.

But that's because the parsing I was doing was pretty complicated. Parsing a CSV file is much simpler, so I wouldn't expect you to see much improvement reading a pickle file vs. reading the original CSV.

The bottom line is: you should try it. Pickling a data structure is about one line of code (not counting the 'import cPickle'). Try it and see what happens. Time how long it takes to read the original file, and how long it takes to read the pickle. Let us know your results.

Also, let us know what "huge" means. 1000 rows? A million? 100 million?
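A minimal sketch of that experiment, for anyone who wants to try it. This uses Python 3, where cPickle was folded into the standard pickle module, and generates its own small sample CSV as a stand-in for your "huge" file; the file names and row counts are just placeholders:

```python
import csv
import os
import pickle
import tempfile
import time

# Build a small sample CSV (stand-in for the real "huge" file).
rows = [[str(i), "name%d" % i, str(i * 2)] for i in range(10000)]
csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Time reading and parsing the CSV.
t0 = time.perf_counter()
with open(csv_path, newline="") as f:
    parsed = list(csv.reader(f))
csv_time = time.perf_counter() - t0

# Pickling the parsed structure really is about one line of code.
pkl_path = csv_path + ".pkl"
with open(pkl_path, "wb") as f:
    pickle.dump(parsed, f, protocol=pickle.HIGHEST_PROTOCOL)

# Time loading the same data back from the pickle.
t0 = time.perf_counter()
with open(pkl_path, "rb") as f:
    from_pickle = pickle.load(f)
pkl_time = time.perf_counter() - t0

print("CSV read:    %.4fs" % csv_time)
print("Pickle load: %.4fs" % pkl_time)
```

Run it against your actual file (skip the sample-generation step) and compare the two numbers; the answer will depend on how big the file is and how much parsing you do beyond the raw CSV read.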