Hi Pushkar,

On 18/07/2013 08:45, Pushkar Raj Pande wrote:
> Both loadtxt and genfromtxt read the entire data into memory which is not
> desirable. Is there a way to achieve streaming writes?
>

OK, fromfile [1] can probably help you put together something that works
without loading the entire file into memory (and without too many
iterations over the file).

In any case, I strongly recommend that you not perform read/write cycles on
single lines. Instead, define a reasonable data block size (number of rows)
and process the file in chunks.

If you find a reasonably simple solution, it would be nice to include it in
our documentation as an example or a "recipe" [2].

[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#numpy.fromfile
[2] http://pytables.github.io/latest/cookbook/index.html

best regards

antonio

> Thanks,
> Pushkar
>
>
> On Wed, Jul 17, 2013 at 7:04 PM, Pushkar Raj Pande <topgun...@gmail.com> wrote:
>
>> Thanks Antonio and Anthony. I will give this a try.
>>
>> -Pushkar
>>
>>
>> On Wed, Jul 17, 2013 at 2:59 PM,
>> <pytables-users-requ...@lists.sourceforge.net> wrote:
>>
>>> Date: Wed, 17 Jul 2013 16:59:16 -0500
>>> From: Anthony Scopatz <scop...@gmail.com>
>>> Subject: Re: [Pytables-users] Pytables bulk loading data
>>> To: Discussion list for PyTables
>>>     <pytables-users@lists.sourceforge.net>
>>> Message-ID:
>>>     <capk-6t4ht9+ncdd_1oojrbn4u_6+ouekobklmokeufjojjk...@mail.gmail.com>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> Hi Pushkar,
>>>
>>> I agree with Antonio. You should load your data with NumPy functions and
>>> then write back out to PyTables. This is the fastest way to do things.
>>>
>>> Be Well
>>> Anthony
>>>
>>>
>>> On Wed, Jul 17, 2013 at 2:12 PM, Antonio Valentino
>>> <antonio.valent...@tiscali.it> wrote:
>>>
>>>> Hi Pushkar,
>>>>
>>>> On 17/07/2013 19:28, Pushkar Raj Pande wrote:
>>>>> Hi all,
>>>>>
>>>>> I am trying to figure out the best way to bulk load data into pytables.
>>>>> This question may have been already answered but I couldn't find what I
>>>>> was looking for.
>>>>>
>>>>> The source data is in form of csv which may require parsing, type
>>>>> checking and setting default values if it doesn't conform to the type of
>>>>> the column. There are over 100 columns in a record. Doing this in a loop
>>>>> in python for each row of the record is very slow compared to just
>>>>> fetching the rows from one pytable file and writing it to another.
>>>>> Difference is almost a factor of ~50.
>>>>>
>>>>> I believe if I load the data using a C procedure that does the parsing
>>>>> and builds the records to write in pytables I can get close to the speed
>>>>> of just copying and writing the rows from 1 pytable to another. But may
>>>>> be there is something simple and better that already exists. Can someone
>>>>> please advise? But if it is a C procedure that I should write can
>>>>> someone point me to some examples or snippets that I can refer to put
>>>>> this together.
>>>>>
>>>>> Thanks,
>>>>> Pushkar
>>>>>
>>>>
>>>> numpy has some tools for loading data from csv files like loadtxt [1],
>>>> genfromtxt [2] and other variants.
>>>>
>>>> None of them is OK for you?
>>>>
>>>> [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt
>>>> [2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt
>>>>
>>>>
>>>> cheers
>>>>
>>>> --
>>>> Antonio Valentino

--
Antonio Valentino
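
P.S. The chunked approach recommended in the reply above could look roughly like the sketch below. It is only an illustration under several assumptions: the two-column schema, the default values, the helper names (parse_row, iter_chunks, csv_to_table) and the table name "data" are all made up for the example, and it uses the standard library csv module with itertools.islice instead of numpy.fromfile. The parsing helpers run without PyTables; only csv_to_table needs it.

```python
import csv
import io
from itertools import islice

# Hypothetical two-column schema: (column name, converter, default value).
SCHEMA = [("id", int, -1), ("value", float, 0.0)]


def parse_row(fields, schema=SCHEMA):
    """Convert one CSV row to a tuple, using the default for bad or missing values."""
    row = []
    for i, (_name, convert, default) in enumerate(schema):
        try:
            row.append(convert(fields[i]))
        except (ValueError, IndexError):
            row.append(default)
    return tuple(row)


def iter_chunks(lines, chunk_rows=10000, schema=SCHEMA):
    """Yield lists of at most chunk_rows parsed rows from an iterable of lines."""
    reader = csv.reader(lines)
    while True:
        chunk = [parse_row(fields, schema) for fields in islice(reader, chunk_rows)]
        if not chunk:
            return
        yield chunk


def csv_to_table(csv_path, h5_path, chunk_rows=10000):
    """Append the CSV to a new PyTables table one chunk at a time."""
    import tables  # imported here so the parsing helpers work without PyTables

    # pos= fixes the column order; a dict description is otherwise sorted by name.
    description = {
        "id": tables.Int64Col(pos=0),
        "value": tables.Float64Col(pos=1),
    }
    with tables.open_file(h5_path, mode="w") as h5, \
            open(csv_path, newline="") as src:
        table = h5.create_table("/", "data", description)
        for chunk in iter_chunks(src, chunk_rows):
            table.append(chunk)  # one write per chunk, not one per row
        table.flush()


if __name__ == "__main__":
    # Small in-memory demonstration of the chunking and default handling.
    sample = io.StringIO("1,2.5\nx,3.0\n3\n")
    for chunk in iter_chunks(sample, chunk_rows=2):
        print(chunk)
```

Only chunk_rows parsed rows are held in memory at any time, so the block size can be tuned to trade memory for fewer write calls. The per-field conversion is still a Python loop, so this addresses memory use rather than the parsing speed discussed earlier in the thread.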