Re: [Pytables-users] writing metadata
Hi Andreas, Josh, Anthony and Antonio,

Thanks for your help.

Andre

On Jun 26, 2013, at 2:48 AM, Antonio Valentino wrote:

> Hi Andre',
>
> On 25/06/2013 10:26, Andre' Walker-Loud wrote:
>> Dear PyTables users,
>>
>> I am trying to figure out the best way to write some metadata into some
>> files I have.
>>
>> The hdf5 file looks like
>>
>>     /root/data_1/stat
>>     /root/data_1/sys
>>
>> where "stat" and "sys" are Arrays containing statistical and systematic
>> fluctuations of numerical fits to some data I have. What I would like to do
>> is add another object
>>
>>     /root/data_1/fit
>>
>> where "fit" is just a metadata key that describes all the choices I made in
>> performing the fit, such as the seed for the random number generator, and
>> many choices of fitting options, like initial guesses for parameters, the
>> fitting range, etc.
>>
>> I began to follow the example in the PyTables manual, in Section 1.2, "The
>> Object Tree", where first a class is defined
>>
>>     class Particle(tables.IsDescription):
>>         identity = tables.StringCol(itemsize=22, dflt=" ", pos=0)
>>         ...
>>
>> and then this class is used to populate a table.
>>
>> In my case, I won't have a table; I really just want a single object
>> containing my metadata. I am wondering if there is a recommended way to do
>> this? A "Table" does not seem optimal, but I don't see what else I would
>> use.
>>
>> Thanks,
>>
>> Andre
>
> For leaf nodes (Table, Array, etc.) you can use the "attrs" attribute
> set [1] as described in [2].
> For group objects (e.g. "root") you can use the "set_node_attr"
> method [3] of File objects, or "_v_attrs".
>
> cheers
>
> [1] http://pytables.github.io/usersguide/libref/declarative_classes.html#attributesetclassdescr
> [2] http://pytables.github.io/usersguide/tutorials.html#setting-and-getting-user-attributes
> [3] http://pytables.github.io/usersguide/libref/file_class.html#tables.File.set_node_attr
>
> --
> Antonio Valentino

___
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
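Antonio's suggestion can be sketched as follows. This is a minimal, hypothetical example (the file name, seed, and fit-option values are made up) using the PEP8-style names from PyTables 3.x; on the 2.x series the camelCase equivalents (openFile, createGroup, setNodeAttr) apply:

```python
import numpy as np
import tables

# Create a file mirroring the layout described above.
f = tables.open_file("fits.h5", "w")
grp = f.create_group("/", "data_1")
f.create_array(grp, "stat", np.zeros(10))

# Attach fit metadata as attributes on the group itself ...
grp._v_attrs.seed = 12345
grp._v_attrs.fit_range = (2, 20)

# ... or equivalently via the File method.
f.set_node_attr("/data_1", "initial_guess", [1.0, 0.5])

f.close()

# Read the metadata back.
f = tables.open_file("fits.h5", "r")
seed = f.get_node_attr("/data_1", "seed")
f.close()
```

No extra "fit" node is needed: the attributes travel with the group, so every consumer of /data_1 sees the fit choices alongside the data.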
[Pytables-users] writing metadata
Dear PyTables users,

I am trying to figure out the best way to write some metadata into some files I have.

The hdf5 file looks like

    /root/data_1/stat
    /root/data_1/sys

where "stat" and "sys" are Arrays containing statistical and systematic fluctuations of numerical fits to some data I have. What I would like to do is add another object

    /root/data_1/fit

where "fit" is just a metadata key that describes all the choices I made in performing the fit, such as the seed for the random number generator, and many choices of fitting options, like initial guesses for parameters, the fitting range, etc.

I began to follow the example in the PyTables manual, in Section 1.2, "The Object Tree", where first a class is defined

    class Particle(tables.IsDescription):
        identity = tables.StringCol(itemsize=22, dflt=" ", pos=0)
        ...

and then this class is used to populate a table.

In my case, I won't have a table; I really just want a single object containing my metadata. I am wondering if there is a recommended way to do this? A "Table" does not seem optimal, but I don't see what else I would use.

Thanks,

Andre
Re: [Pytables-users] EArray
Hi Anthony,

> You can use tuple addition to accomplish what you want:
>
>     (0,) + data.shape == (0, 256, 1, 2)
>
> Be Well
> Anthony

Thanks! I knew there had to be a better way.

Cheers,

Andre

> On Sat, Oct 6, 2012 at 12:42 PM, Andre' Walker-Loud wrote:
> Hi All,
>
> I have a bunch of hdf5 files I am using to create one hdf5 file.
> Each individual file has many different pieces of data, and they are all the
> same shape in each file.
>
> I am using createEArray to make the large array in the final file.
>
> If the data sets in the individual h5 files are of shape (256,1,2), then I
> have to use
>
>     createEArray('/path/', 'name', tables.Float64Atom(), (0,256,1,2), expectedrows=len(data_files))
>
> If the np array I have grabbed from an individual file to append to my EArray
> is named data, is there a way to use data.shape to create the shape of
> my EArray?
>
> In spirit, I want to do something like (0, data.shape), but this does not
> work. I have been scouring the numpy manual to see how to convert
>
>     data.shape == (256,1,2)
>
> to (0,256,1,2), but failed to figure this out (if I don't know ahead of time
> the shape of data - in which case I can manually reshape).
>
> Thanks,
>
> Andre
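Anthony's tuple-addition trick can be checked with plain numpy (the array contents below are placeholders):

```python
import numpy as np

data = np.zeros((256, 1, 2))

# Tuples concatenate with +, so prepending a 0-length
# extendable dimension is just:
earray_shape = (0,) + data.shape

print(earray_shape)  # (0, 256, 1, 2)
```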
[Pytables-users] EArray
Hi All,

I have a bunch of hdf5 files I am using to create one hdf5 file. Each individual file has many different pieces of data, and they are all the same shape in each file.

I am using createEArray to make the large array in the final file.

If the data sets in the individual h5 files are of shape (256,1,2), then I have to use

    createEArray('/path/', 'name', tables.Float64Atom(), (0,256,1,2), expectedrows=len(data_files))

If the np array I have grabbed from an individual file to append to my EArray is named data, is there a way to use data.shape to create the shape of my EArray?

In spirit, I want to do something like (0, data.shape), but this does not work. I have been scouring the numpy manual to see how to convert

    data.shape == (256,1,2)

to (0,256,1,2), but failed to figure this out (if I don't know ahead of time the shape of data - in which case I can manually reshape).

Thanks,

Andre
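Putting the pieces together, a minimal sketch of the whole pattern (file name, node name, and data are invented; PyTables 3.x snake_case names are used in place of the older createEArray):

```python
import numpy as np
import tables

data = np.random.rand(256, 1, 2)   # stand-in for an array read from one file
n_files = 3                        # stand-in for len(data_files)

f = tables.open_file("combined.h5", "w")
# Shape derived from the data itself: prepend a 0-length extendable axis.
earr = f.create_earray(f.root, "name", tables.Float64Atom(),
                       (0,) + data.shape, expectedrows=n_files)
for _ in range(n_files):
    # append() expects one extra leading dimension per appended "row".
    earr.append(data[np.newaxis])
f.close()
```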
Re: [Pytables-users] openFile strategy question
Hi Anthony,

> Oh OK, I think I understand a little better. What I would do would be to make
> "for i,file in enumerate(hdf5_files)" the outermost loop and then use the
> File.walkNodes() method [1] to walk each file and pick out only the data sets
> that you want to copy, skipping over all others. This should allow you to
> only open each of the 400 files once. Hope this helps.

Thanks. This is the idea I had, but was failing to implement (although I didn't use walkNodes). To get it to work, I had to figure out how to use createEArray properly.

In the end, it was a silly fix. I created an EArray with shape (0,96,1,2), and was trying to append numpy arrays of shape (96,1,2) to this, which was failing. All I had to do was

    arr.append(np.array([my_array]))

whereas before, I was simply missing the "[ ]" brackets, so the shapes did not line up.

Cheers,

Andre
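The shape mismatch Andre describes is easy to see with plain numpy: EArray.append wants one extra leading "row" axis on whatever you append. The array here is a placeholder:

```python
import numpy as np

my_array = np.zeros((96, 1, 2))

# Wrapping in a list (or, equivalently, indexing with np.newaxis)
# adds the leading axis that EArray.append expects.
row = np.array([my_array])

print(my_array.shape)  # (96, 1, 2)
print(row.shape)       # (1, 96, 1, 2)
```

`my_array[np.newaxis]` produces the same (1, 96, 1, 2) view without copying.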
Re: [Pytables-users] openFile strategy question
Hi Anthony,

> I am a little confused. Let me verify. You have 400 hdf5 files (re and im)
> buried in a unix directory tree. You want to make a single file which
> concatenates this data. Is this right?

Sorry for my description - that is not quite right. The "unix directory tree" is the group tree I have made in each individual hdf5 file. So I have 400 hdf5 files, each with the given group tree. And I basically want to copy that group tree, but "merge" all of them together.

However, there are bits in each of the small files that I do not want to merge - I only want to grab the average data sets, while the little files contain many different samples (which I have already averaged into the "avg" group). Is this clear?

Thanks,

Andre

> Be Well
> Anthony
>
> On Wed, Aug 15, 2012 at 6:52 PM, Andre' Walker-Loud wrote:
> Hi All,
>
> Just a strategy question.
> I have many hdf5 files containing data for different measurements of the same
> quantities.
>
> My directory tree looks like
>
>     top description [ group ]
>         sub description [ group ]
>             avg [ group ]
>                 re [ numpy array shape = (96,1,2) ]
>                 im [ numpy array shape = (96,1,2) ] - only exists for a known
>                      subset of data files
>
> I have ~400 of these files. What I want to do is create a single file, which
> collects all of these files with exactly the same directory structure, except
> at the very bottom
>
>     re [ numpy array shape = (400,96,1,2) ]
>
> The simplest thing I came up with is to loop over the two levels of
> descriptive group structure, and build the numpy array for the final set
> this way.
> Basic loop structure:
>
>     final_file = tables.openFile('all_data.h5','a')
>
>     for d1 in top_description:
>         final_file.createGroup(final_file.root, d1)
>         for d2 in sub_description:
>             final_file.createGroup('/'+d1, d2)
>             data_re = np.zeros([400,96,1,2])
>             for i, file in enumerate(hdf5_files):
>                 tmp = tables.openFile(file)
>                 data_re[i] = np.array(tmp.getNode('/'+d1+'/'+d2+'/avg/re'))
>                 tmp.close()
>             final_file.createArray('/'+d1+'/'+d2, 're', data_re)
>
> But this involves opening and closing the individual 400 hdf5 files many
> times. There must be a smarter algorithmic way to do this - or perhaps
> built-in pytables tools.
>
> Any advice is appreciated.
>
> Andre
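Anthony's open-each-file-once idea can be sketched like this. Everything here is illustrative: the file names, group names, and shapes are invented stand-ins for the 400-file setup described above, and PyTables 3.x snake_case names (walk_nodes, create_earray) are used. EArrays replace the preallocated (400,96,1,2) numpy buffer, so one pass over the files suffices:

```python
import numpy as np
import tables

# --- Build two small input files with the layout described above
# (hypothetical names and data, for illustration only). ---
hdf5_files = ["meas_0.h5", "meas_1.h5"]
for path in hdf5_files:
    with tables.open_file(path, "w") as f:
        g = f.create_group("/", "top")
        g = f.create_group(g, "sub")
        g = f.create_group(g, "avg")
        f.create_array(g, "re", np.random.rand(96, 1, 2))

# --- Merge: the files are the OUTERMOST loop, so each opens once.
with tables.open_file("all_data.h5", "w") as out:
    for path in hdf5_files:
        with tables.open_file(path, "r") as src:
            for leaf in src.walk_nodes("/", classname="Array"):
                if leaf._v_parent._v_name != "avg":
                    continue  # copy only the averaged data sets
                # '/top/sub/avg/re' -> '/top/sub/re' in the merged file
                dest_path = leaf._v_pathname.replace("/avg", "")
                if dest_path not in out:
                    parent, name = dest_path.rsplit("/", 1)
                    out.create_earray(parent or "/", name,
                                      tables.Float64Atom(),
                                      (0,) + leaf.shape,
                                      expectedrows=len(hdf5_files),
                                      createparents=True)
                # append with an extra leading axis, one row per file
                out.get_node(dest_path).append(leaf[:][np.newaxis])
```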
[Pytables-users] openFile strategy question
Hi All,

Just a strategy question. I have many hdf5 files containing data for different measurements of the same quantities.

My directory tree looks like

    top description [ group ]
        sub description [ group ]
            avg [ group ]
                re [ numpy array shape = (96,1,2) ]
                im [ numpy array shape = (96,1,2) ] - only exists for a known
                     subset of data files

I have ~400 of these files. What I want to do is create a single file, which collects all of these files with exactly the same directory structure, except at the very bottom

    re [ numpy array shape = (400,96,1,2) ]

The simplest thing I came up with is to loop over the two levels of descriptive group structure, and build the numpy array for the final set this way. Basic loop structure:

    final_file = tables.openFile('all_data.h5','a')

    for d1 in top_description:
        final_file.createGroup(final_file.root, d1)
        for d2 in sub_description:
            final_file.createGroup('/'+d1, d2)
            data_re = np.zeros([400,96,1,2])
            for i, file in enumerate(hdf5_files):
                tmp = tables.openFile(file)
                data_re[i] = np.array(tmp.getNode('/'+d1+'/'+d2+'/avg/re'))
                tmp.close()
            final_file.createArray('/'+d1+'/'+d2, 're', data_re)

But this involves opening and closing the individual 400 hdf5 files many times. There must be a smarter algorithmic way to do this - or perhaps built-in pytables tools.

Any advice is appreciated.

Andre
Re: [Pytables-users] [POLL] Fully Adopt PEP8 Proposal - Please respond!
+1

While it will be painful to switch now, it will be more painful to switch when I have had time to write more code in the old style.

Andre

On Jul 25, 2012, at 12:37 AM, wrote:

> +1
>
> /Benjamin
>
>> -----Original Message-----
>> From: Anthony Scopatz [mailto:scop...@gmail.com]
>> Sent: 24 July 2012 18:39
>> To: Discussion list for PyTables
>> Subject: [Pytables-users] [POLL] Fully Adopt PEP8 Proposal - Please
>> respond!
>>
>> Dear PyTables Community,
>>
>> The next version of PyTables that will be released will be v3.0 and
>> will be Python 3 compliant. I thought that this would be an excellent
>> (if not the only) chance to bring the PyTables API into full PEP8
>> compliance [1]. This would mean changing function and attribute names
>> like:
>>
>>     tb.openFile() -> tb.open_file()
>>
>> For the next couple of releases BOTH the new and old API would be
>> available to facilitate this transition (i.e. tb.openFile() and
>> tb.open_file() would both work). To ease migration, we would also
>> provide a 2to3-like utility for you to use on your code base that would
>> update the API for you automatically. At some fixed point in the
>> future (v3.2?), the old API would go away, but you would have had ample
>> time to run this script. The developers either feel positively or
>> neutral to these changes.
>>
>> The question for you the user is then, would this be something that you
>> would like to see? How valuable would you find this change? Is
>> bundling this with the Python 3 change too much overhead?
>>
>> Please respond with a +1, +0, -0, or -1 and any comments you might
>> have. I look forward to hearing from you!
>>
>> Be Well
>> Anthony
>>
>> 1. http://www.python.org/dev/peps/pep-0008/
Re: [Pytables-users] recursive walk question
Hi Anthony,

I forgot to say Thanks!

I tried using the _v_depth attr, but that didn't give me an answer I understood. For example, the _v_depth of f.root was 0, and the _v_depth of my final data path was 2. But I have something that works now.

Also - help like this

> Just one quick comment. You probably shouldn't test the string of the type
> of the data. Use the builtin isinstance() instead:
>
>     found_array = isinstance(data, tables.Array)

is very helpful to me. I have not been properly trained in any programming; I have just hacked as needed for work/research, so things like this are not yet common for me to realize.

Cheers,

Andre

On Jun 14, 2012, at 3:28 PM, Anthony Scopatz wrote:

> On Thu, Jun 14, 2012 at 4:30 PM, Andre' Walker-Loud wrote:
> Hi Anthony,
>
> On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote:
>
>> On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud wrote:
>> Hi All,
>>
>> Still trying to sort out a recursive walk through an hdf5 file using
>> pytables.
>>
>> I have an hdf5 file with an unknown depth of groups/nodes.
>>
>> I am trying to write a little function to walk down the tree (with user
>> input help) until a data set is found.
>>
>> I am hoping there is some function one can use to query whether you have
>> found simply a group/node or an actual numpy array of data. So I can do
>> something like
>>
>>     if f.getNode('/',some_path) == "data_array":
>>         return f.getNode('/',some_path), True
>>     else:
>>         return f.getNode('/',some_path), False
>>
>> where I have some function that, if the second returned variable is True,
>> will recognize the node as data, whereas if it is False, it will query the
>> user for a further path down the tree.
>>
>> I suppose I could set this up with a try: except: but was hoping there is
>> some built-in functionality to handle this.
>>
>> Yup, I think that you are looking for the File.walkNodes() method.
>> http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes
>
> I wasn't sure how to use walkNodes in an interactive search. Here is what I
> came up with so far (it works on test cases I have given it). Comments are
> welcome.
>
> One feature I would like to add to the while loop in the second function is
> an iterator counting the depth of the search. I want to compare this to the
> maximum tree/node/group depth in the file, so if the search goes over (maybe
> my collaborators used createTable instead of createArray) the while loop
> won't run forever.
>
> Is there a function to ask the deepest recursion into the hdf5 file?

Hello Andre,

Every Node object has a _v_depth attr that you can access
(http://pytables.github.com/usersguide/libref.html#tables.Node._v_depth). In
your walk function, therefore, you could test to see if you are over or under
the maximal value that you set.

> Cheers,
>
> Andre
>
>     def is_array(file,path):
>         data = file.getNode(path)
>         if str(type(data)) == "<class 'tables.array.Array'>":
>             found_array = True

Just one quick comment. You probably shouldn't test the string of the type of the data. Use the builtin isinstance() instead:

    found_array = isinstance(data, tables.Array)

Be Well
Anthony

>         else:
>             found_array = False
>             for g in file.getNode(path):
>                 print g
>         return data, found_array
>
>     def pytable_walk(file):
>         found_data = False
>         path = ''
>         while found_data == False:
>             for g in file.getNode('/',path):
>                 print g
>             path_new = raw_input('which node would you like?\n ')
>             path = path+'/'+path_new
>             data,found_data = is_array(file,path)
>         return path,data
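Anthony's two suggestions (walkNodes and _v_depth) combine into a short, non-interactive sketch. The file layout below is invented for illustration, and the PyTables 3.x snake_case name walk_nodes is used:

```python
import numpy as np
import tables

# Build a small file with an array buried two groups deep.
with tables.open_file("walkdemo.h5", "w") as f:
    g = f.create_group("/", "level1")
    g = f.create_group(g, "level2")
    f.create_array(g, "payload", np.arange(4))

with tables.open_file("walkdemo.h5", "r") as f:
    arrays = []
    max_depth = 0
    for node in f.walk_nodes("/"):
        # _v_depth: 0 for root, increasing by one per level
        max_depth = max(max_depth, node._v_depth)
        # isinstance() is the robust way to spot data leaves.
        if isinstance(node, tables.Array):
            arrays.append(node._v_pathname)

print(arrays)     # ['/level1/level2/payload']
print(max_depth)  # 3
```

Walking the whole tree up front gives both the data paths and the maximum depth, so an interactive search loop can bail out once it exceeds max_depth.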
Re: [Pytables-users] recursive walk question
Hi Anthony,

On Jun 14, 2012, at 11:30 AM, Anthony Scopatz wrote:

> On Wed, Jun 13, 2012 at 8:23 PM, Andre' Walker-Loud wrote:
> Hi All,
>
> Still trying to sort out a recursive walk through an hdf5 file using
> pytables.
>
> I have an hdf5 file with an unknown depth of groups/nodes.
>
> I am trying to write a little function to walk down the tree (with user
> input help) until a data set is found.
>
> I am hoping there is some function one can use to query whether you have
> found simply a group/node or an actual numpy array of data. So I can do
> something like
>
>     if f.getNode('/',some_path) == "data_array":
>         return f.getNode('/',some_path), True
>     else:
>         return f.getNode('/',some_path), False
>
> where I have some function that, if the second returned variable is True,
> will recognize the node as data, whereas if it is False, it will query the
> user for a further path down the tree.
>
> I suppose I could set this up with a try: except: but was hoping there is
> some built-in functionality to handle this.
>
> Yup, I think that you are looking for the File.walkNodes() method.
> http://pytables.github.com/usersguide/libref.html#tables.File.walkNodes

I wasn't sure how to use walkNodes in an interactive search. Here is what I came up with so far (it works on test cases I have given it). Comments are welcome.

One feature I would like to add to the while loop in the second function is an iterator counting the depth of the search. I want to compare this to the maximum tree/node/group depth in the file, so if the search goes over (maybe my collaborators used createTable instead of createArray) the while loop won't run forever.

Is there a function to ask the deepest recursion into the hdf5 file?
Cheers,

Andre

    def is_array(file,path):
        data = file.getNode(path)
        if str(type(data)) == "<class 'tables.array.Array'>":
            found_array = True
        else:
            found_array = False
            for g in file.getNode(path):
                print g
        return data, found_array

    def pytable_walk(file):
        found_data = False
        path = ''
        while found_data == False:
            for g in file.getNode('/',path):
                print g
            path_new = raw_input('which node would you like?\n ')
            path = path+'/'+path_new
            data,found_data = is_array(file,path)
        return path,data
[Pytables-users] recursive walk question
Hi All,

Still trying to sort out a recursive walk through an hdf5 file using pytables.

I have an hdf5 file with an unknown depth of groups/nodes.

I am trying to write a little function to walk down the tree (with user input help) until a data set is found.

I am hoping there is some function one can use to query whether you have found simply a group/node or an actual numpy array of data. So I can do something like

    if f.getNode('/',some_path) == "data_array":
        return f.getNode('/',some_path), True
    else:
        return f.getNode('/',some_path), False

where I have some function that, if the second returned variable is True, will recognize the node as data, whereas if it is False, it will query the user for a further path down the tree.

I suppose I could set this up with a try: except: but was hoping there is some built-in functionality to handle this.

Thanks,

Andre
[Pytables-users] smart search of data file
Hi All,

I have a question about reading through a file in a smart way. I am trying to write a generic executable to read a data file, and with user input, grab a certain array from an hdf5 file:

    ./my_data_analyzer -i my_data.h5 -other_args

The structure of the files is

    f = tables.openFile('my_data.h5')
    f.root.childNode.childNode.dataArray

where I do not know ahead of time how deep the childNodes go.

My general strategy was to query the user (me and a colleague) for the particular array we would like to analyze. So I was thinking to use raw_input to ask which node the user wants - but providing a list of possibilities:

    for node in f.walkNodes():
        print node

If the user provides the full path to the data array, then read it as a numpy array - but if the user provides only a partial path (maybe just the top level group), then query further until the user finds the full path, and finally read it as a numpy array.

Also - given the data files, some of the groups (childNodes) likely won't obey the NaturalName conventions - in case that matters.

1) Is my goal clear?
2) Does anyone have a clever way to do this? Some sort of recursive query with raw_input to get the desired data.

Thanks,

Andre
[Pytables-users] new user: advice on how to structure files
Hi All,

I just stumbled upon pytables, and have been playing around with converting my data files into hdf5 using pytables. I am wondering about strategies to create data files.

I have created a file with the following group structure

    root
        corr_name
            src_type
                snk_type
                    config
                        data

where

    data   = 1 x 48 array of floats
    config = a set which is to be averaged over; in this particular case,
             1000, 1010, ..., 20100 (1911 in all)

The other three groups just collect metadata describing the data below, and provide a natural way to build matrices of data files, allowing the user (my collaborators) to pick and choose various combinations of srcs and snks (instead of taking them all).

This structure arises naturally (to me) from the type of data files I am storing/analyzing, but I imagine there are better ways to build the file (also, when I make my file this way, it is only 105 MB, but it causes HDFViewer to fail to open with an OutOfMemory error).

I would appreciate any advice on how to do this better. Below is the relevant python script which creates my file.

Thanks,

Andre

    import tables as pyt
    import personal_calls_to_numpy as pc
    import os

    corrs = ['name1','name2',...]
    dirs = []
    for no in range(1000,20101,10):
        dirs.append('c'+str(no))
        #dirs.append(str(no))  #this gives a NaturalNaming error

    f = pyt.openFile('nplqcd_iso_old.h5','w')
    root = f.root
    for corr in corrs:
        cg = f.createGroup(root,corr.split('_')[-1])
        src = f.createGroup(cg,'Src_GaussSmeared')
        for s in ['S','P']:
            if os.path.exists('concatonated/'+corr+'_'+tag+'_'+s+'.dat'):
                print('adding '+corr+'_'+tag+'_'+s+'.dat')
                h,c = pc.read_corr('concatonated/'+corr+'_'+tag+'_'+s+'.dat')
                Ncfg = int(h[0]); NT = int(h[1])
                snk = f.createGroup(src,'Snk_'+s)
                #data = f.createArray(snk,'real',c)
                for cfg in range(Ncfg):
                    gc = f.createGroup(snk,dirs[cfg])
                    data = f.createArray(gc,'real',c[cfg])
            else:
                print('concatonated/'+corr+'_'+tag+'_'+s+'.dat DOES NOT EXIST')
    f.close()
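The commented-out `dirs.append(str(no))` fails because PyTables natural naming wants node names that are valid Python identifiers, and names like '1000' start with a digit; prefixing with 'c' fixes that. A small stdlib-only sketch of such a pre-check (the helper name is made up):

```python
def safe_node_name(name, prefix="c"):
    """Return a name usable with PyTables natural naming.

    Natural-naming node names must be valid Python identifiers;
    bare numbers like '1000' are not, so prepend a letter prefix.
    """
    return name if name.isidentifier() else prefix + name

print(safe_node_name("1000"))    # c1000
print(safe_node_name("config"))  # config
```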