[Pytables-users] Pytables file structure

2012-07-15 Thread Juan Manuel Vázquez Tovar
Hello,

I have been using pytables for a few months. The main structure of my files
is a four-column table, two of whose columns have multidimensional cells,
of shape (56, 1) and (133, 6) respectively. The previous structure had more
columns instead of storing the 56x1 array in a single cell. The largest
file has almost three million rows in the table.
I usually request data from the table by looping through the entire table
and getting, for each row, one specific row of the 133x6 2-d array.
Currently, each request can take from 15 seconds up to 10 minutes,
depending, I believe, on the status of the office network.
Could you please advise on how to improve the reading time?
I have tried compressing the data with zlib, but it takes more or less the
same time.

Thanks in advance,

Juan Manuel
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users


Re: [Pytables-users] Pytables file structure

2012-07-15 Thread Juan Manuel Vázquez Tovar
Hello Anthony,

I have to loop over the whole set of rows. Does the where method have any
advantage in that case?

Thank you,
Juanma
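For reference, the where() call suggested in the quoted reply below looks roughly like this in practice. This is a minimal, self-contained sketch: the file name, table schema, and query condition are all made up for illustration.

```python
import numpy as np
import tables as tb

# Hypothetical schema loosely following the thread: a scalar column to
# filter on, plus a multidimensional "loads" cell per row.
class Case(tb.IsDescription):
    idx = tb.Int32Col()
    loads = tb.Float64Col(shape=(133, 6))

with tb.open_file("cases.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for i in range(10):
        row["idx"] = i
        row["loads"] = np.full((133, 6), float(i))
        row.append()
    table.flush()

    # where() evaluates the condition in compiled code (numexpr), so only
    # the matching rows are handed back to Python -- cheaper than testing
    # each row yourself when you only need a subset.
    hits = [r["idx"] for r in table.where("idx > 6")]
```

Note that where() only helps when a condition selects a subset of rows; it does not speed up a pass over every row.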

2012/7/15 Anthony Scopatz 

> Hello Juan,
>
> Try using the where() method [1]; it has a lot of nice features under the
> covers.
>
> Be Well
> Anthony
>
> 1.
> http://pytables.github.com/usersguide/libref.html?highlight=where#tables.Table.where
>
> On Sun, Jul 15, 2012 at 4:01 PM, Juan Manuel Vázquez Tovar <
> jmv.to...@gmail.com> wrote:
>
>> Hello,
>>
>> I have been using pytables for a few months. The main structure of my
>> files is a four-column table, two of whose columns have multidimensional
>> cells, of shape (56, 1) and (133, 6) respectively. The previous structure
>> had more columns instead of storing the 56x1 array in a single cell. The
>> largest file has almost three million rows in the table.
>> I usually request data from the table by looping through the entire table
>> and getting, for each row, one specific row of the 133x6 2-d array.
>> Currently, each request can take from 15 seconds up to 10 minutes,
>> depending, I believe, on the status of the office network.
>> Could you please advise on how to improve the reading time?
>> I have tried compressing the data with zlib, but it takes more or less
>> the same time.
>>
>> Thanks in advance,
>>
>> Juan Manuel
>>


Re: [Pytables-users] Pytables file structure

2012-07-15 Thread Juan Manuel Vázquez Tovar
The column I'm requesting the data from has multidimensional cells, so each
time I request data from the table, I need to get a specific row from every
multidimensional cell in the column. I hope this clarifies things a bit.
At the office I have a Linux workstation, but it is part of a computing
cluster to which all users have access, so the files live in a folder on
the cluster, not on my local hard drive.

Thank you,
Juanma
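Anthony's whole-table and single-column reads (quoted below) can be sketched like this; the schema and names are hypothetical and the row count is a toy one:

```python
import numpy as np
import tables as tb

# Hypothetical schema mirroring the thread: a name column plus one
# (133, 6) matrix per cell.
class Case(tb.IsDescription):
    name = tb.StringCol(16)
    loads = tb.Float64Col(shape=(133, 6))

with tb.open_file("cases.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for i in range(5):
        row["name"] = ("case%02d" % i).encode()
        row["loads"] = np.full((133, 6), float(i))
        row.append()
    table.flush()

    everything = table[:]            # whole table as one structured array
    loads = table.cols.loads[:]      # a single column: shape (5, 133, 6)
```

Both reads happen in bulk rather than row by row, which matters a lot when the file sits on a network filesystem.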

2012/7/15 Anthony Scopatz 

> Rereading the original post, I am a little confused: are you trying to
> read the whole table, just a couple of rows that meet some condition, just
> one whole column, or one part of a column?
>
> To request the whole table without looping over each row in Python, index
> every element:
>
> f.root.table[:]
>
>
> To just get certain rows, use where().
>
> To get a single column, use the cols namespace:
>
> f.root.table.cols.my_column[:]
>
>
> Why is this file elsewhere on the network?
>
> Be Well
> Anthony
>
> On Sun, Jul 15, 2012 at 4:08 PM, Juan Manuel Vázquez Tovar <
> jmv.to...@gmail.com> wrote:
>
>> Hello Anthony,
>>
>> I have to loop over the whole set of rows. Does the where method have
>> any advantage in that case?
>>
>> Thank you,
>> Juanma
>>
>> 2012/7/15 Anthony Scopatz 
>>
>>> Hello Juan,
>>>
>>> Try using the where() method [1]; it has a lot of nice features under
>>> the covers.
>>>
>>> Be Well
>>> Anthony
>>>
>>> 1.
>>> http://pytables.github.com/usersguide/libref.html?highlight=where#tables.Table.where
>>>
>>> On Sun, Jul 15, 2012 at 4:01 PM, Juan Manuel Vázquez Tovar <
>>> jmv.to...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have been using pytables for a few months. The main structure of my
>>>> files is a four-column table, two of whose columns have multidimensional
>>>> cells, of shape (56, 1) and (133, 6) respectively. The previous
>>>> structure had more columns instead of storing the 56x1 array in a
>>>> single cell. The largest file has almost three million rows in the table.
>>>> I usually request data from the table by looping through the entire
>>>> table and getting, for each row, one specific row of the 133x6 2-d array.
>>>> Currently, each request can take from 15 seconds up to 10 minutes,
>>>> depending, I believe, on the status of the office network.
>>>> Could you please advise on how to improve the reading time?
>>>> I have tried compressing the data with zlib, but it takes more or less
>>>> the same time.
>>>>
>>>> Thanks in advance,
>>>>
>>>> Juan Manuel
>>>>

Re: [Pytables-users] Pytables file structure

2012-07-17 Thread Juan Manuel Vázquez Tovar
Thank you very much Anthony.
Do I have to sign up to store a ticket?
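Anthony's workaround (quoted below) of pulling the full column in one go and slicing it with numpy in memory might look like the following self-contained sketch; shapes match the thread, but the row count and data are made up:

```python
import numpy as np
import tables as tb

class Case(tb.IsDescription):
    loads = tb.Float64Col(shape=(133, 6))

n = 2  # the matrix row we want out of every cell

with tb.open_file("cases.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for i in range(4):
        row["loads"] = np.arange(133 * 6, dtype="f8").reshape(133, 6)
        row.append()
    table.flush()

    # One bulk read of the column (a single pass over the network)
    # followed by a cheap in-memory numpy slice.
    my_col = table.cols.loads[:]      # shape (4, 133, 6)
    my_selection = my_col[:, n, :]    # shape (4, 6)
```

The key point is that the per-row network round trips disappear; the slicing cost in memory is negligible by comparison.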

2012/7/15 Anthony Scopatz 

> Ahh I see, tricky.
>
> So I think what is killing you is that you are pulling each row of the
> table individually over the network.  Ideally you should be able to do
> something like the following:
>
> f.root.table.cols.my_col[:,n,:]
>
>
> using numpy-esque multidimensional slicing.  However, this fails when I
> just tested it.  So instead, I would just pull over the full column and
> slice using numpy in memory.
>
> my_col = f.root.table.cols.my_col[:]
> my_selection = my_col[:,n,:]
>
>
> We should open a ticket so that the top method works (though I think there
> might already be one).
>
> I hope this helps!
>
> On Sun, Jul 15, 2012 at 4:27 PM, Juan Manuel Vázquez Tovar <
> jmv.to...@gmail.com> wrote:
>
>> The column I'm requesting the data from has multidimensional cells, so
>> each time I request data from the table, I need to get a specific row
>> from every multidimensional cell in the column. I hope this clarifies
>> things a bit.
>> At the office I have a Linux workstation, but it is part of a computing
>> cluster to which all users have access, so the files live in a folder on
>> the cluster, not on my local hard drive.
>>
>> Thank you,
>> Juanma
>>
>> 2012/7/15 Anthony Scopatz 
>>
>>> Rereading the original post, I am a little confused: are you trying to
>>> read the whole table, just a couple of rows that meet some condition,
>>> just one whole column, or one part of a column?
>>>
>>> To request the whole table without looping over each row in Python,
>>> index every element:
>>>
>>> f.root.table[:]
>>>
>>>
>>> To just get certain rows, use where().
>>>
>>> To get a single column, use the cols namespace:
>>>
>>> f.root.table.cols.my_column[:]
>>>
>>>
>>> Why is this file elsewhere on the network?
>>>
>>> Be Well
>>>  Anthony
>>>
>>> On Sun, Jul 15, 2012 at 4:08 PM, Juan Manuel Vázquez Tovar <
>>> jmv.to...@gmail.com> wrote:
>>>
>>>> Hello Anthony,
>>>>
>>>> I have to loop over the whole set of rows. Does the where method have
>>>> any advantage in that case?
>>>>
>>>> Thank you,
>>>> Juanma
>>>>
>>>> 2012/7/15 Anthony Scopatz 
>>>>
>>>>> Hello Juan,
>>>>>
>>>>> Try using the where() method [1]; it has a lot of nice features under
>>>>> the covers.
>>>>>
>>>>> Be Well
>>>>> Anthony
>>>>>
>>>>> 1.
>>>>> http://pytables.github.com/usersguide/libref.html?highlight=where#tables.Table.where
>>>>>
>>>>> On Sun, Jul 15, 2012 at 4:01 PM, Juan Manuel Vázquez Tovar <
>>>>> jmv.to...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have been using pytables for a few months. The main structure of my
>>>>>> files is a four-column table, two of whose columns have
>>>>>> multidimensional cells, of shape (56, 1) and (133, 6) respectively.
>>>>>> The previous structure had more columns instead of storing the 56x1
>>>>>> array in a single cell. The largest file has almost three million
>>>>>> rows in the table.
>>>>>> I usually request data from the table by looping through the entire
>>>>>> table and getting, for each row, one specific row of the 133x6 2-d
>>>>>> array.
>>>>>> Currently, each request can take from 15 seconds up to 10 minutes,
>>>>>> depending, I believe, on the status of the office network.
>>>>>> Could you please advise on how to improve the reading time?
>>>>>> I have tried compressing the data with zlib, but it takes more or
>>>>>> less the same time.
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Juan Manuel
>>>>>>

[Pytables-users] Pytables file reading

2012-08-03 Thread Juan Manuel Vázquez Tovar
Hello all,

I'm managing a file close to 26 GB in size. Its main structure is a table
with a bit more than 8 million rows. The table is made up of four columns:
the first two store names, the third has a 53-item array in each cell, and
the last has a 133x6 matrix in each cell.
I work on a Linux workstation with 24 GB of RAM. My usual way of working
with the file is to retrieve, from each cell in the 4th column of the
table, the same row of the 133x6 matrix.
I store the information in a numpy array with shape 8e6x6. In this process
I use almost the whole workstation memory.
Is there any way to optimize the memory usage?
If not, I have been thinking about splitting the file.

Thank you,

Juanma


Re: [Pytables-users] Pytables file reading

2012-08-05 Thread Juan Manuel Vázquez Tovar
Hi Antonio,

You are right, I don't need to load the entire table into memory.
The fourth column has multidimensional cells, and when I read a single row
from every cell in the column, I almost fill the workstation memory.
I didn't expect that process to use so much memory, but the fact is that it
does.
Maybe I didn't explain it very well last time.

Thank you,

Juanma
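Antonio's chunked-read suggestion (quoted below) can be sketched as follows: only one slice of the 4th column is in memory at a time, and the per-row matrix rows are written into a preallocated result array. All names and sizes here are illustrative.

```python
import numpy as np
import tables as tb

class Case(tb.IsDescription):
    loads = tb.Float64Col(shape=(133, 6))

i = 1        # the matrix row wanted from each cell
chunk = 3    # table rows per read(); tune to the available memory

with tb.open_file("cases.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for k in range(8):
        row["loads"] = np.full((133, 6), float(k))
        row.append()
    table.flush()

    # Preallocate the final (nrows, 6) result and fill it chunk by chunk,
    # so only `chunk` full matrices ever live in memory at once.
    out = np.empty((table.nrows, 6))
    for start in range(0, table.nrows, chunk):
        stop = min(start + chunk, table.nrows)
        block = table.read(start, stop, field="loads")  # (stop-start, 133, 6)
        out[start:stop] = block[:, i, :]
```

With the real 8e6-row table the peak extra memory is chunk * 133 * 6 * 8 bytes, instead of the whole column.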

2012/8/5 Antonio Valentino 

> Hi Juan Manuel,
>
> On 04/08/2012 01:55, Juan Manuel Vázquez Tovar wrote:
> > Hello all,
> >
> > I'm managing a file close to 26 GB in size. Its main structure is a
> > table with a bit more than 8 million rows. The table is made up of four
> > columns: the first two store names, the third has a 53-item array in
> > each cell, and the last has a 133x6 matrix in each cell.
> > I work on a Linux workstation with 24 GB of RAM. My usual way of working
> > with the file is to retrieve, from each cell in the 4th column of the
> > table, the same row of the 133x6 matrix.
> > I store the information in a numpy array with shape 8e6x6. In this
> > process I use almost the whole workstation memory.
> > Is there any way to optimize the memory usage?
>
> I'm not sure I understand.
> My impression is that you do not actually need to have the entire 8e6x6
> matrix in memory at once; is that correct?
>
> In that case you could simply try to load less data using something like
>
> data = table.read(0, 5e7, field='name of the 4-th field')
> process(data)
> data = table.read(5e7, 1e8,  field='name of the 4-th field')
> process(data)
>
> See also [1] and [2].
>
> Does it make sense for you?
>
>
> [1]
> http://pytables.github.com/usersguide/libref.html#table-methods-reading
> [2] http://pytables.github.com/usersguide/libref.html#tables.Table.read
>
> > If not, I have been thinking about splitting the file.
> >
> > Thank you,
> >
> > Juanma
>
>
> cheers
>
> --
> Antonio Valentino
>
>


Re: [Pytables-users] Pytables file reading

2012-08-05 Thread Juan Manuel Vázquez Tovar
Hi Antonio,

This is the piece of code I use to read the part of the table I need:

data = [case['loads'][i] for case in table]

where i is the index of the row that I need to read from the matrix (133x6)
stored in each cell of the column "loads".

Juanma
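The iterrows/preallocation advice from Antonio's reply in this thread (partial reads with Table.iterrows, plus one preallocated numpy array instead of a Python list of millions of small arrays) might be combined like this. A hypothetical sketch with toy sizes:

```python
import numpy as np
import tables as tb

class Case(tb.IsDescription):
    loads = tb.Float64Col(shape=(133, 6))

i = 0  # the matrix row wanted from each cell

with tb.open_file("cases.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for k in range(6):
        row["loads"] = np.full((133, 6), float(k))
        row.append()
    table.flush()

    # Iterate only over rows [start, stop) and write into one preallocated
    # array rather than building a list of tiny per-row arrays (each small
    # numpy array carries roughly 80 bytes of object overhead on its own).
    start, stop = 2, 5
    data = np.empty((stop - start, 6))
    for j, case in enumerate(table.iterrows(start, stop)):
        data[j] = case["loads"][i]
```

The result is one contiguous (stop-start, 6) block, with no per-row Python object overhead.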

2012/8/5 Antonio Valentino 

> Hi Juan Manuel,
>
> > On 05/08/2012 22:28, Juan Manuel Vázquez Tovar wrote:
> > Hi Antonio,
> >
> > You are right, I don't need to load the entire table into memory.
> > The fourth column has multidimensional cells, and when I read a single
> > row from every cell in the column, I almost fill the workstation memory.
> > I didn't expect that process to use so much memory, but the fact is
> > that it does.
> > Maybe I didn't explain it very well last time.
> >
> > Thank you,
> >
> > Juanma
> >
>
> Sorry, I still don't understand.
> Can you please post a short code snippet that shows exactly how you
> read data into your program?
>
> My impression is that somewhere you use an instruction that triggers
> loading of unnecessary data into memory.
>
>
>
> > 2012/8/5 Antonio Valentino 
> >
> >> Hi Juan Manuel,
> >>
> >> On 04/08/2012 01:55, Juan Manuel Vázquez Tovar wrote:
> >>> Hello all,
> >>>
> >>> I'm managing a file close to 26 GB in size. Its main structure is a
> >>> table with a bit more than 8 million rows. The table is made up of
> >>> four columns: the first two store names, the third has a 53-item array
> >>> in each cell, and the last has a 133x6 matrix in each cell.
> >>> I work on a Linux workstation with 24 GB of RAM. My usual way of
> >>> working with the file is to retrieve, from each cell in the 4th column
> >>> of the table, the same row of the 133x6 matrix.
> >>> I store the information in a numpy array with shape 8e6x6. In this
> >>> process I use almost the whole workstation memory.
> >>> Is there any way to optimize the memory usage?
> >>
> >> I'm not sure I understand.
> >> My impression is that you do not actually need to have the entire 8e6x6
> >> matrix in memory at once; is that correct?
> >>
> >> In that case you could simply try to load less data using something like
> >>
> >> data = table.read(0, 5e7, field='name of the 4-th field')
> >> process(data)
> >> data = table.read(5e7, 1e8,  field='name of the 4-th field')
> >> process(data)
> >>
> >> See also [1] and [2].
> >>
> >> Does it make sense for you?
> >>
> >>
> >> [1]
> >> http://pytables.github.com/usersguide/libref.html#table-methods-reading
> >> [2] http://pytables.github.com/usersguide/libref.html#tables.Table.read
> >>
> >>> If not, I have been thinking about splitting the file.
> >>>
> >>> Thank you,
> >>>
> >>> Juanma
> >>
> >>
> >> cheers
> >>
> >> --
> >> Antonio Valentino
> >>
>
> --
> Antonio Valentino
>
>


Re: [Pytables-users] Pytables file reading

2012-08-05 Thread Juan Manuel Vázquez Tovar
Thank you Antonio, I will try

Cheers

Juanma

On Aug 5, 2012, at 17:32, Antonio Valentino wrote:

> Hi Juan Manuel,
> 
> On 05/08/2012 22:52, Juan Manuel Vázquez Tovar wrote:
>> Hi Antonio,
>> 
>> This is the piece of code I use to read the part of the table I need:
>> 
>> data = [case['loads'][i] for case in table]
>> 
>> where i is the index of the row that I need to read from the matrix (133x6)
>> stored in each cell of the column "loads".
>> 
>> Juanma
>> 
> 
> that looks perfectly fine to me.
> No idea what the issue could be :/
> 
> You can perform partial reads using Table.iterrows:
> 
> data = [case['loads'][i] for case in table.iterrows(start, stop)]
> 
> Please also consider that using a single np.array with 1e8 rows instead
> of a list of arrays allows you to save the memory overhead of 1e8
> array objects.
> Considering that 6 doubles are 48 bytes while an empty np.array takes 80
> bytes:
> 
> In [64]: sys.getsizeof(np.zeros((0,)))
> Out[64]: 80
> 
> you should be able to reduce the memory footprint by far more than half.
> 
> 
> cheers
> 
> 
>> 2012/8/5 Antonio Valentino 
>> 
>>> Hi Juan Manuel,
>>> 
>>> On 05/08/2012 22:28, Juan Manuel Vázquez Tovar wrote:
>>>> Hi Antonio,
>>>> 
>>>> You are right, I don't need to load the entire table into memory.
>>>> The fourth column has multidimensional cells, and when I read a single
>>>> row from every cell in the column, I almost fill the workstation memory.
>>>> I didn't expect that process to use so much memory, but the fact is
>>>> that it does.
>>>> Maybe I didn't explain it very well last time.
>>>> 
>>>> Thank you,
>>>> 
>>>> Juanma
>>>> 
>>> 
>>> Sorry, I still don't understand.
>>> Can you please post a short code snippet that shows exactly how you
>>> read data into your program?
>>> 
>>> My impression is that somewhere you use an instruction that triggers
>>> loading of unnecessary data into memory.
> 
> 
> -- 
> Antonio Valentino
> 


Re: [Pytables-users] Pytables file reading

2012-08-06 Thread Juan Manuel Vázquez Tovar
Hi Antonio,

Last question about this: from the pytables point of view and based on your
experience, is it better to manage a table with 3 million rows and
multidimensional cells, or a table with 300 million rows and plain cells?

Thank you,

Juanma
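The two layouts being compared can be written down as PyTables descriptions; the column names, the `mrow` tag, and the sizes are made up for illustration, and which layout performs better will depend on chunking, indexing, and the query mix. One thing worth noting is that the "long" layout makes the recurring query of this thread (row i of every matrix) expressible as an in-kernel condition like where("mrow == i"), at the cost of roughly 133x more rows.

```python
import tables as tb

# Layout A: ~3 million rows, one whole (133, 6) matrix per cell.
class Wide(tb.IsDescription):
    name = tb.StringCol(16)
    loads = tb.Float64Col(shape=(133, 6))

# Layout B: many plain rows, one matrix row per table row, tagged with
# its position so the original cell can be reconstructed.
class Long(tb.IsDescription):
    name = tb.StringCol(16)
    mrow = tb.Int32Col()               # 0..132: row index inside the matrix
    loads = tb.Float64Col(shape=(6,))

with tb.open_file("layouts.h5", "w") as f:
    wide = f.create_table(f.root, "wide", Wide)
    long_ = f.create_table(f.root, "long", Long)
    wide_shape = wide.dtype["loads"].shape   # (133, 6) per row
    long_shape = long_.dtype["loads"].shape  # (6,) per row
```

In layout B the `mrow` column can also be indexed (col.create_index()) to speed up those selections.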

On Aug 5, 2012, at 17:32, Antonio Valentino wrote:

> Hi Juan Manuel,
> 
> On 05/08/2012 22:52, Juan Manuel Vázquez Tovar wrote:
>> Hi Antonio,
>> 
>> This is the piece of code I use to read the part of the table I need:
>> 
>> data = [case['loads'][i] for case in table]
>> 
>> where i is the index of the row that I need to read from the matrix (133x6)
>> stored in each cell of the column "loads".
>> 
>> Juanma
>> 
> 
> that looks perfectly fine to me.
> No idea what the issue could be :/
> 
> You can perform partial reads using Table.iterrows:
> 
> data = [case['loads'][i] for case in table.iterrows(start, stop)]
> 
> Please also consider that using a single np.array with 1e8 rows instead
> of a list of arrays allows you to save the memory overhead of 1e8
> array objects.
> Considering that 6 doubles are 48 bytes while an empty np.array takes 80
> bytes:
> 
> In [64]: sys.getsizeof(np.zeros((0,)))
> Out[64]: 80
> 
> you should be able to reduce the memory footprint by far more than half.
> 
> 
> cheers
> 
> 
>> 2012/8/5 Antonio Valentino 
>> 
>>> Hi Juan Manuel,
>>> 
>>> On 05/08/2012 22:28, Juan Manuel Vázquez Tovar wrote:
>>>> Hi Antonio,
>>>> 
>>>> You are right, I don't need to load the entire table into memory.
>>>> The fourth column has multidimensional cells, and when I read a single
>>>> row from every cell in the column, I almost fill the workstation memory.
>>>> I didn't expect that process to use so much memory, but the fact is
>>>> that it does.
>>>> Maybe I didn't explain it very well last time.
>>>> 
>>>> Thank you,
>>>> 
>>>> Juanma
>>>> 
>>> 
>>> Sorry, I still don't understand.
>>> Can you please post a short code snippet that shows exactly how you
>>> read data into your program?
>>> 
>>> My impression is that somewhere you use an instruction that triggers
>>> loading of unnecessary data into memory.
> 
> 
> -- 
> Antonio Valentino
> 


[Pytables-users] Store a reference to a dataset

2012-11-10 Thread Juan Manuel Vázquez Tovar
Hello,

I have to deal in pytables with a very large dataset. The file, already
compressed with blosc at level 5, is about 5 GB. Is it possible to store
objects within the same file, each of them containing a reference to a
certain search over the dataset?
It would be like having a large numpy array and a mask of it in the same
pytables file.

Thank you,

Juanma
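One possible approach (a sketch, not an official "mask" feature; the node names and schema are made up): run the search once with Table.get_where_list(), store the resulting row numbers as an ordinary array node in the same file, and later reuse them with Table.read_coordinates():

```python
import numpy as np
import tables as tb

class Case(tb.IsDescription):
    idx = tb.Int32Col()

with tb.open_file("data.h5", "w") as f:
    table = f.create_table(f.root, "table", Case)
    row = table.row
    for i in range(10):
        row["idx"] = i
        row.append()
    table.flush()

    # Run the search once and persist its result next to the data:
    # get_where_list() returns the row numbers matching the condition.
    sel = table.get_where_list("idx > 4")
    f.create_array(f.root, "big_idx_rows", sel, title="rows where idx > 4")

# Later (or from another program): reuse the stored selection as a mask.
with tb.open_file("data.h5", "r") as f:
    coords = f.root.big_idx_rows[:]
    data = f.root.table.read_coordinates(coords)
```

A boolean mask array stored with create_carray (and compressed with the same blosc filters) would work the same way for very large selections.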
--
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics
Download AppDynamics Lite for free today:
http://p.sf.net/sfu/appdyn_d2d_nov___
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users