Re: [Pytables-users] where() with start/stop args returning incorrect result set
Hi Derek,

OK, that is very strange. I cannot reproduce this on any of my data. A couple of quick extra questions:

1) Does this still happen when you set start=0?
2) What is the chunksize of this data set (are you at a boundary)?
3) Could you send us the full table information, i.e. repr(table)?

Be Well
Anthony

On Tue, Sep 25, 2012 at 12:42 AM, Derek Shockey derek.shoc...@gmail.com wrote:
I ran the tests. All 4988 passed. [...]

___
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
Re: [Pytables-users] where() with start/stop args returning incorrect result set
Hi Anthony,

It doesn't happen if I set start=0 or seemingly any number below 3257 (though I didn't try them *all*). I am new to PyTables and hdf5, so I'm not sure about the chunksize or whether I'm at a boundary. I did, however, notice that the table's chunkshape is 203, and this happens for exactly 203 sequential records, so I doubt that's a coincidence. The table description is below.

Thanks,
Derek

/events (Table(5988,)) ''
  description := {
  client_id: StringCol(itemsize=24, shape=(), dflt='', pos=0),
  data_01: StringCol(itemsize=36, shape=(), dflt='', pos=1),
  data_02: StringCol(itemsize=36, shape=(), dflt='', pos=2),
  data_03: StringCol(itemsize=36, shape=(), dflt='', pos=3),
  data_04: StringCol(itemsize=36, shape=(), dflt='', pos=4),
  data_05: StringCol(itemsize=36, shape=(), dflt='', pos=5),
  device_id: StringCol(itemsize=36, shape=(), dflt='', pos=6),
  id: StringCol(itemsize=36, shape=(), dflt='', pos=7),
  timestamp: Time64Col(shape=(), dflt=0.0, pos=8),
  type: UInt16Col(shape=(), dflt=0, pos=9),
  user_id: StringCol(itemsize=36, shape=(), dflt='', pos=10)}
  byteorder := 'little'
  chunkshape := (203,)
  autoIndex := True
  colindexes := {
    timestamp: Index(9, full, shuffle, zlib(1)).is_CSI=True,
    type: Index(9, full, shuffle, zlib(1)).is_CSI=True,
    id: Index(9, full, shuffle, zlib(1)).is_CSI=True,
    user_id: Index(9, full, shuffle, zlib(1)).is_CSI=True}

On Tue, Sep 25, 2012 at 9:32 AM, Anthony Scopatz scop...@gmail.com wrote:
Hi Derek, OK, that is very strange. I cannot reproduce this on any of my data. [...]
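Anthony's question about chunk boundaries can be checked with plain arithmetic on the numbers quoted in Derek's message. The sketch below is not from the thread: the helper name locate_in_chunks is made up for illustration, and it assumes rows are numbered from 0 and that every chunk holds exactly chunkshape rows.

```python
def locate_in_chunks(row, chunk_rows):
    """Return (chunk_number, offset_within_chunk) for a 0-based row number."""
    return divmod(row, chunk_rows)


CHUNK_ROWS = 203   # chunkshape from the table description
NROWS = 5988       # table length from the table description
START = 3257       # smallest start value reported to misbehave

chunk, offset = locate_in_chunks(START, CHUNK_ROWS)
full_chunks, tail_rows = divmod(NROWS, CHUNK_ROWS)

print("row %d is %d rows into chunk %d" % (START, offset, chunk))
print("%d full chunks, %d rows in the last one" % (full_chunks, tail_rows))
```

With these numbers, row 3257 sits 9 rows into chunk 16 rather than on a chunk boundary (the nearest boundaries are rows 3248 and 3451). This only describes the data chunks of the table; it says nothing about how the index itself is partitioned.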
Re: [Pytables-users] where() with start/stop args returning incorrect result set
Hello Derek, and devs,

After playing around with your data, I am able to reproduce this error on my system. I am not sure exactly where the problem is, but I do know how to fix it!

It turns out that this is an issue with the indexes not being properly in sync with the original table OR the start and stop values are not being propagated properly down to the indexes. When I tried to reindex by calling table.reIndex(), this did not fix the issue. This makes me think that the problem is propagating start, stop, and step all the way through correctly. I'll go ahead and make a ticket reflecting this.

That said, the way to fix this in the short term is to do one of the following:

1) Only use start=0 and step=1 (I bet that other stop values work).
2) Don't use indexes. When I removed the indexes from the file using ptrepack analysis.h5 analysis2.h5, everything worked fine.

Thanks a ton for reporting this!

Be Well
Anthony

On Tue, Sep 25, 2012 at 12:30 PM, Derek Shockey derek.shoc...@gmail.com wrote:
Hi Anthony, It doesn't happen if I set start=0 or seemingly any number below 3257 (though I didn't try them *all*). [...]
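One way to apply the first workaround is to run the query without start/stop (which Derek reports behaves correctly) and restrict the matches to the wanted row range afterwards. The following is a minimal sketch under that assumption, not code from the thread; restrict_to_range is a hypothetical helper, and the PyTables calls mentioned in the comments are the 2.x names (getWhereList, readCoordinates).

```python
def restrict_to_range(coords, start=0, stop=None, step=1):
    """Keep only the row numbers that fall in range(start, stop, step).

    coords is any iterable of 0-based row numbers; step must be positive.
    """
    kept = []
    for n in coords:
        n = int(n)
        if n < start:
            continue
        if stop is not None and n >= stop:
            continue
        if (n - start) % step != 0:
            continue
        kept.append(n)
    return kept


# Stand-in for the row numbers an unrestricted query would return.
# With PyTables 2.x these could come from something like:
#   coords = table.getWhereList("id == '...'")
matches = [10, 3257, 3300, 5987]

wanted = restrict_to_range(matches, start=3257, stop=5988)
print(wanted)
# The surviving rows could then be fetched with table.readCoordinates(wanted).
```

This trades the start/stop arguments for a filter in Python, so the query itself touches the whole table; for a table of a few thousand rows that cost should be small.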
[Pytables-users] where() with start/stop args returning incorrect result set
Hello,

I'm hoping someone can help me. When I specify start and stop values for calls to where() and readWhere(), they return blatantly incorrect results:

    >>> table.readWhere("id == 'ceec536a-394e-4dd7-a182-eea557f3bb93'", start=3257, stop=table.nrows)[0]['id']
    '7f589d3e-a0e1-4882-b69b-0223a7de3801'
    >>> table.where("id == 'ceec536a-394e-4dd7-a182-eea557f3bb93'", start=3257, stop=table.nrows).next()['id']
    '7f589d3e-a0e1-4882-b69b-0223a7de3801'

This happens with a sequential block of about 150 rows of data, and each time it seems to be 8 rows off (i.e. the row it returns is 8 rows ahead of the row it should be returning). If I remove the start and stop args, it behaves correctly. This seems to be a bug, unless I am misunderstanding something.

I'm using Python 2.7.3, PyTables 2.4.0, and hdf5 1.8.9 on OS X 10.8.2. Any ideas?

Thanks,
Derek
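A query result like the one above can be cross-checked against a plain sequential scan that never consults an index. This is a minimal sketch, not code from the thread: first_match_by_scan is a hypothetical helper, and the sample rows are made-up stand-ins for table data (with PyTables, the structured array returned by table.read() could be passed as rows instead).

```python
from itertools import islice


def first_match_by_scan(rows, column, value, start=0, stop=None):
    """Return (row_number, row) for the first row in [start, stop)
    whose column equals value, or None if no such row exists."""
    for n, row in islice(enumerate(rows), start, stop):
        if row[column] == value:
            return n, row
    return None


# Made-up sample data standing in for the table contents.
rows = [{"id": "a"}, {"id": "b"}, {"id": "a"}, {"id": "c"}]

hit = first_match_by_scan(rows, "id", "a", start=1)
print(hit)
```

If the id returned by where() with start and stop differs from the id found by the scan over the same range, the table is affected by the behaviour described in this message.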
Re: [Pytables-users] where() with start/stop args returning incorrect result set
Hi Derek,

Can you please run the following command and report back what you see?

    python -c "import tables; tables.test()"

Be Well
Anthony

On Mon, Sep 24, 2012 at 10:56 PM, Derek Shockey derek.shoc...@gmail.com wrote:
Hello, I'm hoping someone can help me. [...]
Re: [Pytables-users] where() with start/stop args returning incorrect result set
PS: When I do this on Linux, all 5077 tests pass for me.

On Mon, Sep 24, 2012 at 11:09 PM, Anthony Scopatz scop...@gmail.com wrote:
Hi Derek, Can you please run the following command and report back what you see? [...]
Re: [Pytables-users] where() with start/stop args returning incorrect result set
I ran the tests. All 4988 passed. The information it output is:

PyTables version:  2.4.0
HDF5 version:      1.8.9
NumPy version:     1.6.2
Numexpr version:   2.0.1 (not using Intel's VML/MKL)
Zlib version:      1.2.5 (in Python interpreter)
LZO version:       2.06 (Aug 12 2011)
BZIP2 version:     1.0.6 (6-Sept-2010)
Blosc version:     1.1.3 (2010-11-16)
Cython version:    0.16
Python version:    2.7.3 (default, Jul 6 2012, 00:17:51)
[GCC 4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.58)]
Platform:          darwin-x86_64
Byte-ordering:     little
Detected cores:    4

-Derek

On Mon, Sep 24, 2012 at 9:09 PM, Anthony Scopatz scop...@gmail.com wrote:
Hi Derek, Can you please run the following command and report back what you see? [...]