Re: [Numpy-discussion] [ANN] PyTables 2.0 alpha2

2007-03-02 Thread Ivan Vilata i Balaguer
Hi everybody once again,

We have made a new micro-release of the second alpha of PyTables 2.0,
PyTables 2.0a2a.  This fixes a missing import (thanks to Antonio
Valentino and Steven H. Rogers for the information) and missing images
in the HTML version of the manual in the 2.0a2 version released
yesterday.

We hope that the next release will be a beta one, and we encourage you
to test it.  Thank you!

As usual, the released files are available at
http://www.pytables.org/download/preliminary/

For more information on PyTables, visit http://www.pytables.org/

Cheers,

::

  Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
         Cárabos Coop. V.  V  V   Enjoy Data
                            ""


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [ANN] PyTables 2.0 alpha2

2007-03-01 Thread Ivan Vilata i Balaguer
Hi all,

I'm posting this message to announce the availability of the *second
alpha release of PyTables 2.0*, the new and shiny major version of
PyTables.

This release settles the file format used in this major version,
removing the need to use pickled objects in order to store system
attributes, so we expect that no more changes will happen to the on-disk
format for future 2.0 releases.  The storage and handling of group
filters has also been streamlined.  The new release also allows running
the complete test suite from within Python, enables new tests and fixes
some problems with test data installation, among other fixes.
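
As a minimal sketch of what running the suite from within Python might
look like: this assumes the entry point is ``tables.test()``, which this
message does not spell out, and the wrapper name
``run_pytables_test_suite`` is purely illustrative.

```python
def run_pytables_test_suite() -> bool:
    """Run the PyTables test suite in-process.

    Returns False when PyTables cannot be imported, True after the
    suite has been started.
    """
    try:
        import tables  # third-party package, not in the standard library
    except ImportError:
        return False
    # Assumption: the suite is exposed as tables.test(); check the
    # release notes of your version if this name is not available.
    tables.test()
    return True
```

Calling ``run_pytables_test_suite()`` from an interactive session is
enough to start the suite; expect the full run to take a while.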

We expect to have the documentation revised and the API definitely
settled very soon in order to release the first beta version.

The official announcement follows.  Enjoy data!

::

  Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
         Cárabos Coop. V.  V  V   Enjoy Data
                            ""


===========================
 Announcing PyTables 2.0a2
===========================

This is the second *alpha* version of PyTables 2.0.  Although this
release is fairly stable in operation, it is tagged as alpha because the
API can still change a bit (but hopefully not a great deal), so it is
mainly meant for developers and people who want to get a taste of the
exciting new features in this major version.

You can download a source package of the version 2.0a2 with
generated PDF and HTML docs from
http://www.pytables.org/download/preliminary/

You can also get the latest sources from the Subversion repository at
http://pytables.org/svn/pytables/trunk/

If you are afraid of Subversion (you shouldn't be), you can always download
the latest, daily updated, packed sources from
http://www.pytables.org/download/snapshot/

Please bear in mind that some sections of the manual may be obsolete
(especially the "Optimization tips" chapter).  The reference chapter
should be fairly up-to-date though.

You may also want to have an in-depth read of the ``RELEASE-NOTES.txt``
file where you will find an entire section devoted to how to migrate
your existing PyTables 1.x apps to the 2.0 version.  You can find an
HTML version of this document at
http://www.pytables.org/moin/ReleaseNotes/Release_2.0a2


Changes more in depth
=====================

Improvements:

- NumPy is finally at the core!  That means that PyTables no longer
  needs numarray in order to operate, although it continues to be
  supported (as well as Numeric).  This also means that you should be
  able to run PyTables in scenarios combining Python 2.5 and 64-bit
  platforms (these are a source of problems with numarray/Numeric
  because they don't support this combination yet).

- Most of the operations in PyTables have experienced noticeable
  speed-ups (sometimes up to 2x, as in regular Python table
  selections).  This is a consequence of both using NumPy internally and
  a considerable effort in refactoring and optimizing the new code.

- Numexpr has been integrated in all in-kernel selections.  So, now it
  is possible to perform complex selections like::

      result = [ row['var3'] for row in
                 table.where('(var2 < 20) | (var1 == "sas")') ]

  or::

      complex_cond = '((%s <= col5) & (col2 <= %s)) ' \
                     '| (sqrt(col1 + 3.1*col2 + col3*col4) > 3)'
      result = [ row['var3'] for row in
                 table.where(complex_cond % (inf, sup)) ]

  and run them at full C-speed (or perhaps more, due to the cache-tuned
  computing kernel of Numexpr).

- Now, it is possible to get fields of the ``Row`` iterator by
  specifying their position, or even ranges of positions (extended
  slicing is supported).  For example, you can do::

      result = [ row[4] for row in table          # fetch field #4
                 if row[1] < 20 ]
      result = [ row[:] for row in table          # fetch all fields
                 if row['var2'] < 20 ]
      result = [ row[1::2] for row in             # fetch odd fields
                 table.iterrows(2, 3000, 3) ]

  in addition to the classical::

      result = [row['var3'] for row in table.where('var2 < 20')]

- ``Row`` has received a new method called ``fetch_all_fields()`` in
  order to easily retrieve all the fields of a row in situations like::

      [row.fetch_all_fields() for row in table.where('column1 < 0.3')]

  The difference between ``row[:]`` and ``row.fetch_all_fields()`` is
  that the former will return all the fields as a tuple, while the
  latter will return the fields in a NumPy void type and should be
  faster.  Choose whichever better fits your needs.

- Now, all data that is read from disk is converted, if necessary, to
  the native byteorder of the hosting machine (before, this only
  happened with ``Table`` objects).  This should help to speed up apps
  that have to do computations with data generated on platforms whose
  byteorder differs from that of the user's machine.

- All the leaf co