Re: [Numpy-discussion] f2py - a recap

Fernando Perez Wed, 23 Jul 2008 15:46:35 -0700

Howdy,

On Wed, Jul 23, 2008 at 3:18 PM, Stéfan van der Walt <[EMAIL PROTECTED]> wrote:
> 2008/7/23 Fernando Perez <[EMAIL PROTECTED]>:


> I agree (with your previous e-mail) that it would be good to have some
> documentation, so if you could give me some pointers on *what* to
> document (I haven't used f2py much), then I'll try my best to get
> around to it.

Well, I think my 'recap' message earlier in this thread points to a
few issues that can probably be addressed quickly (the 'instead' error
in the help, the doc/docs dichotomy needs to be cleaned up so a single
documentation directory exists, etc).   I'm also  attaching a set of
very old notes I wrote years ago on f2py that you are free to use in
any way you see fit.  I gave them a 2-minute rst treatment but didn't
edit them at all, so they may be somewhat outdated (I originally wrote
them in 2002 I think).

If Pearu has moved to greener pastures, f2py could certainly use an
adoptive parent.  It happens to be a really important piece of
infrastructure and  for the most part it works fairly well.   I think
a litlte bit of cleanup/doc integration with the rest of numpy is
probably all that's needed, so it could be a good project for someone
to adopt that would potentially be low-demand yet quite useful.

Cheers,

f

F2PY
====

Author: Fernando Perez <[EMAIL PROTECTED]>.  Please send all
comments/corrections to this address.

This is a rough set of notes on how to use f2py.  It does NOT substitute the
official manual, but is rather meant to be used alongside with it.

For any non-trivial poject involving f2py, one should also keep at hand Pierre
Schnizer's excellent 'A short introduction to F2PY', available from
http://fubphpc.tu-graz.ac.at/~pierre/f2py_tutorial.tar.gz


Usage for the impatient
-----------------------

Start by building a scratch signature file automatically from your Fortran
sources (in this case all, you can choose only those .f files you need)::

    f2py -m MODULENAME -h MODULENAME.pyf *.f

This writes the file MODULENAME.pyf, making the best guesses it can from the
Fortran sources.  It builds an interface for the module to be accessed as
'import adap1d' from python.

You will then edit the .pyf file to fine-tune the python interface exhibited
by the resulting extension.  This means for example making unnecessary scratch
areas or array dimensions hidden, or making certain parameters be optional and
take a default value.

Then, write your setup.py file using distutils, and list the .pyf file along
with the Fortran sources it is meant to wrap.  f2py will build the module for
you automatically, respecting all the interface specifications you made in the
.pyf file.

This approach is ultimately far easier than trying to get all the declarations
(especially dependencies) right through Cf2py directives in the Fortran
sources.  While that may seem appealing at first, experience seems to show
that it's ultimately far more time-consuming and prone to subtle errors.
Using this approach, the first f2py pass can do the bulk of the interface
writing and only fine-tuning needs to be done manually.  I would only
recommend embedded Cf2py directives for very simple problems (where it works
very well).

The only drawback of this approach is that the interface and the original
Fortran source lie in different files, which need to be kept in sync.  This
increases a bit the chances of forgetting to update the .pyf file if the
Fortran interface changes (adding a parameter, for example).  However, the
benefit of having explicit, clear control over f2py's behavior far outweighs
this concern.


Choosing a default compiler
---------------------------

Set the FC_VENDOR environment variable.  This will then prevent f2py from
testing all the compilers it knows about.


Using Cf2py directives
----------------------

For simpler cases you may choose to go the route of Cf2py directives. Below
are some tips and examples for this approach.

Here's the signature of a simple Fortran routine::

        subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk)

        implicit real *8 (a-h, o-z)
        real *8 nodes(*),wei(*),x(*),wrk(*),phi(*)
        real *8 sum, one, two, half

The above is correctly handled by f2py, but it can't know what is meant to be
input/output and what the relations between the various variables are (such as
integers which are array dimensions).  If we add the following f2py
directives, the generated python interface is a lot nicer::

    c
            subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk)
    c
    c       Lines with Cf2py in them are directives for f2py to generate a 
better
    c   python interface.  These must come _before_ the Fortran variable
    c       declarations so we can control the dimension of the arrays in 
Python.
    c
    c       Inputs:
    Cf2py   integer check(0<=j && j<mm),depend(mm) :: j
    Cf2py   real *8 dimension(mm),intent(in) :: nodes
    Cf2py   real *8 dimension(mm),intent(in) :: wei
    Cf2py   real *8 dimension(nn),intent(in) :: x
    c
    c       Outputs:
    Cf2py   real *8 dimension(nn),intent(out),depend(nn) :: phi
    c
    c       Hidden args:
    c       - scratch areas can be auto-generated by python
    Cf2py   real *8 dimension(2*mm+2),intent(hide,cache),depend(mm) :: wrk
    c       - array sizes can be auto-determined
    Cf2py   integer intent(hide),depend(x):: nn=len(x)
    Cf2py   integer intent(hide),depend(nodes) :: mm = len(nodes)
    c
            implicit real *8 (a-h, o-z)
            real *8 nodes(*),wei(*),x(*),wrk(*),phi(*)
            real *8 sum, one, two, half
    c

Some comments on the above:

- The f2py directives should come immediately after the 'subroutine' line and
before the Fortran variable lines. This allows the f2py dimension directives
to override the Fortran var(*) directives.

- If the Fortran code uses var(N) instead of var(*), the f2py directives can
be placed after the Fortran declarations.  This mode is preferred, as there is
less redundancy overall.  The result is much simpler::

    c
            subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk)
    c
            implicit real *8 (a-h, o-z)
            real *8 nodes(mm),wei(mm),x(nn),wrk(2*mm),phi(nn)
            real *8 sum, one, two, half
    c
    c       The Cf2py lines allow f2py to generate a better Python interface.
    c
    c       Inputs:
    Cf2py   integer check(0<=j && j<mm),depend(mm) :: j
    Cf2py   intent(in) :: nodes
    Cf2py   intent(in) :: wei
    Cf2py   intent(in) :: x
    c
    c       Outputs:
    Cf2py   intent(out) :: phi
    c
    c       Hidden args:
    c       - scratch areas can be auto-generated by python
    Cf2py   intent(hide,cache) :: wrk
    c       - array sizes can be auto-determined
    Cf2py   integer intent(hide),depend(x):: nn=len(x)
    Cf2py   integer intent(hide),depend(nodes) :: mm = len(nodes)
    c


- Since python can automatically manage memory, it is possible to hide the
need for manually passed 'work' areas.  The C/python wrapper to the underlying
fortran routine will allocate the memory for the needed work areas on the
fly.  This is done by specifying intent(hide,cache).  'hide' tells f2py to
remove the variable from the argument list and 'cache' tells it to
auto-generate it.

In cases where the allocation cost becomes a performance problem, one can
remove the 'hide' part and make it an optional argument.  In this case it will
only be generated if not given.  For this, the line above should be changed
to::

    Cf2py   real *8 dimension(2*mm+2),intent(cache),optional,depend(mm) :: wrk

Note that this should only be done after _proving_ that the scratch areas are
causing a performance problem.  The 'cache' directive causes f2py to keep
cached copies of the scratch areas, so no unnecessary mallocs should be
triggered.

- Since f2py relies on NumPy arrays, all dimensions can be determined from
the arrays themselves and it is not necessary to pass them explicitly.


With all this, the resulting f2py-generated docstring becomes::

    phipol - Function signature:
      phi = phipol(j,nodes,wei,x)
    Required arguments:
      j : input int
      nodes : input rank-1 array('d') with bounds (mm)
      wei : input rank-1 array('d') with bounds (mm)
      x : input rank-1 array('d') with bounds (nn)
    Return objects:
      phi : rank-1 array('d') with bounds (nn)


Debugging
---------

For debugging, use the --debug-capi option to f2py.  This causes the extension
modules to print detailed information while in operation.  In distutils, this
must be passed as an option in the f2py_options to the Extension constructor.


Wrapping C codes with f2py
--------------------------

Below is Pearu Peterson's (the f2py author) response to a question about using
f2py to wrap existing C codes.  While SWIG provides similar functionality and
weave is perfect for inlining C, f2py seems to be an incredibly simple and
convenient tool for wrapping C libraries.

Pearu's response follows:

First, one cannot scan C sources with f2py as it lacks C parser.
Therefore, when wrapping C functions you must manually write the
corresponding signature file where the signatures of C functions are
represented by the signatures of equivalent Fortran functions.

For example, consider the following C file::

    /* foo.c */
    double foo(double *x, int n) {
      int i;
      double r = 0;
      for (i=0;i<n;++i)
        r += x[i];
      return r;
    }
    /* EOF foo.c */

To wrap the C function foo() with f2py, create the following signature
file bar.pyf::

    ! -*- F90 -*-
    python module bar
      interface
        real*8 function foo(x,n)
          intent(c) foo
          real*8 dimension(n),intent(in) :: x
          integer intent(c,hide),depend(x) :: n = len(x)
        end function foo
      end interface
    end python module bar
    ! EOF bar.pyf

(see usersguide for more info about intent(c)) and run::

    f2py -c bar.pyf foo.c

Finally, in Python::

    >>> import bar
    >>> bar.foo([1,2,3])
    6.0

The f2py mailing list archives contain more examples about wrapping C
functions with f2py.


Passing offset arrays to Fortran routines
-----------------------------------------

It is possible to pass offset arrays (like pointers to the middle of other
arrays) by using NumPy's slice notation.

The print_dvec function below simply prints its argument as "print*,'x',x".
We show some examples of how it behaves with both 1 and 2-d arrays::

    In [3]: x
    Out[3]: array([ 2.8,  3.4,  4.1])

    In [4]: tf.print_dvec(x)
     n 3
     x  2.8  3.4  4.1

    In [5]: tf.print_dvec ?
    Type:           fortran
    String Form:    <fortran object at 0x8306fe8>
    Namespace:      Currently not defined in user session.
    Docstring:
        print_dvec - Function signature:
          print_dvec(x,[n])
        Required arguments:
          x : input rank-1 array('d') with bounds (n)
        Optional arguments:
          n := len(x) input int


    In [6]: tf.print_dvec (x[1])
     n 1
     x  3.4

    In [7]: tf.print_dvec (x[1:])
     n 2
     x  3.4  4.1

    In [8]: A
    Out[8]:
    array([[ 3.5,  5.6,  8.2],
           [ 2.1,  4.5,  1.2],
           [ 6.3,  3.4,  3.1]])

    In [9]: tf.print_dvec(A)
     n 9
     x  3.5  5.6  8.2  2.1  4.5  1.2  6.3  3.4  3.1

    In [10]: A
    Out[10]:
    array([[ 3.5,  5.6,  8.2],
           [ 2.1,  4.5,  1.2],
           [ 6.3,  3.4,  3.1]])

    In [11]: tf.print_dvec(A[1:])
     n 6
     x  2.1  4.5  1.2  6.3  3.4  3.1

    In [12]: A[1:]
    Out[12]:
    array([[ 2.1,  4.5,  1.2],
           [ 6.3,  3.4,  3.1]])

    In [13]: A[1:,1:]
    Out[13]:
    array([[ 4.5,  1.2],
           [ 3.4,  3.1]])

    In [14]: tf.print_dvec(A[1:,1:])
     n 4
     x  4.5  1.2  3.4  3.1


On matrix ordering and in-memory copies
---------------------------------------

NumPy (which f2py relies on) is C-based, and therefore its arrays are
stored in row-major order.  Fortran stores its arrays in column-major order.
This means that copying issues must be dealt with.  Below we reproduce some
comments from Pearu on this topic given in the f2py mailing list in
June/2002::

    > So if I use intent(in,out) and I want to avoid any copying etc. when I
    > hand a matrix to a Fortran subroutine then I have to declare the matrix
    > with the correct (Fortran) dimensions
                                 ^^^^^^^^^^ - wrong
    Should be 'correct (Fortran) data ordering', dimensions are the same both
    in Fortran and Python.

    > and typecode in python and then pass the (lazy) transpose of this
    > matrix to the wrapper since that checks whether the matrix is
    > contiguous after another lazy transpose (the first
    > set_transpose_strides in the <<< marked region). Is that correct?

    No. To avoid copying, you should create array that has internally Fortran
    data ordering. This is achived, for example, by reading/creating your data
    in Fortran ordering to NumPy array and then doing NumPy.transpose on
    that. Every f2py generated extension module provides also function

      has_column_major_storage

    to check if an array is Fortran contiguous or not. If
    has_column_major_storage(arr) returns true then there will be no copying
    for the array arr if passed to f2py generated functions (assuming that
    the types are proper, of cource).

    Also note that copying done by f2py generated interface is carried out in
    C on the raw data and therefore it is extremely fast compared to if you
    would make a copy in Python, even when using NumPy. Tests with say
    1000x1000 matrices show that there is no noticable performance hit when
    copying is carried out, in fact, sometimes making a copy may speed up
    things a bit -- I was quite surprised about that myself.

    So, I think, you should worry about copying only if the sizes of matrices
    are really large, say, larger than 5000x5000 and efficient memory usage is
    relevant. The time spent for copying is negligible even for large arrays
    provided that your computer has plenty of memory (>=256MB).


Distutils
---------

Below is an example setup.py file which generates a Python extension module
from Fortran90 sources and a .pyf interface file generated by f2py and later
fine tuned::

    #!/usr/bin/env python
    """Setup script for F2PY-processed, Fortran based extension modules.

    A typical call is:

    % ./setup.py install --home=~/usr

    This will build and install the generated modules in ~/usr/lib/python.

    If called with no args, the script defaults to the above call form (it
    automatically adds the 'install --home=~/usr' options)."""

    # Global variables for this extension:
    name         = "mwadap_tools"  # name of the generated python extension 
(.so)
    description  = "F2PY-wrapped MultiWavelet Tree Toolbox"
    author       = "Fast Algorithms Group - CU Boulder"
    author_email = "[EMAIL PROTECTED]"

    # Necessary sources, _including_ the .pyf interface file
    sources = """
    binary_decomp.f90 binexpandx.f90 bitsequence.f90 constructwv.f90
    display_matrix.f90 findkeypos.f90 findlevel.f90 findnodx.f90 gauleg.f90
    gauleg2.f90 gauleg3.f90 ihpsort.f90 invert_f2cmatrix.f90 keysequence2d.f90
    level_of_nsi.f90 matmult.f90 plegnv.f90 plegvec.f90 r2norm.f90 xykeys.f90

    mwadap_tools.pyf""".split()

    # Additional libraries required by our extension module (these will be 
linked
    # in with -l):
    libraries = ['m']

    # Set to true (1) to turn on Fortran/C API debugging (very verbose)
    debug_capi = 0

    #***************************************************************************
    # Do not modify the code below unless you know what you are doing.

    # Required modules
    import sys,os
    from os.path import expanduser,expandvars
    from scipy_distutils.core import setup,Extension

    expand_sh = lambda path: expanduser(expandvars(path))

    # Additional directories for libraries (besides the compiler's defaults)
    fc_vendor = os.environ.get('FC_VENDOR','Gnu').lower()
    library_dirs = ["~/usr/lib/"+fc_vendor]

    # Modify default arguments (if none are supplied) to install in ~/usr
    if len(sys.argv)==1:
        default_args = 'install --home=~/usr'
        print '*** Adding default arguments to setup:',default_args
        sys.argv += default_args.split()  # it must be a list

    # Additional options specific to f2py:
    f2py_options = []
    if debug_capi:
        f2py_options.append('--debug-capi')

    # Define the extension module(s)
    extension = Extension(name = name,
                          sources = sources,
                          libraries = libraries,
                          library_dirs = map(expand_sh,library_dirs),
                          f2py_options = f2py_options,
                          )

    # Call the actual building/installation routine, in usual distutils form.
    setup(name = name,
          description  = description,
          author       = author,
          author_email = author_email,
          ext_modules  = [extension],
          )


Makefile
--------

Below is an example of a makefile for f2py.  In the long term, it's probably
better to rely on distutils instead.  Kept here for reference purposes::

    # Building python extension modules with f2py
    #
    # Note that this has been superceded by using setup.py, which relies
    # on distutils and gets all the proper python things done.
    #
    F2PY_BUILDDIR = .
    F2PY_FLAGS = --fcompiler=Gnu
    F2PY = f2py -c --build-dir $(F2PY_BUILDDIR) $(F2PY_FLAGS)

    # Additional libraries needed for the f2py extension modules
    F2PY_LIBS = -lio

    gauss.so: $(SOURCES)
            $(F2PY) $^ $(F2PY_LIBS) -m $(basename $@)
            mv $@ $(PY_DIRLIB)

    phipol.so: phipol.f legepols.f
            $(F2PY) $^ $(F2PY_LIBS) -m $(basename $@)
            mv $@ $(PY_DIRLIB)

    f2py_clean:
            rm -f $(F2PY_BUILDDIR)/*.o $(F2PY_BUILDDIR)/*module.c \
            $(F2PY_BUILDDIR)/*-f2pywrappers.f *~


Fortran90 Issues
----------------

While f2py supports most Fortran90 constructs, not all of the language is
supported.  Currently (april 2003), 'type' statemetns are NOT supported, so if
you have modules/routines with type statements, you'll need to write wrapper
routines which hide these.


ToDo
----

These are things I either need to find out about or which aren't yet possible
with f2py.

- Adding documentation to a variable's signature entry in the auto-generated
docstring. Also adding a global note to the docstring about the function
itself.  (From Pearu's response, this isn't implemented in f2py yet).

- Overriding the auto-generated compiler flags?

- Multiple entry points?

- Benchmark the cost of:

  * having checks (size checks esp.)
  * building work areas hidden from the user
  * building return values as python objects (NumPy arrays) instead of
working in-place.

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] f2py - a recap

Reply via email to