Howdy, On Wed, Jul 23, 2008 at 3:18 PM, Stéfan van der Walt <[EMAIL PROTECTED]> wrote: > 2008/7/23 Fernando Perez <[EMAIL PROTECTED]>:
> I agree (with your previous e-mail) that it would be good to have some > documentation, so if you could give me some pointers on *what* to > document (I haven't used f2py much), then I'll try my best to get > around to it. Well, I think my 'recap' message earlier in this thread points to a few issues that can probably be addressed quickly (the 'instead' error in the help, the doc/docs dichotomy needs to be cleaned up so a single documentation directory exists, etc). I'm also attaching a set of very old notes I wrote years ago on f2py that you are free to use in any way you see fit. I gave them a 2-minute rst treatment but didn't edit them at all, so they may be somewhat outdated (I originally wrote them in 2002 I think). If Pearu has moved to greener pastures, f2py could certainly use an adoptive parent. It happens to be a really important piece of infrastructure and for the most part it works fairly well. I think a litlte bit of cleanup/doc integration with the rest of numpy is probably all that's needed, so it could be a good project for someone to adopt that would potentially be low-demand yet quite useful. Cheers, f
F2PY ==== Author: Fernando Perez <[EMAIL PROTECTED]>. Please send all comments/corrections to this address. This is a rough set of notes on how to use f2py. It does NOT substitute the official manual, but is rather meant to be used alongside with it. For any non-trivial poject involving f2py, one should also keep at hand Pierre Schnizer's excellent 'A short introduction to F2PY', available from http://fubphpc.tu-graz.ac.at/~pierre/f2py_tutorial.tar.gz Usage for the impatient ----------------------- Start by building a scratch signature file automatically from your Fortran sources (in this case all, you can choose only those .f files you need):: f2py -m MODULENAME -h MODULENAME.pyf *.f This writes the file MODULENAME.pyf, making the best guesses it can from the Fortran sources. It builds an interface for the module to be accessed as 'import adap1d' from python. You will then edit the .pyf file to fine-tune the python interface exhibited by the resulting extension. This means for example making unnecessary scratch areas or array dimensions hidden, or making certain parameters be optional and take a default value. Then, write your setup.py file using distutils, and list the .pyf file along with the Fortran sources it is meant to wrap. f2py will build the module for you automatically, respecting all the interface specifications you made in the .pyf file. This approach is ultimately far easier than trying to get all the declarations (especially dependencies) right through Cf2py directives in the Fortran sources. While that may seem appealing at first, experience seems to show that it's ultimately far more time-consuming and prone to subtle errors. Using this approach, the first f2py pass can do the bulk of the interface writing and only fine-tuning needs to be done manually. I would only recommend embedded Cf2py directives for very simple problems (where it works very well). The only drawback of this approach is that the interface and the original Fortran source lie in different files, which need to be kept in sync. This increases a bit the chances of forgetting to update the .pyf file if the Fortran interface changes (adding a parameter, for example). However, the benefit of having explicit, clear control over f2py's behavior far outweighs this concern. Choosing a default compiler --------------------------- Set the FC_VENDOR environment variable. This will then prevent f2py from testing all the compilers it knows about. Using Cf2py directives ---------------------- For simpler cases you may choose to go the route of Cf2py directives. Below are some tips and examples for this approach. Here's the signature of a simple Fortran routine:: subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk) implicit real *8 (a-h, o-z) real *8 nodes(*),wei(*),x(*),wrk(*),phi(*) real *8 sum, one, two, half The above is correctly handled by f2py, but it can't know what is meant to be input/output and what the relations between the various variables are (such as integers which are array dimensions). If we add the following f2py directives, the generated python interface is a lot nicer:: c subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk) c c Lines with Cf2py in them are directives for f2py to generate a better c python interface. These must come _before_ the Fortran variable c declarations so we can control the dimension of the arrays in Python. c c Inputs: Cf2py integer check(0<=j && j<mm),depend(mm) :: j Cf2py real *8 dimension(mm),intent(in) :: nodes Cf2py real *8 dimension(mm),intent(in) :: wei Cf2py real *8 dimension(nn),intent(in) :: x c c Outputs: Cf2py real *8 dimension(nn),intent(out),depend(nn) :: phi c c Hidden args: c - scratch areas can be auto-generated by python Cf2py real *8 dimension(2*mm+2),intent(hide,cache),depend(mm) :: wrk c - array sizes can be auto-determined Cf2py integer intent(hide),depend(x):: nn=len(x) Cf2py integer intent(hide),depend(nodes) :: mm = len(nodes) c implicit real *8 (a-h, o-z) real *8 nodes(*),wei(*),x(*),wrk(*),phi(*) real *8 sum, one, two, half c Some comments on the above: - The f2py directives should come immediately after the 'subroutine' line and before the Fortran variable lines. This allows the f2py dimension directives to override the Fortran var(*) directives. - If the Fortran code uses var(N) instead of var(*), the f2py directives can be placed after the Fortran declarations. This mode is preferred, as there is less redundancy overall. The result is much simpler:: c subroutine phipol(j,mm,nodes,wei,nn,x,phi,wrk) c implicit real *8 (a-h, o-z) real *8 nodes(mm),wei(mm),x(nn),wrk(2*mm),phi(nn) real *8 sum, one, two, half c c The Cf2py lines allow f2py to generate a better Python interface. c c Inputs: Cf2py integer check(0<=j && j<mm),depend(mm) :: j Cf2py intent(in) :: nodes Cf2py intent(in) :: wei Cf2py intent(in) :: x c c Outputs: Cf2py intent(out) :: phi c c Hidden args: c - scratch areas can be auto-generated by python Cf2py intent(hide,cache) :: wrk c - array sizes can be auto-determined Cf2py integer intent(hide),depend(x):: nn=len(x) Cf2py integer intent(hide),depend(nodes) :: mm = len(nodes) c - Since python can automatically manage memory, it is possible to hide the need for manually passed 'work' areas. The C/python wrapper to the underlying fortran routine will allocate the memory for the needed work areas on the fly. This is done by specifying intent(hide,cache). 'hide' tells f2py to remove the variable from the argument list and 'cache' tells it to auto-generate it. In cases where the allocation cost becomes a performance problem, one can remove the 'hide' part and make it an optional argument. In this case it will only be generated if not given. For this, the line above should be changed to:: Cf2py real *8 dimension(2*mm+2),intent(cache),optional,depend(mm) :: wrk Note that this should only be done after _proving_ that the scratch areas are causing a performance problem. The 'cache' directive causes f2py to keep cached copies of the scratch areas, so no unnecessary mallocs should be triggered. - Since f2py relies on NumPy arrays, all dimensions can be determined from the arrays themselves and it is not necessary to pass them explicitly. With all this, the resulting f2py-generated docstring becomes:: phipol - Function signature: phi = phipol(j,nodes,wei,x) Required arguments: j : input int nodes : input rank-1 array('d') with bounds (mm) wei : input rank-1 array('d') with bounds (mm) x : input rank-1 array('d') with bounds (nn) Return objects: phi : rank-1 array('d') with bounds (nn) Debugging --------- For debugging, use the --debug-capi option to f2py. This causes the extension modules to print detailed information while in operation. In distutils, this must be passed as an option in the f2py_options to the Extension constructor. Wrapping C codes with f2py -------------------------- Below is Pearu Peterson's (the f2py author) response to a question about using f2py to wrap existing C codes. While SWIG provides similar functionality and weave is perfect for inlining C, f2py seems to be an incredibly simple and convenient tool for wrapping C libraries. Pearu's response follows: First, one cannot scan C sources with f2py as it lacks C parser. Therefore, when wrapping C functions you must manually write the corresponding signature file where the signatures of C functions are represented by the signatures of equivalent Fortran functions. For example, consider the following C file:: /* foo.c */ double foo(double *x, int n) { int i; double r = 0; for (i=0;i<n;++i) r += x[i]; return r; } /* EOF foo.c */ To wrap the C function foo() with f2py, create the following signature file bar.pyf:: ! -*- F90 -*- python module bar interface real*8 function foo(x,n) intent(c) foo real*8 dimension(n),intent(in) :: x integer intent(c,hide),depend(x) :: n = len(x) end function foo end interface end python module bar ! EOF bar.pyf (see usersguide for more info about intent(c)) and run:: f2py -c bar.pyf foo.c Finally, in Python:: >>> import bar >>> bar.foo([1,2,3]) 6.0 The f2py mailing list archives contain more examples about wrapping C functions with f2py. Passing offset arrays to Fortran routines ----------------------------------------- It is possible to pass offset arrays (like pointers to the middle of other arrays) by using NumPy's slice notation. The print_dvec function below simply prints its argument as "print*,'x',x". We show some examples of how it behaves with both 1 and 2-d arrays:: In [3]: x Out[3]: array([ 2.8, 3.4, 4.1]) In [4]: tf.print_dvec(x) n 3 x 2.8 3.4 4.1 In [5]: tf.print_dvec ? Type: fortran String Form: <fortran object at 0x8306fe8> Namespace: Currently not defined in user session. Docstring: print_dvec - Function signature: print_dvec(x,[n]) Required arguments: x : input rank-1 array('d') with bounds (n) Optional arguments: n := len(x) input int In [6]: tf.print_dvec (x[1]) n 1 x 3.4 In [7]: tf.print_dvec (x[1:]) n 2 x 3.4 4.1 In [8]: A Out[8]: array([[ 3.5, 5.6, 8.2], [ 2.1, 4.5, 1.2], [ 6.3, 3.4, 3.1]]) In [9]: tf.print_dvec(A) n 9 x 3.5 5.6 8.2 2.1 4.5 1.2 6.3 3.4 3.1 In [10]: A Out[10]: array([[ 3.5, 5.6, 8.2], [ 2.1, 4.5, 1.2], [ 6.3, 3.4, 3.1]]) In [11]: tf.print_dvec(A[1:]) n 6 x 2.1 4.5 1.2 6.3 3.4 3.1 In [12]: A[1:] Out[12]: array([[ 2.1, 4.5, 1.2], [ 6.3, 3.4, 3.1]]) In [13]: A[1:,1:] Out[13]: array([[ 4.5, 1.2], [ 3.4, 3.1]]) In [14]: tf.print_dvec(A[1:,1:]) n 4 x 4.5 1.2 3.4 3.1 On matrix ordering and in-memory copies --------------------------------------- NumPy (which f2py relies on) is C-based, and therefore its arrays are stored in row-major order. Fortran stores its arrays in column-major order. This means that copying issues must be dealt with. Below we reproduce some comments from Pearu on this topic given in the f2py mailing list in June/2002:: > So if I use intent(in,out) and I want to avoid any copying etc. when I > hand a matrix to a Fortran subroutine then I have to declare the matrix > with the correct (Fortran) dimensions ^^^^^^^^^^ - wrong Should be 'correct (Fortran) data ordering', dimensions are the same both in Fortran and Python. > and typecode in python and then pass the (lazy) transpose of this > matrix to the wrapper since that checks whether the matrix is > contiguous after another lazy transpose (the first > set_transpose_strides in the <<< marked region). Is that correct? No. To avoid copying, you should create array that has internally Fortran data ordering. This is achived, for example, by reading/creating your data in Fortran ordering to NumPy array and then doing NumPy.transpose on that. Every f2py generated extension module provides also function has_column_major_storage to check if an array is Fortran contiguous or not. If has_column_major_storage(arr) returns true then there will be no copying for the array arr if passed to f2py generated functions (assuming that the types are proper, of cource). Also note that copying done by f2py generated interface is carried out in C on the raw data and therefore it is extremely fast compared to if you would make a copy in Python, even when using NumPy. Tests with say 1000x1000 matrices show that there is no noticable performance hit when copying is carried out, in fact, sometimes making a copy may speed up things a bit -- I was quite surprised about that myself. So, I think, you should worry about copying only if the sizes of matrices are really large, say, larger than 5000x5000 and efficient memory usage is relevant. The time spent for copying is negligible even for large arrays provided that your computer has plenty of memory (>=256MB). Distutils --------- Below is an example setup.py file which generates a Python extension module from Fortran90 sources and a .pyf interface file generated by f2py and later fine tuned:: #!/usr/bin/env python """Setup script for F2PY-processed, Fortran based extension modules. A typical call is: % ./setup.py install --home=~/usr This will build and install the generated modules in ~/usr/lib/python. If called with no args, the script defaults to the above call form (it automatically adds the 'install --home=~/usr' options).""" # Global variables for this extension: name = "mwadap_tools" # name of the generated python extension (.so) description = "F2PY-wrapped MultiWavelet Tree Toolbox" author = "Fast Algorithms Group - CU Boulder" author_email = "[EMAIL PROTECTED]" # Necessary sources, _including_ the .pyf interface file sources = """ binary_decomp.f90 binexpandx.f90 bitsequence.f90 constructwv.f90 display_matrix.f90 findkeypos.f90 findlevel.f90 findnodx.f90 gauleg.f90 gauleg2.f90 gauleg3.f90 ihpsort.f90 invert_f2cmatrix.f90 keysequence2d.f90 level_of_nsi.f90 matmult.f90 plegnv.f90 plegvec.f90 r2norm.f90 xykeys.f90 mwadap_tools.pyf""".split() # Additional libraries required by our extension module (these will be linked # in with -l): libraries = ['m'] # Set to true (1) to turn on Fortran/C API debugging (very verbose) debug_capi = 0 #*************************************************************************** # Do not modify the code below unless you know what you are doing. # Required modules import sys,os from os.path import expanduser,expandvars from scipy_distutils.core import setup,Extension expand_sh = lambda path: expanduser(expandvars(path)) # Additional directories for libraries (besides the compiler's defaults) fc_vendor = os.environ.get('FC_VENDOR','Gnu').lower() library_dirs = ["~/usr/lib/"+fc_vendor] # Modify default arguments (if none are supplied) to install in ~/usr if len(sys.argv)==1: default_args = 'install --home=~/usr' print '*** Adding default arguments to setup:',default_args sys.argv += default_args.split() # it must be a list # Additional options specific to f2py: f2py_options = [] if debug_capi: f2py_options.append('--debug-capi') # Define the extension module(s) extension = Extension(name = name, sources = sources, libraries = libraries, library_dirs = map(expand_sh,library_dirs), f2py_options = f2py_options, ) # Call the actual building/installation routine, in usual distutils form. setup(name = name, description = description, author = author, author_email = author_email, ext_modules = [extension], ) Makefile -------- Below is an example of a makefile for f2py. In the long term, it's probably better to rely on distutils instead. Kept here for reference purposes:: # Building python extension modules with f2py # # Note that this has been superceded by using setup.py, which relies # on distutils and gets all the proper python things done. # F2PY_BUILDDIR = . F2PY_FLAGS = --fcompiler=Gnu F2PY = f2py -c --build-dir $(F2PY_BUILDDIR) $(F2PY_FLAGS) # Additional libraries needed for the f2py extension modules F2PY_LIBS = -lio gauss.so: $(SOURCES) $(F2PY) $^ $(F2PY_LIBS) -m $(basename $@) mv $@ $(PY_DIRLIB) phipol.so: phipol.f legepols.f $(F2PY) $^ $(F2PY_LIBS) -m $(basename $@) mv $@ $(PY_DIRLIB) f2py_clean: rm -f $(F2PY_BUILDDIR)/*.o $(F2PY_BUILDDIR)/*module.c \ $(F2PY_BUILDDIR)/*-f2pywrappers.f *~ Fortran90 Issues ---------------- While f2py supports most Fortran90 constructs, not all of the language is supported. Currently (april 2003), 'type' statemetns are NOT supported, so if you have modules/routines with type statements, you'll need to write wrapper routines which hide these. ToDo ---- These are things I either need to find out about or which aren't yet possible with f2py. - Adding documentation to a variable's signature entry in the auto-generated docstring. Also adding a global note to the docstring about the function itself. (From Pearu's response, this isn't implemented in f2py yet). - Overriding the auto-generated compiler flags? - Multiple entry points? - Benchmark the cost of: * having checks (size checks esp.) * building work areas hidden from the user * building return values as python objects (NumPy arrays) instead of working in-place.
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion