[Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Anne Archibald
Hi,

Suppose I have a C function,
double logsum(double a, double b);
What is needed to produce a ufunc object whose elementwise operation
is done by calling this function?

Also, is there a way to take a python function and automatically make
a ufunc out of it? (No, vectorize doesn't implement reduce(),
accumulate(), reduceat(), or outer().)

Thanks,
Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Alan McIntyre
On Sun, May 11, 2008 at 2:04 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
>  Also, is there a way to take a python function and automatically make
>  a ufunc out of it? (No, vectorize doesn't implement reduce(),
>  accumulate(), reduceat(), or outer().)

I've not used it, but have a look at numpy.frompyfunc; its docstring
suggests it turns arbitrary Python functions into ufuncs.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Anne Archibald
2008/5/11 Alan McIntyre <[EMAIL PROTECTED]>:
> On Sun, May 11, 2008 at 2:04 PM, Anne Archibald
> <[EMAIL PROTECTED]> wrote:
>>  Also, is there a way to take a python function and automatically make
>>  a ufunc out of it? (No, vectorize doesn't implement reduce(),
>>  accumulate(), reduceat(), or outer().)
>
> I've not used it, but have a look at numpy.frompyfunc; its docstring
> suggests it turns arbitrary Python functions into ufuncs.

Ah, thanks. Shame it produces object arrays though.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Robert Kern
On Sun, May 11, 2008 at 1:04 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Suppose I have a C function,
> double logsum(double a, double b);
> What is needed to produce a ufunc object whose elementwise operation
> is done by calling this function?

Basically, you need 3 arrays: functions implementing the type-specific
inner loops, void* extra data to pass to these functions, and an array
of arrays containing the type signatures of the ufunc. In numpy, we
already have generic implementations of the loop functions for common
combinations of types. In your case, for a binary function taking two
doubles and returning a double, we have PyUFunc_dd_d(). As its extra
void* data, it takes a function pointer that actually implements the
element-wise operation. So lets start making the arrays:

static char logsum_sigs[] = {
NPY_FLOAT64, NPY_FLOAT64, NPY_FLOAT64
};
static PyUFuncGenericFunction logsum_functions[] = {PyUFunc_dd_d};
static void* logsum_data[] = {logsum};
static char logsum_doc[] = "Some kind of docstring.";

Now in your initmodule() function, we will call PyUFunc
FromFuncAndData() with this information.

PyMODINIT_FUNC initmodule(void)
{
PyObject *m, *f, *d;
m = Py_InitModule3("module", module_methods,
"Some new ufuncs.\n"
);
if (m == NULL)
return;
d = PyModule_GetDict(m);
if (d == NULL)
return;
import_array();
import_umath();

f = PyUFunc_FromFuncAndData(
logsum_functions,
logsum_data,
logsum_sigs,
1,  // The number of type signatures.
2,  // The number of inputs.
1,  // The number of outputs.
PyUFunc_None,  // The identity element for reduction.
   // No good one to use for this function,
   // unfortunately.
"logsum",  // The name of the ufunc.
logsum_doc,
0  // Dummy for API backwards compatibility.
);
PyDict_SetItemString(d, "logsum", f);
Py_DECREF(f);
}

Since double functions can easily be used on float32's, too, you can
also add PyUFunc_ff_f to the logsum_functions array and

  NPY_FLOAT32, NPY_FLOAT32, NPY_FLOAT32,

to the corresponding place in the logsum_sigs array. Just bump the
number of signatures up to 2.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Anne Archibald
2008/5/11 Robert Kern <[EMAIL PROTECTED]>:

> Basically, you need 3 arrays: functions implementing the type-specific
> inner loops, void* extra data to pass to these functions, and an array
> of arrays containing the type signatures of the ufunc. In numpy, we
> already have generic implementations of the loop functions for common
> combinations of types. In your case, for a binary function taking two
> doubles and returning a double, we have PyUFunc_dd_d(). As its extra
> void* data, it takes a function pointer that actually implements the
> element-wise operation. So lets start making the arrays:

Great! Thanks!

Is it possible to provide a specialized implementation of reduce()?
(Since reduce() can be implemented more efficiently than doing it
pairwise.)

>PyUFunc_None,  // The identity element for reduction.
>   // No good one to use for this function,
>   // unfortunately.

Is it safe to use minus infinity, or is this going to give people all
kinds of warnings if they have seterr() set?

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Robert Kern
On Sun, May 11, 2008 at 9:50 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> 2008/5/11 Robert Kern <[EMAIL PROTECTED]>:
>
>> Basically, you need 3 arrays: functions implementing the type-specific
>> inner loops, void* extra data to pass to these functions, and an array
>> of arrays containing the type signatures of the ufunc. In numpy, we
>> already have generic implementations of the loop functions for common
>> combinations of types. In your case, for a binary function taking two
>> doubles and returning a double, we have PyUFunc_dd_d(). As its extra
>> void* data, it takes a function pointer that actually implements the
>> element-wise operation. So lets start making the arrays:
>
> Great! Thanks!
>
> Is it possible to provide a specialized implementation of reduce()?
> (Since reduce() can be implemented more efficiently than doing it
> pairwise.)

I don't think so, no.

>>PyUFunc_None,  // The identity element for reduction.
>>   // No good one to use for this function,
>>   // unfortunately.
>
> Is it safe to use minus infinity, or is this going to give people all
> kinds of warnings if they have seterr() set?

Perhaps, but ufuncs only allow 0 or 1 for this value, currently.

Also, I was wrong about using PyUFunc_ff_f. Instead, use PyUFunc_ff_f_As_dd_d.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Anne Archibald
2008/5/11 Robert Kern <[EMAIL PROTECTED]>:

> Perhaps, but ufuncs only allow 0 or 1 for this value, currently.

That's a shame, minus infinity is the identity for maximum too.

> Also, I was wrong about using PyUFunc_ff_f. Instead, use PyUFunc_ff_f_As_dd_d.

Hmm. Well, I tried implementing logsum(), and it turns out neither
PyUFunc_dd_d nor any of the other functions - mostly flagged as being
part of the ufunc API and described in the automatically-generated
ufunc_api.txt - are exported. All are static and not described in any
header I could find. That said, I know scipy defines its own ufuncs,
so it must either reimplement these or have some way I hadn't thought
of to get at them.

I've attached a patch to add the ufunc logsum to numpy. It's a bit
nasty for several reasons:
* I wasn't sure where to put it, so I put it in _compiled_base.c in
numpy/lib/src. I was very hesitant to touch umathmodule.c.src with its
sui generis macro language.
* I'm not sure what to do about log1p - it seems to be available in
spite of HAVE_LOG1P not being defined. In any case if it doesn't exist
it seems crazy to implement it again here.
* since PyUFunc_dd_d does not seem to be exported, I just cut-n-pasted
its code here. Obviously not a solution, but what are extension
writers supposed to do?

Do we want a ufunc version of logsum() in numpy? I got the impression
that in spite of some people's doubts as to its utility there was a
strong contingent of users who do want it. This plus a logdot() would
cover most of the operations one might want to do on numbers in log
representation. (Maybe subtraction?). It could be written in pure
python, for the most part, and reduce() would save a few exps and logs
at the cost of a temporary, but accumulate() remains a challenge. (I
don't expect many people feel the need for reduceat()).

Anne

P.S. it was surprisingly difficult to persuade numpy to find my tests.
I suppose that's what the proposed transition to nose is for? -A
Index: numpy/lib/src/_compiled_base.c
===
--- numpy/lib/src/_compiled_base.c	(revision 5155)
+++ numpy/lib/src/_compiled_base.c	(working copy)
@@ -1,6 +1,7 @@
 #include "Python.h"
 #include "structmember.h"
 #include "numpy/noprefix.h"
+#include "numpy/ufuncobject.h"
 
 static PyObject *ErrorObject;
 #define Py_Try(BOOLEAN) {if (!(BOOLEAN)) goto fail;}
@@ -835,16 +836,68 @@
 return;
 }
 
+#if !defined(HAVE_LOG1P) && 0
+/* logsum ufunc */
+static double log1p(double x) /* is this really necessary? */
+{
+double u = 1. + x;
+if (u == 1.0) {
+return x;
+} else {
+return log(u) * x / (u-1.);
+}
+}
+#endif HAVE_LOG1P
+
+
+static double logsum(double x, double y) {
+if (x>y) {
+return x+log1p(exp(y-x)); 
+} else {
+return y+log1p(exp(x-y));
+}
+};
+static void
+logsum_dd_d(char **args, intp *dimensions, intp *steps)
+{
+intp n = dimensions[0];
+intp is1 = steps[0];
+intp is2 = steps[1];
+intp os = steps[2];
+char *ip1 = args[0];
+char *ip2 = args[1];
+char *op = args[2];
+intp i;
+
+for(i = 0; i < n; i++, ip1 += is1, ip2 += is2, op += os) {
+double *in1 = (double *)ip1;
+double *in2 = (double *)ip2;
+double *out = (double *)op;
+
+*out = logsum(*in1, *in2);
+}
+}
+
+
+static char logsum_sigs[] = {
+   NPY_FLOAT64, NPY_FLOAT64, NPY_FLOAT64
+};
+static PyUFuncGenericFunction logsum_functions[] = {logsum_dd_d};
+static void* logsum_data[] = {logsum};
+static char logsum_doc[] = "Compute log(exp(x)+exp(y)) without numeric overflows and underflows";
+
+
 /* Initialization function for the module (*must* be called init) */
 
 PyMODINIT_FUNC init_compiled_base(void) {
-PyObject *m, *d, *s;
+PyObject *m, *d, *s, *f;
 
 /* Create the module and add the functions */
 m = Py_InitModule("_compiled_base", methods);
 
 /* Import the array objects */
 import_array();
+import_umath();
 
 /* Add some symbolic constants to the module */
 d = PyModule_GetDict(m);
@@ -857,6 +910,23 @@
 PyDict_SetItemString(d, "error", ErrorObject);
 Py_DECREF(ErrorObject);
 
+/* set up logsum ufunc */
+   f = PyUFunc_FromFuncAndData(
+   logsum_functions,
+   logsum_data,
+   logsum_sigs,
+   1,  // The number of type signatures.
+   2,  // The number of inputs.
+   1,  // The number of outputs.
+   PyUFunc_None,  // The identity element for reduction.
+  // No good one to use for this function,
+  // unfortunately.
+   "logsum",  // The name of the ufunc.
+   logsum_doc,
+   0  // Dummy for API backwards compatibility.
+   );
+   PyDict_SetItemString(d, "logsum", f);
+   Py_DECREF(f);
 
 /* define PyGetSetDescr_Type and PyMemberDescr_Type */
 define_types();
Index: numpy/lib/tests/test_ufunclike.py
===

Re: [Numpy-discussion] Writing new ufuncs

2008-05-11 Thread Robert Kern
On Mon, May 12, 2008 at 12:37 AM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> 2008/5/11 Robert Kern <[EMAIL PROTECTED]>:
>
>> Perhaps, but ufuncs only allow 0 or 1 for this value, currently.
>
> That's a shame, minus infinity is the identity for maximum too.
>
>> Also, I was wrong about using PyUFunc_ff_f. Instead, use 
>> PyUFunc_ff_f_As_dd_d.
>
> Hmm. Well, I tried implementing logsum(), and it turns out neither
> PyUFunc_dd_d nor any of the other functions - mostly flagged as being
> part of the ufunc API and described in the automatically-generated
> ufunc_api.txt - are exported. All are static and not described in any
> header I could find. That said, I know scipy defines its own ufuncs,
> so it must either reimplement these or have some way I hadn't thought
> of to get at them.

They are not exported as symbols. They are "exported" to other
extension modules by #defining them to an element in an array just
like the rest of the numpy C API. import_ufunc() sets up all of those
#defines. They are automatically generated into the file
__ufunc_api.h.

> I've attached a patch to add the ufunc logsum to numpy. It's a bit
> nasty for several reasons:
> * I wasn't sure where to put it, so I put it in _compiled_base.c in
> numpy/lib/src. I was very hesitant to touch umathmodule.c.src with its
> sui generis macro language.

The place to add it would be in code_generators/generate_umath.py,
which generates __umath_generated.c.

> * I'm not sure what to do about log1p - it seems to be available in
> spite of HAVE_LOG1P not being defined. In any case if it doesn't exist
> it seems crazy to implement it again here.

Then maybe our test for HAVE_LOG1P is not correct. I don't think we
can rely on its omnipresence, though.

> * since PyUFunc_dd_d does not seem to be exported, I just cut-n-pasted
> its code here. Obviously not a solution, but what are extension
> writers supposed to do?

See above.

> Do we want a ufunc version of logsum() in numpy? I got the impression
> that in spite of some people's doubts as to its utility there was a
> strong contingent of users who do want it.

Well, many of us were simply contending Chuck's contention that one
shouldn't need the log representation. I would probably toss it into
scipy.special, myself.

> This plus a logdot() would
> cover most of the operations one might want to do on numbers in log
> representation. (Maybe subtraction?). It could be written in pure
> python, for the most part, and reduce() would save a few exps and logs
> at the cost of a temporary, but accumulate() remains a challenge. (I
> don't expect many people feel the need for reduceat()).
>
> Anne
>
> P.S. it was surprisingly difficult to persuade numpy to find my tests.
> I suppose that's what the proposed transition to nose is for? -A

Yes.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion