Convention for C functions success/failure

2005-12-03 Thread spam . noam
Hello,

What is the convention for writing C functions which don't return a
value, but can fail?

If I understand correctly,
1. PyArg_ParseTuple returns 0 on failure and 1 on success.
2. PySet_Add returns -1 on failure and 0 on success.

Am I correct? What should I do with new C functions that I write?

Thanks,
Noam

-- 
http://mail.python.org/mailman/listinfo/python-list


Why keep identity-based equality comparison?

2006-01-09 Thread spam . noam
Hello,

Guido has decided, in python-dev, that in Py3K the id-based order
comparisons will be dropped. This means that, for example, "{} < []"
will raise a TypeError instead of the current behaviour, which is
returning a value which is, really, id({}) < id([]).

He also said that default equality comparison will continue to be
identity-based. This means that x == y will never raise an exception,
as is the situation now. Here's his reason:

> Let me construct a hypothetical example: suppose we represent a car
> and its parts as objects. Let's say each wheel is an object. Each
> wheel is unique and we don't have equivalency classes for them.
> However, it would be useful to construct sets of wheels (e.g. the set
> of wheels currently on my car that have never had a flat tire). Python
> sets use hashing just like dicts. The original hash() and __eq__
> implementation would work exactly right for this purpose, and it seems
> silly to have to add it to every object type that could possibly be
> used as a set member (especially since this means that if a third
> party library creates objects for you that don't implement __hash__
> you'd have a hard time of adding it).
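
A minimal sketch of the scenario in the quote, written for Python 3. The
Wheel class and the variable names are made up for illustration. A class
that defines neither __eq__ nor __hash__ inherits the identity-based
defaults, so its instances can be put straight into a set:

```python
class Wheel:
    """Hypothetical part object; defines neither __eq__ nor __hash__."""

    def __init__(self, position):
        self.position = position
        self.had_flat = False


wheels = [Wheel(p) for p in ("front-left", "front-right", "rear-left", "rear-right")]
wheels[2].had_flat = True

# Membership relies on the inherited identity-based hash and equality.
never_flat = {w for w in wheels if not w.had_flat}

print(len(never_flat))                     # 3
print(wheels[0] in never_flat)             # True
print(Wheel("front-left") in never_flat)   # False: a different object
```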

Now, I don't think it should be so. My reason is basically "explicit is
better than implicit" - I think that the == operator should be reserved
for value-based comparison, and raise an exception if the two objects
can't be meaningfully compared by value. If you want to check if two
objects are the same, you can always do "x is y". If you want to create
a set of objects based on their identity (that is, two different
objects with the same value are considered different elements), you
have two options:
1. Create another set type, which is identity-based - it doesn't care
about the hash value of objects, it just collects references to
objects. Instead of using set(), you would be able to use, say,
idset(), and everything would work as wanted.
2. Write a class like this:

    class Ref(object):
        def __init__(self, obj):
            self._obj = obj
        def __call__(self):
            return self._obj
        def __eq__(self, other):
            return isinstance(other, Ref) and self._obj is other._obj
        def __hash__(self):
            return id(self._obj) ^ 0xBEEF

and use it like this:

    st = set()
    st.add(Ref(wheel1))
    st.add(Ref(wheel2))
    if Ref(wheel1) in st:
        ...

Those solutions allow the one who writes the class to define a
value-based comparison operator, and allow the user of the class to
explicitly state if he wants value-based behaviour or identity-based
behaviour.
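
Option 1 could look roughly like the sketch below. No idset type exists
in the standard library; the IdSet name and its methods are assumptions
made for illustration. It stores members in a dict keyed by id(), so it
never calls the members' own __hash__ or __eq__:

```python
class IdSet:
    """Hypothetical identity-based set: membership is decided by object identity."""

    def __init__(self, items=()):
        # Map id(obj) -> obj. Holding the reference keeps the object alive,
        # so its id cannot be reused while it is a member.
        self._items = {}
        for item in items:
            self.add(item)

    def add(self, obj):
        self._items[id(obj)] = obj

    def discard(self, obj):
        self._items.pop(id(obj), None)

    def __contains__(self, obj):
        return id(obj) in self._items

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items.values())


a = [1, 2]
b = [1, 2]      # equal to a by value, but a distinct object
s = IdSet([a])
s.add(a)        # adding the same object again has no effect

print(a in s)   # True
print(b in s)   # False
print(len(s))   # 1
```

Because members are never hashed, unhashable objects such as lists can be
stored as well.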

A few more examples of why this explicit behaviour is good:

* Things like "Decimal(3.0) == 3.0" will make more sense (raise an
exception which explains that decimals should not be compared to
floats, instead of returning False).
* You won't be able to use objects as keys, expecting them to be
compared by value, and causing a bug when they don't. I recently wrote
a sort-of OCR program, which contains a mapping from a numarray array
of bits to a character (the array is the pixel-image of the char).
Everything seemed to work, but the program didn't recognize any
characters. I discovered that the reason was that arrays are hashed
according to their identity, which is a thing I had to guess. If
default == operator were not defined, I would simply get a TypeError
immediately.
* It is more forward compatible - when it is discovered that two types
can sensibly be compared, the comparison can be defined, without
changing an existing behaviour which doesn't raise an exception.
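
The dictionary pitfall in the second point can be reproduced with a small
stand-in class. Bitmap below is hypothetical and only imitates an array
type that has no value-based __eq__ or __hash__; it is not numarray:

```python
class Bitmap:
    """Hypothetical stand-in for an array type compared and hashed by identity."""

    def __init__(self, bits):
        self.bits = tuple(bits)


glyphs = {Bitmap([0, 1, 1, 0]): "A"}

# A freshly built Bitmap with identical bits is a different object,
# so the lookup silently misses instead of raising an error.
scanned = Bitmap([0, 1, 1, 0])
print(glyphs.get(scanned))                  # None

# Keying on an explicit value (here a tuple) gives the intended behaviour.
glyphs_by_value = {(0, 1, 1, 0): "A"}
print(glyphs_by_value.get(scanned.bits))    # A
```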

My question is, what reasons are left for leaving the current default
equality operator for Py3K, not counting backwards-compatibility?
(assume that you have idset and iddict, so explicitness' cost is only
two characters, in Guido's example)

Thanks,
Noam



Re: Why keep identity-based equality comparison?

2006-01-10 Thread spam . noam
> Can you provide a case where having a test for equality throw an
> exception is actually useful?

Yes. It will be useful because:
1. The bug of not finding a key in a dict because it was implicitly
hashed by identity and not by value, would not have happened.
2. You wouldn't get the weird 3.0 != Decimal("3.0") - you'll get an
exception which explains that these types aren't comparable.
3. If, in some time, you will decide that float and Decimal could be
compared, you will be able to implement that without being concerned
about backwards compatibility issues.

>>>> But there are certainly circumstances that I would prefer 1 == (1,2)
>>>> to throw an exception instead of simply turning up False.
>>> So what are they?
>
> Again - give us real use cases.

You may catch bugs earlier - say you have a multidimensional array, and
you forgot one index. Having comparison raise an exception because type
comparison is meaningless, instead of returning False silently, will
help you catch your problem earlier.

Noam



Re: Why keep identity-based equality comparison?

2006-01-10 Thread spam . noam
It seems to me that both Mike's and Fuzzyman's objections were that
sometimes you want the current behaviour, of saying that two objects
are equal if they are: 1. the same object or 2. have the same value
(when it's meaningful).

In both cases this can be accomplished pretty easily: You can do it
with a try...except block, and you can write the try...except block
inside the __contains__ method. It's really pretty simple:

    try:
        return a == b
    except TypeError:
        return a is b

Also, Mike said that you'll need an idlist object too - and I think
he's right and that there's nothing wrong with it.

Note that while you can easily define the current == behaviour using
the proposed behaviour, you can't define the proposed behaviour using
the current behaviour. Also note that using the current behaviour, you
can't easily treat objects that do define a meaningful value
comparison, by identity. Also note that in the cases that you do want
identity-based behaviour, defining it explicitly can result in a more
efficient program: an explicit identity-based dict doesn't have to call
any __hash__ and __eq__ protocols - it can compare the pointers
themselves. The same holds if you want to locate a specific object in a
list: use the proposed idlist and save yourself O(n) value-based
comparisons, which might be heavy.

Noam
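
The fallback described in this message can be sketched as follows. Strict
and LenientList are hypothetical names. Strict imitates the proposed
behaviour by raising TypeError from == when the other operand is of a
foreign type, which is not what Python's default == does:

```python
class Strict:
    """Hypothetical value type whose == raises TypeError for foreign types."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Strict):
            raise TypeError("cannot compare Strict with %s" % type(other).__name__)
        return self.value == other.value


def same(a, b):
    """Value equality where it is defined, identity otherwise."""
    try:
        return a == b
    except TypeError:
        return a is b


class LenientList(list):
    """List whose membership test falls back to identity when == raises."""

    def __contains__(self, item):
        return any(same(element, item) for element in self)


items = LenientList([Strict(1), "text"])
print(Strict(1) in items)   # True: equal by value
print("text" in items)      # True: the failed comparison with Strict(1) is skipped
print(Strict(2) in items)   # False
```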



Allowing zero-dimensional subscripts

2006-06-08 Thread spam . noam
Hello,

I discovered that I needed a small change to the Python grammar. I
would like to hear what you think about it.

In two lines:
Currently, the expression "x[]" is a syntax error.
I suggest that it will be evaluated like "x[()]", just as "x[a, b]" is
evaluated like "x[(a, b)]" right now.

In a few more words: Currently, an object can be subscripted by a few
elements, separated by commas. It is evaluated as if the object was
subscripted by a tuple containing those elements. I suggest that an
object will also be subscriptable with no elements at all, and it will
be evaluated as if the object was subscripted by an empty tuple.

It involves no backwards incompatibilities, since we are dealing with
the legalization of a currently illegal syntax.

It is consistent with the current syntax. Consider that these
identities currently hold:

x[i, j, k]  <-->  x[(i, j, k)]
x[i, j]  <-->  x[(i, j)]
x[i, ]  <-->  x[(i, )]
x[i]  <-->  x[(i)]

I suggest that the next identity will hold too:

x[]  <-->  x[()]
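
The existing identities can be checked with a small helper. KeyEcho is a
hypothetical class whose __getitem__ simply returns the key it receives:

```python
class KeyEcho:
    """Hypothetical helper: returns whatever key the subscript passes in."""

    def __getitem__(self, key):
        return key


x = KeyEcho()
i, j, k = 1, 2, 3

print(x[i, j, k])   # (1, 2, 3)
print(x[i, j])      # (1, 2)
print(x[i, ])       # (1,)
print(x[i])         # 1
print(x[()])        # ()
```

With the suggested change, x[] would produce the same result as the last
line.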

I need this in order to be able to refer to zero-dimensional arrays
nicely. In NumPy, you can have arrays with a different number of
dimensions. In order to refer to a value in a two-dimensional array,
you write a[i, j]. In order to refer to a value in a one-dimensional
array, you write a[i]. You can also have a zero-dimensional array,
which holds a single value (a scalar). To refer to its value, you
currently need to write a[()], which is unexpected - the user may not
even know that when he writes a[i, j] he constructs a tuple, so he
won't guess the a[()] syntax. If my suggestion is accepted, he will be
able to write a[] in order to refer to the value, as expected. It will
even work without changing the NumPy package at all!

In the normal use of NumPy, you usually don't encounter
zero-dimensional arrays. However, I'm designing another library for
managing multi-dimensional arrays of data. Its purpose is similar to
that of a spreadsheet - analyze data and preserve the relations between
a source of a calculation and its destination. In such an environment
you may have a lot of multi-dimensional arrays - for example, the sales
of several products over several time periods. But you may also have a
lot of zero-dimensional arrays, that is, single values - for example,
the income tax. I want the access to the zero-dimensional arrays to be
consistent with the access to the multi-dimensional arrays. Just using
the name of the zero-dimensional array to obtain its value isn't going
to work - the array and the value it contains have to be distinguished.

I have tried to change CPython to support it, and it was fairly easy.
You can see the diff against the current SVN here:
http://python.pastebin.com/768317
The test suite passes without changes, as expected. I didn't include
diffs of autogenerated files. I know almost nothing about the AST, so I
would appreciate it if someone who is familiar with the AST will check
to see if I did it right. It does seem to work, though.

Well, what do you think about this?

Have a good day,
Noam Raphael



Re: Allowing zero-dimensional subscripts

2006-06-08 Thread spam . noam
Hello,

Terry Reedy wrote:
> > In a few more words: Currently, an object can be subscripted by a few
> > elements, separated by commas. It is evaluated as if the object was
> > subscripted by a tuple containing those elements.
>
> It is not 'as if'.   'a,b' *is* a tuple and the object *is* subcripted by a
> tuple.
> Adding () around the non-empty tuple adds nothing except a bit of noise.
>

It doesn't necessarily matter, but technically, it is not "a tuple".
The "1, 2" in "x[1, 2]" isn't evaluated according to the same rules as
in "x = 1, 2" - for example, you can have "x[1, 2:3:4, ..., 5]", which
isn't a legal tuple outside of square brackets - in fact, it isn't even
legal inside parens: "x[(1, 2:3:4, ..., 5)]" isn't legal syntax.
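
A quick check of this point, written for Python 3. ShowKey and
is_valid_expression are hypothetical helpers. Note that in Python 3 a
bare "..." is accepted anywhere, so it is the slice notation that makes
the parenthesised forms illegal:

```python
class ShowKey:
    """Hypothetical helper: returns the key passed to the subscript."""

    def __getitem__(self, key):
        return key


def is_valid_expression(source):
    """Return True if the source text compiles as a single expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


x = ShowKey()
key = x[1, 2:3:4, ..., 5]
print(key)   # (1, slice(2, 3, 4), Ellipsis, 5)

print(is_valid_expression("x[1, 2:3:4, ..., 5]"))     # True
print(is_valid_expression("(1, 2:3:4, ..., 5)"))      # False
print(is_valid_expression("x[(1, 2:3:4, ..., 5)]"))   # False
```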

Noam



Re: Allowing zero-dimensional subscripts

2006-06-08 Thread spam . noam
Hello,

Terry Reedy wrote:
> So I do not see any point or usefulness in saying that a tuple subcript is
> not what it is.

I know that a tuple is *constructed*. The question is whether this is,
conceptually, the same feature that allows you to omit the parentheses
of a tuple in some cases. If we see this as the same feature, it's
reasonable that "nothing" won't be seen as an empty tuple, just like "a
= " doesn't mean "a = ()".

However, if we see this as a different feature, which allows
multidimensional subscript by constructing a tuple behind the scenes,
constructing an empty tuple for x[] seems very reasonable to me. Since
in some cases you can't have the parentheses at all, I think that x[]
makes sense.

Noam



Re: Allowing zero-dimensional subscripts

2006-06-09 Thread spam . noam
Hello,

Sybren Stuvel wrote:
> I think it's ugly to begin with. In math, one would write simply 'x'
> to denote an unsubscribed (ubsubscripted?) 'x'. And another point, why
> would one call __getitem__ without an item to call?

I think that in this case, mathematical notation is different from
python concepts.

If I create a zero-dimensional array, with the value 5, like this:
>>> a = array(5)

I refer to the array object as "a", and to the int it stores as "a[]".

For example, I can change the value it holds by writing
>>> a[] = 8
Writing "a = 8" would have a completely different meaning - create a
new name, a, pointing at a new int, 8.
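
The difference can be shown with the syntax that is legal today. Cell is
a hypothetical stand-in for a zero-dimensional array, not NumPy's
implementation; a[()] is written where the proposal would allow a[]:

```python
class Cell:
    """Hypothetical zero-dimensional container holding a single value."""

    def __init__(self, value):
        self._value = value

    def __getitem__(self, key):
        if key != ():
            raise IndexError("a zero-dimensional cell takes no indices")
        return self._value

    def __setitem__(self, key, value):
        if key != ():
            raise IndexError("a zero-dimensional cell takes no indices")
        self._value = value


a = Cell(5)
alias = a           # a second name for the same Cell

a[()] = 8           # changes the value stored inside the Cell
print(alias[()])    # 8

a = 8               # rebinds the name a; the Cell itself is untouched
print(a)            # 8
print(alias[()])    # 8
```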

Noam



Re: Allowing zero-dimensional subscripts

2006-06-09 Thread spam . noam
Hello,

Fredrik Lundh wrote:
> (but should it really result in an empty tuple?  wouldn't None be a bit
> more Pythonic?)

I don't think it would. First of all, x[()] already has the desired
meaning in numpy. But I think it's the right thing - if you think of
what's inside the brackets as a list of subscripts, one for each
dimension, which is translated to a call to __getitem__ or __setitem__
with a tuple of objects representing the subscripts, then an empty
tuple is what you want to represent no subscripts.

Of course, one item without a comma doesn't make a tuple, but I see
this as the special case - just like parentheses with any number of
commas are interpreted as tuples, except for parentheses with one item
without a comma.

(By the way, thanks for the tips for posting a PEP - I'll try to do it
quickly.)

Noam



ANN: byteplay - a bytecode assembler/disassembler

2006-08-14 Thread spam . noam
Hello,

I would like to present a module that I have written, called byteplay.
It's a Python bytecode assembler/disassembler, which means that you can
take Python code objects, disassemble them into equivalent objects which
are easy to play with, play with them, and then assemble a new,
modified, code object.

I think it's pretty useful if you'd like to learn more about Python's
bytecode - playing with things and seeing what happens is a nice way to
learn.

Here's a quick example. We can define this stupid function:

>>> def f(a, b):
...     print (a, b)
...
>>> f(3, 5)
(3, 5)

We can convert it to an equivalent object, and see how it stores the
byte code:

>>> from byteplay import *
>>> c = Code.from_code(f.func_code)
>>> from pprint import pprint; pprint(c.code)
[(SetLineno, 2),
 (LOAD_FAST, 'a'),
 (LOAD_FAST, 'b'),
 (BUILD_TUPLE, 2),
 (PRINT_ITEM, None),
 (PRINT_NEWLINE, None),
 (LOAD_CONST, None),
 (RETURN_VALUE, None)]

We can change the bytecode easily, and see what happens. Let's insert a
ROT_TWO opcode, that will swap the two arguments:

>>> c.code[3:3] = [(ROT_TWO, None)]
>>> f.func_code = c.to_code()
>>> f(3, 5)
(5, 3)
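
The session above uses Python 2 names such as func_code and PRINT_ITEM.
For read-only inspection on Python 3, the standard library's dis module
can list a function's instructions. This is not byteplay and it cannot
reassemble code; it is only a sketch of the disassembly half, and the
exact opcode names vary between Python versions:

```python
import dis


def f(a, b):
    return (a, b)


# Each instruction exposes its opcode name and a readable form of its argument.
instructions = list(dis.get_instructions(f))
opnames = [instr.opname for instr in instructions]

for instr in instructions:
    print(instr.opname, instr.argrepr)
```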

You can download byteplay from
http://byteplay.googlecode.com/svn/trunk/byteplay.py and you can read
(and edit) the documentation at http://wiki.python.org/moin/ByteplayDoc
. I will be happy to hear if you find it useful, or if you have any
comments or ideas.

Have a good day,
Noam Raphael



Re: Allowing zero-dimensional subscripts

2006-06-09 Thread spam . noam
Hello,

Following Fredrik's suggestion, I wrote a pre-PEP. It's available on
the wiki, at http://wiki.python.org/moin/EmptySubscriptListPEP and I
also copied it to this message.

Have a good day,
Noam


PEP: XXX
Title: Allow Empty Subscript List Without Parentheses
Version: $Revision$
Last-Modified: $Date$
Author: Noam Raphael <[EMAIL PROTECTED]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 09-Jun-2006
Python-Version: 2.5?
Post-History: 30-Aug-2002

Abstract
========

This PEP proposes allowing the use of an empty subscript list, for
example ``x[]``, which is currently a syntax error. It is suggested
that in such a case, an empty tuple will be passed as an argument to
the __getitem__ and __setitem__ methods. This is consistent with the
current behaviour of passing a tuple with n elements to those methods
when a subscript list of length n is used, if it includes a comma.


Specification
=============

The Python grammar specifies that inside the square brackets trailing
an expression, a list of "subscripts", separated by commas, should be
given. If the list consists of a single subscript without a trailing
comma, a single object (an ellipsis, a slice or any other object) is
passed to the resulting __getitem__ or __setitem__ call. If the list
consists of many subscripts, or of a single subscript with a trailing
comma, a tuple is passed to the resulting __getitem__ or __setitem__
call, with an item for each subscript.

Here is the formal definition of the grammar:

::

   trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
   subscriptlist: subscript (',' subscript)* [',']
   subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop]
   sliceop: ':' [test]

This PEP proposes allowing an empty subscript list, with nothing
inside the square brackets. It will result in passing an empty tuple
to the resulting __getitem__ or __setitem__ call.

The change in the grammar is to make "subscriptlist" in the first
quoted line optional:

::

   trailer: '(' [arglist] ')' | '[' [subscriptlist] ']' | '.' NAME


Motivation
==========

This suggestion allows you to refer to zero-dimensional arrays
elegantly. In NumPy, you can have arrays with a different number of
dimensions. In
order to refer to a value in a two-dimensional array, you write
``a[i, j]``. In order to refer to a value in a one-dimensional array,
you write ``a[i]``. You can also have a zero-dimensional array, which
holds a single value (a scalar). To refer to its value, you currently
need to write ``a[()]``, which is unexpected - the user may not even
know that when he writes ``a[i, j]`` he constructs a tuple, so he
won't guess the ``a[()]`` syntax. If the suggestion is accepted, the
user will be able to write ``a[]`` in order to refer to the value, as
expected. It will even work without changing the NumPy package at all!

In the normal use of NumPy, you usually don't encounter
zero-dimensional arrays. However, the author of this PEP is designing
another library for managing multi-dimensional arrays of data. Its
purpose is similar to that of a spreadsheet - to analyze data and
preserve the relations between a source of a calculation and its
destination. In such an environment you may have many
multi-dimensional arrays - for example, the sales of several products
over several time periods. But you may also have several
zero-dimensional arrays, that is, single values - for example, the
income tax rate. It is desired that the access to the zero-dimensional
arrays will be consistent with the access to the multi-dimensional
arrays. Just using the name of the zero-dimensional array to obtain
its value isn't going to work - the array and the value it contains
have to be distinguished.


Rationale
=========

Passing an empty tuple to the __getitem__ or __setitem__ call was
chosen because it is consistent with passing a tuple of n elements
when a subscript list of n elements is used. Also, it will make NumPy
and similar packages work as expected for zero-dimensional arrays
without any changes.

Another hint for consistency: Currently, these equivalences hold:

::

   x[i, j, k]  <-->  x[(i, j, k)]
   x[i, j]     <-->  x[(i, j)]
   x[i, ]      <-->  x[(i, )]
   x[i]        <-->  x[(i)]

If this PEP is accepted, another equivalence will hold:

::

   x[]         <-->  x[()]


Backwards Compatibility
=======================

This change is fully backwards compatible, since it only assigns a
meaning to a previously illegal syntax.


Reference Implementation
========================

Available as SF Patch no. 1503556.
(and also in http://python.pastebin.com/768317 )

It passes the Python test suite, but currently doesn't provide
additional tests or documentation.


Copyright
=========

This document has been placed in the public domain.



Re: Allowing zero-dimensional subscripts

2006-06-10 Thread spam . noam
George Sakkis wrote:
> [EMAIL PROTECTED] wrote:
>
> > However, I'm designing another library for
> > managing multi-dimensional arrays of data. Its purpose is similiar to
> > that of a spreadsheet - analyze data and preserve the relations between
> > a source of a calculation and its destination.
>
> Sounds interesting. Will it be related at all to OLAP or the
> Multi-Dimensional eXpressions language
> (http://msdn2.microsoft.com/en-us/library/ms145506.aspx) ?
>
Thanks for the reference! I didn't know about any of these. It will
probably be interesting to learn from them. From a brief look at OLAP
in Wikipedia, it may have similarities to OLAP. I don't think it will
be related to Microsoft's language, because the language will simply be
Python, hopefully making it very easy to do whatever you like with the
data.

I posted to python-dev a message that (hopefully) better explains my
use for x[]. Here it is - I think that it also gives an idea of what it
will look like.


I'm talking about something similar to a spreadsheet in that it saves
data, calculation results, and the way to produce the results.
However, it is not similar to a spreadsheet in that the data isn't
saved in an infinite two-dimensional array with numerical indices.
Instead, the data is saved in a few "tables", each storing a different
kind of data. The tables may be with any desired number of dimensions,
and are indexed by meaningful indices, instead of by natural numbers.

For example, you may have a table called sales_data. It will store the
sales data in years from set([2003, 2004, 2005]), for car models from
set(['Subaru', 'Toyota', 'Ford']), for cities from set(['Jerusalem',
'Tel Aviv', 'Haifa']). To refer to the sales of Ford in Haifa in 2004,
you will simply write: sales_data[2004, 'Ford', 'Haifa']. If the table
is a source of data (that is, not calculated), you will be able to set
values by writing: sales_data[2004, 'Ford', 'Haifa'] = 1500.

Tables may be computed tables. For example, you may have a table which
holds for each year the total sales in that year, with the income tax
subtracted. It may be defined by a function like this:

lambda year: sum(sales_data[year, model, city] for model in models for
city in cities) / (1 + income_tax_rate)

Now, like in a spreadsheet, the function is kept, so that if you
change the data, the result will be automatically recalculated. So, if
you discovered a mistake in your data, you will be able to write:

sales_data[2004, 'Ford', 'Haifa'] = 2000

and total_sales[2004] will be automatically recalculated.

Now, note that the total_sales table depends also on the
income_tax_rate. This is a variable, just like sales_data. Unlike
sales_data, it's a single value. We should be able to change it, with
the result of all the cells of the total_sales table recalculated. But
how will we do it? We can write

income_tax_rate = 0.18

but it will have a completely different meaning. The way to make the
income_tax_rate changeable is to think of it as a 0-dimensional table.
It makes sense: sales_data depends on 3 parameters (year, model,
city), total_sales depends on 1 parameter (year), and income_tax_rate
depends on 0 parameters. That's the only difference. So, thinking of
it like this, we will simply write:

income_tax_rate[] = 0.18

Now the system can know that the income tax rate has changed, and
recalculate what's needed. We will also have to change the previous
function a tiny bit, to:

lambda year: sum(sales_data[year, model, city] for model in models for
city in cities) / (1 + income_tax_rate[])

But it's fine - it just makes it clearer that income_tax_rate[] is a
part of the model that may change its value.
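
A rough sketch of how such tables might behave, using the a[()] spelling
that is legal today. Table and ComputedTable are hypothetical names, not
the library's actual design. To keep the sketch short, a computed table
simply re-evaluates its function on every access instead of tracking
dependencies, and the rate 0.25 is chosen only to keep the arithmetic
exact:

```python
class Table:
    """Hypothetical source table: values are stored under tuple keys."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _as_tuple(key):
        # x[a] passes a bare object, x[a, b] and x[()] pass tuples.
        return key if isinstance(key, tuple) else (key,)

    def __getitem__(self, key):
        return self._data[self._as_tuple(key)]

    def __setitem__(self, key, value):
        self._data[self._as_tuple(key)] = value


class ComputedTable:
    """Hypothetical computed table: re-evaluates its function on every access."""

    def __init__(self, func):
        self._func = func

    def __getitem__(self, key):
        args = key if isinstance(key, tuple) else (key,)
        return self._func(*args)


models = ["Ford", "Toyota"]
cities = ["Haifa"]

sales_data = Table()
sales_data[2004, "Ford", "Haifa"] = 1500
sales_data[2004, "Toyota", "Haifa"] = 500

income_tax_rate = Table()
income_tax_rate[()] = 0.25      # zero-dimensional: the key is the empty tuple

total_sales = ComputedTable(
    lambda year: sum(sales_data[year, model, city]
                     for model in models for city in cities)
    / (1 + income_tax_rate[()])
)

print(total_sales[2004])        # 1600.0
income_tax_rate[()] = 0.0       # the change is seen on the next access
print(total_sales[2004])        # 2000.0
```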


Have a good day,
Noam
