On 5/30/2011 5:08 AM, Laurent Claessens wrote:
On 30/05/2011 11:02, Terry Reedy wrote:
On 5/30/2011 3:38 AM, Laurent wrote:

Cool. I was thinking that "5" was the name, but
>>> 5.__add__(6)
File "<stdin>", line 1
5.__add__(6)


Try 5 .__add__(6)

What is the rationale for adding a space between "5" and ".__add__"?
Why does it work?

Others have given you specific answers, here is the bigger picture.

For decades, text interpreter/compilers have generally run in two phases:
1. a lexer/tokenizer that breaks the stream of characters into tokens;
2. a parser that recognizes higher-level syntax and takes appropriate action.
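Both phases can be observed from Python itself. The following is a minimal sketch, assuming the standard-library tokenize and ast modules as stand-ins for the interpreter's own lexer and parser; the helper names tokens and evaluate are made up for this illustration.

```python
import ast
import io
import tokenize


def tokens(source):
    """Phase 1: return (type name, text) pairs for the significant tokens."""
    keep = (tokenize.NUMBER, tokenize.NAME, tokenize.OP)
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in keep  # drop NEWLINE and ENDMARKER bookkeeping tokens
    ]


def evaluate(source):
    """Phase 2: parse the expression into a syntax tree, then run it."""
    tree = ast.parse(source, mode="eval")
    return eval(compile(tree, "<example>", "eval"))


for src in ("5 .__add__(6)", "5..__add__(6)"):
    print(src, tokens(src), evaluate(src))
```

With the space, the first token is the integer 5 followed by a separate '.' operator; with two dots, the first token is the float 5. and the second dot is the attribute operator.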

Lexers are typically based on regular grammars and implemented as very simple and fast deterministic finite-state automata. In outline (leaving out error handling and end-of-stream handling), something like:

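def lexer(stream, lookup, initial_state):
  # lookup maps (state, char) to (next_state, out)
  state = initial_state
  buffer = []
  for char in stream:
    state, out = lookup[state, char]
    if out:
      # buffer holds a completed token; output() is assumed to convert
      # the list of chars to the token expected by the parser
      yield output(buffer)
      buffer = []  # clear buffer for the next token
    buffer.append(char)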

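There is no backup and no lookahead (except for the fact that output excludes the current char). For Python, lookup[start,'5'] == (in_number, False), and lookup[in_number,'.'] == (in_float, False). So in 5.__add__(6) the '.' is absorbed into the float 5., leaving no attribute dot in front of __add__.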

>>> 5..__add__(6)
11.0

works because lookup[in_float,'.'] == (start, True): the buffer now contains a completed float ready to output, and the second '.' signals the start of a new token.
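To make the table-driven idea concrete, here is a small runnable sketch of such a lexer. The state names, the character classes, and the whole transition table are assumptions invented for this illustration, not CPython's real tables. It adds an in_op state where the description above reuses start, skips spaces, and flushes the buffer at end of stream.

```python
import string

START, NUM, FLOAT, NAME, OP = "start", "in_number", "in_float", "in_name", "in_op"


def char_class(ch):
    """Map a character to one of the toy lexer's character classes."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    return {".": "dot", "(": "paren", ")": "paren", " ": "space"}[ch]


# (next_state, out) per state and character class; out=True means the
# buffer already holds a completed token that must be emitted first.
CLASS_TABLE = {
    START: {"digit": (NUM, False), "letter": (NAME, False), "dot": (OP, False),
            "paren": (OP, False), "space": (START, False)},
    NUM: {"digit": (NUM, False), "letter": (NAME, True), "dot": (FLOAT, False),
          "paren": (OP, True), "space": (START, True)},
    FLOAT: {"digit": (FLOAT, False), "letter": (NAME, True), "dot": (OP, True),
            "paren": (OP, True), "space": (START, True)},
    NAME: {"digit": (NAME, False), "letter": (NAME, False), "dot": (OP, True),
           "paren": (OP, True), "space": (START, True)},
    OP: {"digit": (NUM, True), "letter": (NAME, True), "dot": (OP, True),
         "paren": (OP, True), "space": (START, True)},
}

ALPHABET = string.digits + string.ascii_letters + "_.() "

# Flatten to the lookup[state, char] form used in the outline above.
LOOKUP = {
    (state, ch): CLASS_TABLE[state][char_class(ch)]
    for state in CLASS_TABLE
    for ch in ALPHABET
}


def lexer(stream, lookup, initial_state):
    state = initial_state
    buffer = []
    for char in stream:
        state, out = lookup[state, char]
        if out:
            yield "".join(buffer)  # emit the completed token
            buffer = []
        if char != " ":            # spaces separate tokens but are not kept
            buffer.append(char)
    if buffer:                     # end of stream: flush the last token
        yield "".join(buffer)


for src in ("5.__add__(6)", "5 .__add__(6)", "5..__add__(6)"):
    print(src, list(lexer(src, LOOKUP, START)))
```

In this sketch, 5.__add__(6) yields '5.' directly followed by '__add__', with no dot token between them for the parser to treat as attribute access. The other two inputs each produce a separate '.' token after the number.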

I believe we read natural language text similarly, breaking it into words and punctuation. I believe the ability to read programs depends on being able to adjust the internal lexer a bit. Python is easier to read than some other algorithmic languages because it tends to have at most one punctuation-like symbol unit between words, as is the case in the code above.

--
Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list
