On Tue, Nov 24, 2015 at 10:53 AM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Tue, Nov 24, 2015 at 10:32 AM, Antoon Pardon
> <antoon.par...@rece.vub.ac.be> wrote:
>> Op 24-11-15 om 17:56 schreef Ian Kelly:
>>
>>>
>>>> So on what grounds would you argue that () is not a literal.
>>>
>>> This enumerates exactly what literals are in Python:
>>>
>>> https://docs.python.org/3/reference/lexical_analysis.html#literals
>>>
>>> I think it's a rather pedantic point, though. How are nuances of the
>>> grammar at all related to user expectations?
>>>
>>
>> I think that enumaration is too limited. The section starts with:
>>
>>    Literals are notations for constant values of some built-in types.
>>
>> () satisfies that definition, which is confirmed by the byte code
>> produced for it.
>
> Literals are a type of lexical token. All of the literals shown in
> that section are, indeed, tokens. Now I would point you to the grammar
> specification:
>
> https://docs.python.org/3/reference/grammar.html
>
> And specifically the "atom" rule, which defines both list displays and
> list comprehensions (as well as literals) as being atoms.
> Specifically, it parses () as the token '(', followed by an optional
> yield_expr or testlist_comp, followed by the token ')'. In no way is
> that a single token, nor therefore a literal.

In case reading the grammar doesn't convince, we can also get this
result from playing with the language:

>>> import tokenize
>>> from io import StringIO
>>> list(tokenize.generate_tokens(StringIO('()').readline))
[TokenInfo(type=52 (OP), string='(', start=(1, 0), end=(1, 1),
line='()'), TokenInfo(type=52 (OP), string=')', start=(1, 1), end=(1,
2), line='()'), TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0),
end=(2, 0), line='')]

Two separate tokens of the OP type.

>>> list(tokenize.generate_tokens(StringIO('42').readline))
[TokenInfo(type=2 (NUMBER), string='42', start=(1, 0), end=(1, 2),
line='42'), TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0),
end=(2, 0), line='')]

One token, of a literal type.

>>> list(tokenize.generate_tokens(StringIO('(1,2,3)').readline))
[TokenInfo(type=52 (OP), string='(', start=(1, 0), end=(1, 1),
line='(1,2,3)'), TokenInfo(type=2 (NUMBER), string='1', start=(1, 1),
end=(1, 2), line='(1,2,3)'), TokenInfo(type=52 (OP), string=',',
start=(1, 2), end=(1, 3), line='(1,2,3)'), TokenInfo(type=2 (NUMBER),
string='2', start=(1, 3), end=(1, 4), line='(1,2,3)'),
TokenInfo(type=52 (OP), string=',', start=(1, 4), end=(1, 5),
line='(1,2,3)'), TokenInfo(type=2 (NUMBER), string='3', start=(1, 5),
end=(1, 6), line='(1,2,3)'), TokenInfo(type=52 (OP), string=')',
start=(1, 6), end=(1, 7), line='(1,2,3)'), TokenInfo(type=0
(ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]

Two tokens for the parentheses, plus three literal tokens for the
ints, plus two more tokens for the separating commas. Definitely not a
literal.

Credit to Steven for using this approach to make a similar point about
literals in a previous thread.
-- 
https://mail.python.org/mailman/listinfo/python-list

Reply via email to