Re: Raw string statement (proposal)

2018-05-27 Thread Dan Stromberg
On Thu, May 24, 2018 at 9:34 PM, Mikhail V  wrote:

> Hi.
> I've put some thoughts together, and
> need some feedback on this proposal.
> Main question is:  Is it convincing?
> Is there any flaw?
> My own opinion - there IS something to chase.
> Still the justification for such syntax is hard.
>
This is a strange syntax that is quite unnecessary; just use triple-quoted
strings.
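A minimal sketch of that suggestion, assuming the pasted text contains neither the chosen triple-quote delimiter nor a trailing backslash. The sample text and the name `snippet` are made up for illustration:

```python
import textwrap

# Raw (r-prefixed) triple-quoted string: backslashes and braces are kept
# exactly as typed, so nothing in the pasted text is interpreted.
snippet = textwrap.dedent(r'''
    path = "C:\new\table"
    pattern = "\d+\s*{65290}"
''').strip("\n")

# dedent() removes the common leading indentation, so the literal can
# follow the indentation of the surrounding code.
print(snippet)
```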

Please, please, please don't add it.

I believe Python's core language should be kept small.  Big libraries are
fine - important even.  But the core language should be easy to learn but
powerful.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Raw string statement (proposal)

2018-05-26 Thread Mikhail V
On Sat, May 26, 2018 at 10:21 PM, Chris Angelico  wrote:

>
> I'm done. Argue with brick walls for the rest of eternity if you like.

I see you like me, but I can reciprocate your feelings.


>
> ChrisA


Re: Raw string statement (proposal)

2018-05-26 Thread Chris Angelico
On Sun, May 27, 2018 at 5:13 AM, Mikhail V  wrote:
> On Sat, May 26, 2018 at 7:10 PM, Steven D'Aprano wrote:

>> temp = >>| Mikhail's syntax for {65290} is {65290}
>> Can you see the problem yet? How does your collapse function f()
>> distinguish between the escape code {65290} and the literal string
>> {65290}?
>
> Look, there is no problem here. You can choose
> whatever YOU want for writing "{" and "}", e.g. I'd pick {<} and {>}.
>
> temp = >>| Mikhail's syntax for {65290} is {<}65290{>}

Whatever notation you pick, you need some way to represent it
literally. Don't you see? All you're doing is dodging the issue by
making your literal syntax more and more complicated.

> I am just showing you that the syntax not only lets you input TEXT
> without any manipulation, but also lets you solve any task related
> to custom presentation - in this case you want to work with
> ordinals, for example. And you may end up with a scheme which
> reads way better than your cryptic example.

How about a simple notation that bases everything on a single typable
character like REVERSE SOLIDUS?

> I'll also ask you a question -
> Which such non-text codes might you need in order to
> generate C code? Python code? HTML code?

What do you mean?

>> And we've gone from a single string literal, which is an expression that
>> can be used anywhere, to a statement that cannot be used in expressions.
>
> Ok show me your suggestion for a raw string definition for expressions
> (no modifications to source text and no usage of invisible control chars).

Ahh but "no modifications to source text" was YOUR rule, not ours.
Most of us are perfectly happy for string literals to be spelled with
quotes around them and a straight-forward system of escapes.

Or if we want NO modifications whatsoever, then we use another very
standard way of delimiting the contents: put it in a separate file and
then read that in. Oh wow, look at that, we can just put what we like
and it doesn't break anything!
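A minimal sketch of the separate-file approach, assuming UTF-8 text. The file name `payload.txt` and the sample content are hypothetical, and the file is written by the script itself only so that the example is self-contained:

```python
from pathlib import Path
import tempfile

# Text that would be awkward to spell as a literal: backslashes, braces
# and both kinds of triple quotes.
original = 'anything goes: \\n {65290} """ \'\'\' >>>\n'

with tempfile.TemporaryDirectory() as tmp:
    payload = Path(tmp) / "payload.txt"  # hypothetical file name
    # Bytes are used on both sides so that no newline translation occurs.
    payload.write_bytes(original.encode("utf-8"))
    loaded = payload.read_bytes().decode("utf-8")

print(loaded == original)
```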

I'm done. Argue with brick walls for the rest of eternity if you like.

ChrisA


Re: Raw string statement (proposal)

2018-05-26 Thread Mikhail V
On Sat, May 26, 2018 at 7:10 PM, Steven D'Aprano
 wrote:
> On Sat, 26 May 2018 18:22:15 +0300, Mikhail V wrote:
>
>>> Here is a string assigned to name `s` using Python's current syntax:
>>>
>>> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>>>
>>> How do you represent that assignment using your syntax?
>>
>> Hope it's not mandatory to decipher your random example. If for example
>> I'd want to work with a lot of non-presentable characters, I'd use a
>> more human-oriented notation than this ^. And that is exactly where raw
>> strings are needed.
>>
>> So I'd make a readable notation where I can present a character by its
>> ordinal enclosed in some tag for example {10}. Then just write a
>> function which collapses those depending on further needs:
>>
>> data >>| abc{10}def
>> data = f(data)
>>
>> And the notation itself can be chosen depending on my needs. Hope you
>> get the point.
>
> Loud and clear: your syntax has no way of representing the string s.

It is there, just 5 lines above - you did not notice.


> temp = >>| Mikhail's syntax for {65290} is {65290}
> Can you see the problem yet? How does your collapse function f()
> distinguish between the escape code {65290} and the literal string
> {65290}?

Look, there is no problem here. You can choose
whatever YOU want for writing "{" and "}", e.g. I'd pick {<} and {>}.

temp = >>| Mikhail's syntax for {65290} is {<}65290{>}


I am just showing you that the syntax not only lets you input TEXT
without any manipulation, but also lets you solve any task related
to custom presentation - in this case you want to work with
ordinals, for example. And you may end up with a scheme which
reads way better than your cryptic example.
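A minimal sketch of such a collapse function, assuming the scheme described here: {N} stands for the character with decimal ordinal N, and {<} and {>} stand for literal braces. The function name `collapse` and the regular expression are illustrative, not taken from the proposal document:

```python
import re

# One pattern covers both kinds of tag: {digits} and the two brace escapes.
_TAG = re.compile(r"\{(\d+|<|>)\}")


def collapse(text: str) -> str:
    """Expand {N}, {<} and {>} tags in a single left-to-right pass."""
    def expand(match: re.Match) -> str:
        body = match.group(1)
        if body == "<":
            return "{"
        if body == ">":
            return "}"
        return chr(int(body))  # decimal ordinal to character

    return _TAG.sub(expand, text)


raw = "Mikhail's syntax for {65290} is {<}65290{>}"
result = collapse(raw)
print(result)
```

Because the substitution is a single pass, the braces produced by {<} and {>} are not scanned again. The tags themselves still need a spelling of their own: under this sketch a literal {<} has to be written as {<}<{>}.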

This motivation is covered in the documentation, but maybe
not well enough? Anyway, you're welcome to make suggestions for
the documentation.

I'll also ask you a question -
Which such non-text codes might you need in order to
generate C code? Python code? HTML code?


>
> And we've gone from a single string literal, which is an expression that
> can be used anywhere, to a statement that cannot be used in expressions.

Ok show me your suggestion for a raw string definition for expressions
(no modifications to source text and no usage of invisible control chars).

> I don't know what TQS is supposed to mean.
Triple Quoted String



> s = >>>
> this is some text
> x = 'a'
> y = 'b'
> t = 'c'
>
>
> Am I close?

Yes very close, just shift it with a char of your choice, e.g. 1 space:
s >>> !" "
 this is some text
 x = 'a'
 y = 'b'
t = 'c'

or use a tag of your choice (no shifting needed in this case):

s >>> ?"#your favorite closing tag"
this is some text
x = 'a'
y = 'b'
#your favorite closing tag
t = 'c'

There can be other suggestions for symbols, etc.


Re: Raw string statement (proposal)

2018-05-26 Thread Steven D'Aprano
On Sat, 26 May 2018 18:22:15 +0300, Mikhail V wrote:

>> Here is a string assigned to name `s` using Python's current syntax:
>>
>> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>>
>> How do you represent that assignment using your syntax?
> 
> Hope it's not mandatory to decipher your random example. If for example
> I'd want to work with a lot of non-presentable characters, I'd use a
> more human-oriented notation than this ^. And that is exactly where raw
> strings are needed.
> 
> So I'd make a readable notation where I can present a character by its
> ordinal enclosed in some tag for example {10}. Then just write a
> function which collapses those depending on further needs:
> 
> data >>| abc{10}def
> data = f(data)
> 
> And the notation itself can be chosen depending on my needs. Hope you
> get the point.

Loud and clear: your syntax has no way of representing the string s.

Okay, let's take a simpler example. You want to use {65290} to represent 
the Unicode code point \uFF0A. Fair enough. Everyone else in the world 
uses hexadecimal for this purpose, but whatever.

But isn't {65290} just another way of spelling an escape code? Isn't your 
syntax supposed to eliminate the need for escape codes?

So if I had:

result = function("Mikhail's syntax for \uFF0A is {65290}")

your syntax would become:

temp = >>| Mikhail's syntax for {65290} is {65290}
result = function(f(temp))

where f is your (as yet non-existent?) function for collapsing the {} 
codes to characters.

Can you see the problem yet? How does your collapse function f() 
distinguish between the escape code {65290} and the literal string 
{65290}?

And we've gone from a single string literal, which is an expression that 
can be used anywhere, to a statement that cannot be used in expressions.
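The ambiguity can be shown with a minimal sketch, assuming the collapse function simply replaces every {N} with the character whose decimal ordinal is N. The name `naive_collapse` is hypothetical, since no such function has been posted:

```python
import re


def naive_collapse(text: str) -> str:
    """Replace every {N} with chr(N); there is no way to opt out."""
    return re.sub(r"\{(\d+)\}", lambda m: chr(int(m.group(1))), text)


# The string as written with Python's current escape syntax.
current = "Mikhail's syntax for \uFF0A is {65290}"

# The same text in the tag notation: both occurrences look identical.
proposed = naive_collapse("Mikhail's syntax for {65290} is {65290}")

print(current == proposed)
```

Both tags are collapsed, so the literal text {65290} at the end of the string is lost unless the notation grows an escape for the tag itself.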



>> And another example:
>>
>> s = """this is some text
>> x = 'a'
>> y = 'b'"""
>> t = 'c'
>>
>> How do we write that piece of code using your syntax?
> 
> That's too easy - maybe you can try it yourself? I am not trying to
> imply anything, but I don't see how this example can cause problems -
> just put the TQS in a block.

I don't know what TQS is supposed to mean.

I'll give it a try, and see if I understand your syntax. The idea is to 
avoid needing to put BEGIN END delimiters such as quotation marks around 
the string, right?


s = >>>
this is some text
x = 'a'
y = 'b'
t = 'c'


Am I close?

How is the interpreter supposed to know that the final line is not part 
of the string, but a line of actual Python code?

[...]
>> Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.
> 
> I know. So you've decided to share that you also understood this? Good,
> I'm glad that you understand :-)

I believe that the whole point of your proposal is to avoid needing to
change the text in any way when you paste it into your source code. If
that's not the point, then what is the point?


-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Raw string statement (proposal)

2018-05-26 Thread Mikhail V
On Sat, May 26, 2018 at 10:55 AM, Steven D'Aprano
 wrote:
> On Sat, 26 May 2018 08:09:51 +0300, Mikhail V wrote:
>
>> On Fri, May 25, 2018 at 1:15 PM, bartc  wrote:
> [...]
>>> One problem here is how to deal with embedded non-printable characters:
>>> CR, LF and TAB might become part of the normal source text, but how
>>> about anything else? Or would you only allow text that might appear in
>>> a text file where those characters would also cause issues?
>>
>> This syntax does not imply anything about text. From the editor's POV
>> it's just the same as it is now - you can insert anything in a .py file.
>> So it does not add new cases to current state of affairs in this regard.
>> But maybe I don't completely understand your question.
>
> Here is a string assigned to name `s` using Python's current syntax:
>
> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>
> How do you represent that assignment using your syntax?

Hope it's not mandatory to decipher your random example.
If for example I'd want to work with a lot of non-presentable
characters, I'd use a more human-oriented notation than this ^.
And that is exactly where raw strings are needed.

So I'd make a readable notation where I can present a character
by its ordinal enclosed in some tag for example {10}. Then just
write a function which collapses those depending on further needs:

data >>| abc{10}def
data = f(data)

And the notation itself can be chosen depending on my needs.
Hope you get the point.

>
>
> And another example:
>
> s = """this is some text
> x = 'a'
> y = 'b'"""
> t = 'c'
>
> How do we write that piece of code using your syntax?

That's too easy - maybe you can try it yourself?
I am not trying to imply anything, but I don't see how
this example can cause problems - just put the TQS in a block.


>>> Would it then be possible to create a source file PROG2.PY which
>>> contains PROG1.PY as a raw string? That is, without changing the text
>>> from PROG1.PY at all.
>>
>> Should be fine, with only difference that you must indent the PROG1.PY
>> if it will be placed inside an indented suite.
>
> Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.

I know. So you've decided to share that you also understood this?
Good, I'm glad that you understand :-)


Re: Raw string statement (proposal)

2018-05-26 Thread Steven D'Aprano
On Sat, 26 May 2018 08:09:51 +0300, Mikhail V wrote:

> On Fri, May 25, 2018 at 1:15 PM, bartc  wrote:
[...]
>> One problem here is how to deal with embedded non-printable characters:
>> CR, LF and TAB might become part of the normal source text, but how
>> about anything else? Or would you only allow text that might appear in
>> a text file where those characters would also cause issues?
> 
> This syntax does not imply anything about text. From the editor's POV
> it's just the same as it is now - you can insert anything in a .py file.
> So it does not add new cases to current state of affairs in this regard.
> But maybe I don't completely understand your question.

Here is a string assigned to name `s` using Python's current syntax:

s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"

How do you represent that assignment using your syntax?


And another example:

s = """this is some text
x = 'a'
y = 'b'"""
t = 'c'

How do we write that piece of code using your syntax?



>> Another thing that might come up: suppose you do come up with a
>> workable scheme, and have a source file PROG1.PY which contains such
>> raw strings.
>>
>> Would it then be possible to create a source file PROG2.PY which
>> contains PROG1.PY as a raw string? That is, without changing the text
>> from PROG1.PY at all.
> 
> Should be fine, with only difference that you must indent the PROG1.PY
> if it will be placed inside an indented suite.

Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.


> I was thinking about this
> nuance - I've added a special case for this in addition to the ? flag.

Oh good, another cryptic magical flag that changes the meaning of the 
syntax. Just what I was hoping for.



-- 
Steve



Re: Raw string statement (proposal)

2018-05-25 Thread Mikhail V
On Fri, May 25, 2018 at 1:15 PM, bartc  wrote:
> On 25/05/2018 05:34, Mikhail V wrote:
>

> I had one big problem with your proposal, which is that I couldn't make head
> or tail of your syntax. Such a thing should be immediately obvious.
>
> (In your first two examples, what IS the exact string that you're trying to
> incorporate? That is not clear at all.)

You're right, this part is not very clear.
I was working mainly on the syntax, but the document is getting better.
I make constant changes to it; here is a link on GitHub:

https://github.com/Mikhail22/Documents/blob/master/raw-strings.rst


> One problem here is how to deal with embedded non-printable characters: CR,
> LF and TAB might become part of the normal source text, but how about
> anything else? Or would you only allow text that might appear in a text file
> where those characters would also cause issues?

This syntax does not imply anything about text. From the editor's POV
it's just the same as it is now - you can insert anything in a .py file.
So it does not add new cases to current state of affairs in this regard.
But maybe I don't completely understand your question.



> Another thing that might come up: suppose you do come up with a workable
> scheme, and have a source file PROG1.PY which contains such raw strings.
>
> Would it then be possible to create a source file PROG2.PY which contains
> PROG1.PY as a raw string? That is, without changing the text from PROG1.PY
> at all.

Should be fine, with the only difference being that you must
indent the PROG1.PY text if it is placed inside an indented suite.
I was thinking about this nuance - I've added a special case for this
in addition to the ? flag.

data >>> X"#tag"
...
#tag

It will treat the block "as is", namely grab everything together with the indents,
like in a TQS. This may cover some edge cases.


> Here's one scheme I use in another language:
>
>    print strinclude "file.txt"
>
> 'strinclude "file.txt"' is interpreted as a string literal which contains
> the contents of file.txt, with escapes used as needed. In fact it can be
> used for binary files too.
> [...]
> As for a better proposal, I'm inclined not to make it part of the language
> at all, but to make it an editor feature: insert a block of arbitrary text,
> and give a command to turn it into a string literal. With perhaps another
> command to take a string literal within a program and view it as un-escaped
> text.

I think it may be the other way around - including links to external files
might be a more effective approach in some sense. It only needs
some special kind of editor that would seamlessly embed them.
Though frankly speaking I don't know of such a feature.
And there might be many caveats here.

And the feature to convert a piece of text to a Python string directly -
it is already possible in many editors, via macros or scripting.
But I think you are mistaken in treating it as the solution to the problem.
Such changes to the text are exactly what should be avoided.

In theory, an adequate feature like this (if it has real value)
would require the editor to track all manipulations and give feedback.
Otherwise you don't know whether you have escaped some string or
not. And how do you save or see these events?
IOW this might be way harder to implement than the
approach with external text bits.


The simplest solution would of course be to write a translator.
For such a syntax change it is a **million** times easier than
what you've described.



M


Re: Raw string statement (proposal)

2018-05-25 Thread bartc

On 25/05/2018 05:34, Mikhail V wrote:


> Proposal
> --------
>
> Current proposal suggests adding syntax for the "raw text" statement.
> This should enable the possibility to define text pieces in source
> code without the need for interpreted characters.
> Thereby it should solve the mentioned issues.
> Additionally it should solve some issues with visual appearance.
>
> General rules:
>
> - parsing is aware of the indent of containing
>    block, i.e. no de-dention needed.
> - single line assignment may be allowed with
>    some restrictions.
>
> Difficulties:
>
> - change of core parsing rules
> - backward compatibility broken
> - syntax highlighting may not work


I had one big problem with your proposal, which is that I couldn't make 
head or tail of your syntax. Such a thing should be immediately obvious.


(In your first two examples, what IS the exact string that you're trying 
to incorporate? That is not clear at all.)


The aim is to allow arbitrary text in program source which is to be 
interpreted as a string literal, and to be able to see the text as much 
in its natural form as possible.


One problem here is how to deal with embedded non-printable characters: 
CR, LF and TAB might become part of the normal source text, but how 
about anything else? Or would you only allow text that might appear in a 
text file where those characters would also cause issues?


Another thing that might come up: suppose you do come up with a workable 
scheme, and have a source file PROG1.PY which contains such raw strings.


Would it then be possible to create a source file PROG2.PY which 
contains PROG1.PY as a raw string? That is, without changing the text 
from PROG1.PY at all.


Here's one scheme I use in another language:

   print strinclude "file.txt"

'strinclude "file.txt"' is interpreted as a string literal which 
contains the contents of file.txt, with escapes used as needed. In fact 
it can be used for binary files too.


This ticks some of the boxes, but not all: the text isn't shown inline 
in the program source code. If you send someone this source code, they 
will also need FILE.TXT.


And it won't pass my PROG2/PROG1 test above (because both strincludes 
need expanding to strings, but the compiler won't recognise the one 
inside PROG1, as that is after all just text, not program code).


As for a better proposal, I'm inclined not to make it part of the 
language at all, but to make it an editor feature: insert a block of 
arbitrary text, and give a command to turn it into a string literal. 
With perhaps another command to take a string literal within a program 
and view it as un-escaped text.
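The two editor commands could be sketched in Python as below, assuming the editor can hand the selected text to a script. The function names `to_literal` and `from_literal` are hypothetical; `repr()` and `ast.literal_eval()` are the standard-library pieces doing the work:

```python
import ast


def to_literal(text: str) -> str:
    """Turn arbitrary text into Python string-literal source code."""
    return repr(text)


def from_literal(literal: str) -> str:
    """Turn string-literal source code back into the un-escaped text."""
    value = ast.literal_eval(literal)
    if not isinstance(value, str):
        raise ValueError("selection is not a string literal")
    return value


# Pasted text with a newline, a tab, quotes, a backslash, a control
# character and a non-ASCII character.
pasted = 'line one\n\tx = "a"\\b\x01 \uff0a\n'
literal = to_literal(pasted)

print(literal)
print(from_literal(literal) == pasted)
```

Since the output of `to_literal` is ordinary source text, it can itself be passed through `to_literal` again, which is one way a file that already contains such literals could be embedded in another file.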


--
bartc