Re: Raw string statement (proposal)
On Thu, May 24, 2018 at 9:34 PM, Mikhail V wrote:
> Hi.
> I've put some thoughts together, and
> need some feedback on this proposal.
> Main question is: Is it convincing?
> Is there any flaw?
> My own opinion - there IS something to chase.
> Still the justification for such syntax is hard.
>

This is a strange syntax that is quite unnecessary; just use
triple-quoted strings. Please, please, please don't add it.

I believe Python's core language should be kept small. Big libraries
are fine - important even. But the core language should be easy to
learn but powerful.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Raw string statement (proposal)
On Sat, May 26, 2018 at 10:21 PM, Chris Angelico wrote:
>
> I'm done. Argue with brick walls for the rest of eternity if you like.

I see you like me, but I can reciprocate your feelings.

>
> ChrisA
Re: Raw string statement (proposal)
On Sun, May 27, 2018 at 5:13 AM, Mikhail V wrote:
> On Sat, May 26, 2018 at 7:10 PM, Steven D'Aprano
>> temp = >>| Mikhail's syntax for {65290} is {65290}
>> Can you see the problem yet? How does your collapse function f()
>> distinguish between the escape code {65290} and the literal string
>> {65290}?
>
> Look, there is no any problem here. You can choose
> whatever YOU want to write "{" and "}". e.g. I'd pick {<} {>}.
>
> temp = >>| Mikhail's syntax for {65290} is {<}65290{>}

Whatever notation you pick, you need some way to represent it
literally. Don't you see? All you're doing is dodging the issue by
making your literal syntax more and more complicated.

> I just show you that syntax allows you not only input TEXT
> without any manipulations, but also it allows to solve any task related
> to custom presentation - in this case you want work with
> ordinals for example. And you may end up with a scheme which
> reads way better than your cryptic example.

How about a simple notation that bases everything on a single typable
character like REVERSE SOLIDUS?

> I'll also ask you a question -
> Which such non-text codes you may need to
> generate C code? Python code? HTML code?

What do you mean?

>> And we've gone from a single string literal, which is an expression that
>> can be used anywhere, to a statement that cannot be used in expressions.
>
> Ok show me your suggestion for a raw string definition for expressions
> (no modifications to source text and no usage of invisible control chars).

Ahh but "no modifications to source text" was YOUR rule, not ours.
Most of us are perfectly happy for string literals to be spelled with
quotes around them and a straight-forward system of escapes. Or if we
want NO modifications whatsoever, then we use another very standard
way of delimiting the contents: put it in a separate file and then
read that in. Oh wow, look at that, we can just put what we like and
it doesn't break anything!

I'm done. Argue with brick walls for the rest of eternity if you like.

ChrisA
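The "separate file" route mentioned above can be sketched in a few lines of current Python. This is only an illustration: the file name payload.txt and the sample text are made up, and the script writes the file itself purely so that the example is self-contained. In real use the file would be created by hand in an editor, with no escaping at all.

```python
import tempfile
from pathlib import Path

# Made-up sample text containing quotes, a backslash and braces.
# It is spelled as a literal here only so the sketch can create the file.
payload = 'quotes " and \' and a backslash \\ and {65290} stay untouched\n'

with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "payload.txt"  # hypothetical file name
    data_file.write_text(payload, encoding="utf-8")

    # Reading the file back needs no delimiters and no escape rules:
    # whatever is in the file becomes the string.
    loaded = data_file.read_text(encoding="utf-8")

print(loaded == payload)
```

The trade-off is the one raised elsewhere in the thread: the text is no longer visible inline in the source, and the data file has to travel with the program.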
Re: Raw string statement (proposal)
On Sat, May 26, 2018 at 7:10 PM, Steven D'Aprano wrote:
> On Sat, 26 May 2018 18:22:15 +0300, Mikhail V wrote:
>
>>> Here is a string assigned to name `s` using Python's current syntax:
>>>
>>> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>>>
>>> How do you represent that assignment using your syntax?
>>
>> Hope its not mandatory to decipher your random example. If for example
>> I'd want to work with a lot of non-presentable characters, I'd use a
>> more human-oriented notation than this ^. And that is exactly where raw
>> strings are needed.
>>
>> So I'd make a readable notation where I can present a character by its
>> ordinal enclosed in some tag for example {10}. Then just write a
>> function which collapses those depending on further needs:
>>
>> data >>| abc{10}def
>> data = f(data)
>>
>> And the notation itself can be chosen depending on my needs. Hope you
>> get the point.
>
> Loud and clear: your syntax has no way of representing the string s.

Just 5 lines before it is - you did not notice.

> temp = >>| Mikhail's syntax for {65290} is {65290}
> Can you see the problem yet? How does your collapse function f()
> distinguish between the escape code {65290} and the literal string
> {65290}?

Look, there is no any problem here. You can choose
whatever YOU want to write "{" and "}". e.g. I'd pick {<} {>}.

temp = >>| Mikhail's syntax for {65290} is {<}65290{>}

I just show you that syntax allows you not only input TEXT
without any manipulations, but also it allows to solve any task related
to custom presentation - in this case you want work with
ordinals for example. And you may end up with a scheme which
reads way better than your cryptic example.

This motivation is covered in documentation, but maybe not good enough?
anyway you're welcome to make suggestion for the documentation.

I'll also ask you a question -
Which such non-text codes you may need to
generate C code? Python code? HTML code?

>
> And we've gone from a single string literal, which is an expression that
> can be used anywhere, to a statement that cannot be used in expressions.

Ok show me your suggestion for a raw string definition for expressions
(no modifications to source text and no usage of invisible control chars).

> I don't know what TQS is supposed to mean.

Triple Quoted String

> s = >>>
> this is some text
> x = 'a'
> y = 'b'
> t = 'c'
>
>
> Am I close?

Yes very close, just shift it with a char of your choice, e.g. 1 space:

s >>> !" "
 this is some text
 x = 'a'
 y = 'b'
t = 'c'

or use a tag of your choice (no shifting needed in this case):

s >>> ?"#your favorite closing tag"
this is some text
x = 'a'
y = 'b'
#your favorite closing tag
t = 'c'

There can be other suggestions for symbols, etc.
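The {N} notation with {<} and {>} for literal braces can be tried out in present-day Python. The following is a rough sketch under stated assumptions: the thread never defines the function f(), so collapse() below is a hypothetical stand-in, and the exact rules (decimal ordinals only, a single left-to-right pass) are guesses.

```python
import re

# Assumed notation from the message above:
#   {N}  -> the character with decimal ordinal N
#   {<}  -> a literal "{"
#   {>}  -> a literal "}"
_CODE = re.compile(r"\{(\d+|<|>)\}")


def collapse(text: str) -> str:
    """Hypothetical stand-in for the thread's f(): expand the codes in one pass."""

    def expand(match: re.Match) -> str:
        token = match.group(1)
        if token == "<":
            return "{"
        if token == ">":
            return "}"
        return chr(int(token))  # raises ValueError for ordinals above 0x10FFFF

    # re.sub does not rescan its own replacements, so the braces produced
    # by {<} and {>} are never mistaken for the start of a new code.
    return _CODE.sub(expand, text)


source = "Mikhail's syntax for {65290} is {<}65290{>}"
result = collapse(source)
print(result)
```

With these rules the first {65290} becomes U+FF0A while the second group comes out as the literal text {65290}. Note that {<} and {>} are themselves escape codes that have to be learned and typed, which is the objection raised in the reply.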
Re: Raw string statement (proposal)
On Sat, 26 May 2018 18:22:15 +0300, Mikhail V wrote:

>> Here is a string assigned to name `s` using Python's current syntax:
>>
>> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>>
>> How do you represent that assignment using your syntax?
>
> Hope its not mandatory to decipher your random example. If for example
> I'd want to work with a lot of non-presentable characters, I'd use a
> more human-oriented notation than this ^. And that is exactly where raw
> strings are needed.
>
> So I'd make a readable notation where I can present a character by its
> ordinal enclosed in some tag for example {10}. Then just write a
> function which collapses those depending on further needs:
>
> data >>| abc{10}def
> data = f(data)
>
> And the notation itself can be chosen depending on my needs. Hope you
> get the point.

Loud and clear: your syntax has no way of representing the string s.

Okay, let's take a simpler example. You want to use {65290} to represent
the Unicode code point \uFF0A. Fair enough. Everyone else in the world
uses hexadecimal for this purpose, but whatever.

But isn't {65290} just another way of spelling an escape code? Isn't your
syntax supposed to eliminate the need for escape codes?

So if I had:

    result = function("Mikhail's syntax for \uFF0A is {65290}")

your syntax would become:

    temp = >>| Mikhail's syntax for {65290} is {65290}
    result = function(f(temp))

where f is your (as yet non-existent?) function for collapsing the {}
codes to characters.

Can you see the problem yet? How does your collapse function f()
distinguish between the escape code {65290} and the literal string
{65290}?

And we've gone from a single string literal, which is an expression that
can be used anywhere, to a statement that cannot be used in expressions.

>> And another example:
>>
>> s = """this is some text
>> x = 'a'
>> y = 'b'"""
>> t = 'c'
>>
>> How do we write that piece of code using your syntax?
>
> That's too easy - maybe you can try it yourself? I am not trying to
> imply anything, but I don't see how this example can cause problems -
> just put the TQS in a block.

I don't know what TQS is supposed to mean.

I'll give it a try, and see if I understand your syntax. The idea is to
avoid needing to put BEGIN END delimiters such as quotation marks around
the string, right?

    s = >>>
    this is some text
    x = 'a'
    y = 'b'
    t = 'c'

Am I close?

How is the interpreter supposed to know that the final line is not part
of the string, but a line of actual Python code?

[...]
>> Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.
>
> I know. So you've decided to share that you also understood this? Good,
> I'm glad that you understand :-)

I believe that the whole point of your proposal is to avoid needing to
change the text in any way when you paste it into your source code. If
that's not the point, then what is the point?

--
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing it
everywhere." -- Jon Ronson
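The ambiguity described in this message is easy to reproduce. A minimal sketch, assuming the simplest possible reading of the collapse function (every {N} with decimal N becomes chr(N)); the proposal does not define f(), so this implementation is an assumption made for illustration only.

```python
import re


def f(text: str) -> str:
    """Naive collapse (assumed behaviour): every {N} becomes chr(N)."""
    return re.sub(r"\{(\d+)\}", lambda m: chr(int(m.group(1))), text)


# The string we actually want, written with today's escape syntax:
wanted = "Mikhail's syntax for \uFF0A is {65290}"

# The same text as it would be typed in the proposed raw form:
temp = "Mikhail's syntax for {65290} is {65290}"

collapsed = f(temp)
print(collapsed == wanted)
```

Both occurrences are collapsed, so the literal text {65290} cannot be produced without adding a further rule for writing braces, that is, another escape.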
Re: Raw string statement (proposal)
On Sat, May 26, 2018 at 10:55 AM, Steven D'Aprano wrote:
> On Sat, 26 May 2018 08:09:51 +0300, Mikhail V wrote:
>
>> On Fri, May 25, 2018 at 1:15 PM, bartc wrote:
> [...]
>>> One problem here is how to deal with embedded non-printable characters:
>>> CR, LF and TAB might become part of the normal source text, but how
>>> about anything else? Or would you only allow text that might appear in
>>> a text file where those characters would also cause issues?
>>
>> This syntax does not imply anything about text. From the editor's POV
>> it's just the same as it is now - you can insert anything in a .py file.
>> So it does not add new cases to current state of affairs in this regard.
>> But maybe I'm not completely understand your question.
>
> Here is a string assigned to name `s` using Python's current syntax:
>
> s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"
>
> How do you represent that assignment using your syntax?

Hope its not mandatory to decipher your random example.
If for example I'd want to work with a lot of non-presentable
characters, I'd use a more human-oriented notation than this ^.
And that is exactly where raw strings are needed.

So I'd make a readable notation where I can present a character
by its ordinal enclosed in some tag for example {10}. Then just
write a function which collapses those depending on further needs:

data >>| abc{10}def
data = f(data)

And the notation itself can be chosen depending on my needs.
Hope you get the point.

>
>
> And another example:
>
> s = """this is some text
> x = 'a'
> y = 'b'"""
> t = 'c'
>
> How do we write that piece of code using your syntax?

That's too easy - maybe you can try it yourself?
I am not trying to imply anything, but I don't see how
this example can cause problems - just put the TQS in a block.

>>> Would it then be possible to create a source file PROG2.PY which
>>> contains PROG1.PY as a raw string? That is, without changing the text
>>> from PROG1.PY at all.
>>
>> Should be fine, with only difference that you must indent the PROG1.PY
>> if it will be placed inside an indented suite.
>
> Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.

I know. So you've decided to share that you also understood this?
Good, I'm glad that you understand :-)
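For comparison, the indentation issue discussed above is usually handled in current Python by indenting a triple-quoted string to match the surrounding suite and stripping the common margin afterwards with textwrap.dedent(). A small sketch; build_script() and the sample text are made up for illustration.

```python
import textwrap


def build_script() -> str:
    # The text is indented to line up with the function body. The backslash
    # after the opening quotes suppresses the initial newline, and dedent()
    # removes the leading whitespace shared by all lines.
    text = """\
        x = 'a'
        y = 'b'
        """
    return textwrap.dedent(text)


script = build_script()
print(script, end="")
```

This still counts as changing the text in the sense argued about here: the pasted lines are indented in the source file and only restored at run time.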
Re: Raw string statement (proposal)
On Sat, 26 May 2018 08:09:51 +0300, Mikhail V wrote:

> On Fri, May 25, 2018 at 1:15 PM, bartc wrote:
[...]
>> One problem here is how to deal with embedded non-printable characters:
>> CR, LF and TAB might become part of the normal source text, but how
>> about anything else? Or would you only allow text that might appear in
>> a text file where those characters would also cause issues?
>
> This syntax does not imply anything about text. From the editor's POV
> it's just the same as it is now - you can insert anything in a .py file.
> So it does not add new cases to current state of affairs in this regard.
> But maybe I'm not completely understand your question.

Here is a string assigned to name `s` using Python's current syntax:

    s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"

How do you represent that assignment using your syntax?

And another example:

    s = """this is some text
    x = 'a'
    y = 'b'"""
    t = 'c'

How do we write that piece of code using your syntax?

>> Another thing that might come up: suppose you do come up with a
>> workable scheme, and have a source file PROG1.PY which contains such
>> raw strings.
>>
>> Would it then be possible to create a source file PROG2.PY which
>> contains PROG1.PY as a raw string? That is, without changing the text
>> from PROG1.PY at all.
>
> Should be fine, with only difference that you must indent the PROG1.PY
> if it will be placed inside an indented suite.

Bart said WITHOUT CHANGING THE TEXT. Indenting it is changing the text.

> I was thinking about this
> nuance - I've added a special case for this in addition to the ? flag.

Oh good, another cryptic magical flag that changes the meaning of the
syntax. Just what I was hoping for.

--
Steve
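To make the first example concrete, the string can be inspected in current Python to see which characters the escapes actually produce. This is only an illustrative sketch; the variable name ordinals is made up, and str.isascii() needs Python 3.7 or later.

```python
s = "some\ncharacters\0abc\x01\ndef\uFF0A\nhere"

# Collect the ordinals of every character that is either non-ASCII or
# not printable, i.e. the ones that cannot simply be typed as-is.
ordinals = [ord(ch) for ch in s if not ch.isascii() or not ch.isprintable()]

print(len(s), ordinals)
```

Any raw-text notation has to offer some way of producing the same six characters: three line feeds, a NUL, U+0001 and U+FF0A.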
Re: Raw string statement (proposal)
On Fri, May 25, 2018 at 1:15 PM, bartc wrote:
> On 25/05/2018 05:34, Mikhail V wrote:
>
> I had one big problem with your proposal, which is that I couldn't make head
> or tail of your syntax. Such a thing should be immediately obvious.
>
> (In your first two examples, what IS the exact string that you're trying to
> incorporate? That is not clear at all.)

You're right, this part is not very clear. I was working on syntax mainly,
but the document is getting better. I make constant changes to it,
here is a link on github:

https://github.com/Mikhail22/Documents/blob/master/raw-strings.rst

> One problem here is how to deal with embedded non-printable characters: CR,
> LF and TAB might become part of the normal source text, but how about
> anything else? Or would you only allow text that might appear in a text file
> where those characters would also cause issues?

This syntax does not imply anything about text. From the editor's POV
it's just the same as it is now - you can insert anything in a .py file.
So it does not add new cases to current state of affairs in this regard.
But maybe I'm not completely understand your question.

> Another thing that might come up: suppose you do come up with a workable
> scheme, and have a source file PROG1.PY which contains such raw strings.
>
> Would it then be possible to create a source file PROG2.PY which contains
> PROG1.PY as a raw string? That is, without changing the text from PROG1.PY
> at all.

Should be fine, with only difference that you must indent the PROG1.PY
if it will be placed inside an indented suite. I was thinking about this
nuance - I've added a special case for this in addition to the ? flag.

data >>> X"#tag"
...
#tag

It will treat the block "as is", namely grab everything together
with indents, like in TQS. This may cover some edge-cases.

> Here's one scheme I use in another language:
>
>    print strinclude "file.txt"
>
> 'strinclude "file.txt"' is interpreted as a string literal which contains
> the contents of file.txt, with escapes used as needed. In fact it can be
> used for binary files too.
> [...]
> As for a better proposal, I'm inclined not to make it part of the language
> at all, but to make it an editor feature: insert a block of arbitrary text,
> and give a command to turn it into a string literal. With perhaps another
> command to take a string literal within a program and view it as un-escaped
> text.

I think it may be vice-versa - including links to external files might be
more effective approach in some sense. It only needs some special kind
of editor that would seamlessly embed them. Though I don't know of such
feature frankly speaking. And there might be many caveats here.

And the feature to convert a text piece to Python string directly - it is
already possible in many editors - via macros or scripting. But I think you
falsely think that it is the solution to the problem. Such changes - it's
exactly what should be avoided.

In theory - an adequate feature like this (if it has real value) will
require the editor to track all manipulations - and give feedback.
You don't know when you have escaped some string or not. And how do
you save or see this events? IOW this might be way harder to implement
than the approach with external text bits.

The simplest solution would be of course to write a translator.
For such syntax change - it is **millions** times easier than what
you've described.

M
Re: Raw string statement (proposal)
On 25/05/2018 05:34, Mikhail V wrote:

> Proposal
> ---
>
> Current proposal suggests adding syntax for the "raw text" statement.
> This should enable the possibility to define text pieces in source
> code without the need for interpreted characters.
> Thereby it should solve the mentioned issues.
> Additionally it should solve some issues with visual appearance.
>
> General rules:
>
> - parsing is aware of the indent of containing block, i.e.
>   no de-dention needed.
> - single line assignment may be allowed with some restrictions.
>
> Difficulties:
>
> - change of core parsing rules
> - backward compatibility broken
> - syntax highlighting may not work

I had one big problem with your proposal, which is that I couldn't make
head or tail of your syntax. Such a thing should be immediately obvious.

(In your first two examples, what IS the exact string that you're trying
to incorporate? That is not clear at all.)

The aim is to allow arbitrary text in program source which is to be
interpreted as a string literal, and to be able to see the text as much
in its natural form as possible.

One problem here is how to deal with embedded non-printable characters:
CR, LF and TAB might become part of the normal source text, but how
about anything else? Or would you only allow text that might appear in a
text file where those characters would also cause issues?

Another thing that might come up: suppose you do come up with a workable
scheme, and have a source file PROG1.PY which contains such raw strings.

Would it then be possible to create a source file PROG2.PY which
contains PROG1.PY as a raw string? That is, without changing the text
from PROG1.PY at all.

Here's one scheme I use in another language:

   print strinclude "file.txt"

'strinclude "file.txt"' is interpreted as a string literal which
contains the contents of file.txt, with escapes used as needed. In fact
it can be used for binary files too.

This ticks some of the boxes, but not all: the text isn't shown inline
in the program source code. If you send someone this source code, they
will also need FILE.TXT. And it won't pass my PROG2/PROG1 test above
(because both strincludes need expanding to strings, but the compiler
won't recognise the one inside PROG1, as that is after all just text,
not program code).

As for a better proposal, I'm inclined not to make it part of the
language at all, but to make it an editor feature: insert a block of
arbitrary text, and give a command to turn it into a string literal.
With perhaps another command to take a string literal within a program
and view it as un-escaped text.

--
bartc
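The pair of editor commands suggested at the end of this message can be approximated with two standard tools: repr() to turn arbitrary text into a Python string literal, and ast.literal_eval() to turn a literal back into text. A minimal sketch; the function names to_literal() and from_literal() and the sample text are hypothetical, and no particular editor integration is implied.

```python
import ast


def to_literal(text: str) -> str:
    """'Turn into a string literal': return source code that evaluates to text."""
    return repr(text)


def from_literal(literal: str) -> str:
    """'View as un-escaped text': evaluate a string literal without running code."""
    value = ast.literal_eval(literal)
    if not isinstance(value, str):
        raise ValueError("expected a string literal")
    return value


# Made-up pasted text with a tab, both kinds of quote and a backslash.
pasted = "line one\n\tx = 'a'\n\"quoted\" and a backslash \\ here\n"

literal = to_literal(pasted)        # what the editor would insert
restored = from_literal(literal)    # what the second command would show

# Nesting also round-trips, in the spirit of the PROG2/PROG1 test:
# a literal can itself be wrapped in another literal and unwrapped twice.
nested = to_literal(literal)
print(restored == pasted, from_literal(from_literal(nested)) == pasted)
```

The escaped form is of course exactly what the proposal wants to avoid looking at; the sketch only shows that the conversion in both directions is mechanical.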