Hi. I've put some thoughts together and would like some feedback on this proposal. The main questions are: is it convincing, and are there any flaws? My own opinion is that there IS something worth chasing here. Still, such syntax is hard to justify.
Raw string statement
--------------------

Issue
-----

The vast majority of programming tasks involve operations on text of varying complexity. This applies even to simple, ubiquitous tasks such as defining file paths.

String literals are interpreted, i.e. the special character "\" may change the contents of a string. Python's raw string r"" has the fewest such cases: backslashes are kept literally, but a quote character matching the delimiter still has to be preceded by a backslash (which then stays in the string), and the literal cannot end with an odd number of backslashes. There is still no string type that is totally uninterpreted.

As a result, a piece of text that contains quotes may need to be *edited* before it can be used in source code. This may seem a minor problem, but counted over all cases, the cumulative long-term impact may be significant.

The problem may also become more acute in cases related to:

- development of text/code generators
- text processing in general, where a lot of literal data has to be defined
- proofreading

Such applications may *require* a lot of embedded text definitions. This can lead to frustration when proofreading 'escaped' pieces, and it adds the need to keep track of changes in these pieces. Using external resources for these tasks could help, but it may lead to an even worse experience because the definitions are spread out and maintenance time increases.

The most common solutions in the existing syntax, triple-quoted strings and raw strings, have some additional issues:

- Data is parsed including indents. This may be a benefit in some cases (e.g. lines always start without any indent), but it may also confuse readers when the text is not aligned with the containing block. So-called "de-denting" is also needed.
- Triple quotes cause visual ambiguity in some edge cases, e.g. when a string starts or ends with a quote. Also, in many fonts a pair of single quotes is visually identical to one double quote.

Proposal
--------

The current proposal suggests adding syntax for a "raw text" statement.
This should make it possible to define text pieces in source code without any interpreted characters, and thereby solve the issues mentioned above. Additionally, it should solve some issues with visual appearance.

Specification
-------------

The raw string statement has the following form:

    name >>> "condition_string"
    ... text ...

For example:

    data >>> "  "
      begin
      end
    #rest

This parses the block by comparing the beginning of each following line with the string "  " (2 spaces here). It returns:

    "  begin\n  end"

Additional option, parse and remove:

    data >>> !"  "
      begin
      end
    #rest

This parses by the same rule but also removes the string from the result:

    "begin\nend"

Additional option, parse *until* condition:

    data >>> ?"#eof"
    begin
    end
    #eof

This parses up to the character sequence "#eof" (if it is on the same level) and returns:

    "begin\nend"

The benefit of the last option is that the data can be put at zero level. It may also be preferred because of its explicit terminator.

General rules:
--------------

- Parsing is aware of the indent of the containing block, i.e. no de-denting is needed.
- Single-line assignment may be allowed with some restrictions.

Difficulties:
-------------

- change of core parsing rules
- backward compatibility broken
- syntax highlighting may not work
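
To make the three parsing modes from the specification easier to discuss, here is a minimal sketch that emulates them in today's Python. The function name `parse_raw_block` and its `mode` argument are hypothetical and not part of the proposal. The sketch assumes the lines that follow the statement are already given relative to the containing block, and it ignores the "same level" check for the terminator.

```python
def parse_raw_block(lines, condition, mode="keep"):
    """Emulate the proposed statement on a list of source lines.

    mode="keep"  ~ data >>> "cond"    collect lines that start with cond
    mode="strip" ~ data >>> !"cond"   same, but remove cond from each line
    mode="until" ~ data >>> ?"cond"   collect lines until one equals cond
    """
    collected = []
    for line in lines:
        if mode == "until":
            # Simplification: the indentation level of the terminator
            # is not checked here.
            if line.strip() == condition:
                break
            collected.append(line)
        elif line.startswith(condition):
            if mode == "strip":
                collected.append(line[len(condition):])
            else:
                collected.append(line)
        else:
            # First line that does not match the condition ends the block.
            break
    return "\n".join(collected)


body = ["  begin", "  end", "#rest"]
kept = parse_raw_block(body, "  ")
stripped = parse_raw_block(body, "  ", mode="strip")
until = parse_raw_block(["begin", "end", "#eof"], "#eof", mode="until")
```

With these inputs, `kept` corresponds to the first example, `stripped` to the "parse and remove" option, and `until` to the "parse until" option. A real implementation would live in the tokenizer, not in a function like this.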