Re: [swift-evolution] Strings in Swift 4

Matthew Johnson via swift-evolution Wed, 25 Jan 2017 12:20:07 -0800

> On Jan 25, 2017, at 12:23 PM, Joe Groff via swift-evolution 
> <swift-evolution@swift.org> wrote:
> 
> 
>> On Jan 24, 2017, at 9:35 PM, Chris Lattner via swift-evolution 
>> <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>> 
>> On Jan 24, 2017, at 12:05 AM, Chris Eidhof via swift-evolution 
>> <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>>> 
>>> I agree that being able to implement parsers in a nice way can be a huge 
>>> step forward in being really good at string processing.
>> 
>> +1 from me as well, I agree with Joe that Swift can learn a lot from Perl 6 
>> grammar’s and we should take the time to do it right.  Below I say “regex” a 
>> lot, but I really mean a more general grammar system (and even Perl 5 
>> regex’s aren’t regular :-)
>> 
>>> There are a couple of possibilities that come to mind directly:
>>> 
>>> 1. Build parsers right into the language (like Perl 6 grammars)
>>> 2. Provide a parser combinator language (e.g. 
>>> https://github.com/davedufresne/SwiftParsec 
>>> <https://github.com/davedufresne/SwiftParsec>). 
>>> 3. Rely on external tools like bison/yacc/etc.
>>> 4. Make it easy for people to write hand-written parsers (e.g. by providing 
>>> an NSScanner alternative).
>> 
>> 
>> My opinion is that #1 is the right path to start with, but it wouldn’t 
>> preclude doing #2.  Here’s my rationale / half-baked thought process:
>> 
>> There are two important use cases for regex's: the literal case (e.g. 
>> /aa+b*/) and the dynamically computed case.  The former is really what we’re 
>> talking about here, the latter should obviously be handled with some sort of 
>> Regex type which can be formed from string values or whatever.  Regex 
>> literals in an expression context should default to producing the Regex type 
>> of course.
>> 
>> This means that when you pass a regex literal into an API call (e.g. split 
>> on a string), it is really just creating something of Regex type, and 
>> passing it down.  If you wanted to introduce a parser combinator DSL, you 
>> could totally plug it into the system, by having the combinators produce 
>> something of the Regex type.
>> 
>> So why bless regex literals with language support at all?  I see several 
>> reasons:
>> 
>> 1. Diagnostics: These will be heavily used by people, and you want to have 
>> good compiler error and warning messages for them.  You want to be able to 
>> validate the regex at compile time, not wait until runtime to detect 
>> syntactic mistakes like unbalanced parens.
>> 
>> 2. Syntax Familiarity: To take advantage of people’s familiarity with other 
>> languages, we should strive to make the basic regex syntax familiar and 
>> obvious.  I’d argue that /aa+b*/ should “just work” and do the thing you 
>> think it does.  Relying on a combinator library to do that would be crazy.
>> 
>> 3. Performance: Many regex’s are actually regular, so they can be trivially 
>> compiled into DFAs.  There is a well understood body of work that can be 
>> simply dropped into the compiler to do this. Regex’s that are not regular 
>> can be compiled into hybrid DFA/NFA+backtracking schemes, and allowing a 
>> divide and conquer style of compiler optimization to do this is the path 
>> that makes the most sense (to me at least).  Further, if you switch on a 
>> string and have a bunch of cases that are regex’s, you’d obviously want the 
>> compiler to generate a single state machine (like a lexer), not check each 
>> pattern in series.
>> 
>> 4. Pattern matching greatness: One of the most obnoxious/error prone aspects 
>> of regex’s in many languages is that when you match a pattern, the various 
>> matches are dumped into numbered result values (often by the order of the 
>> parens in the pattern).  This is totally barbaric: it begs for off by one 
>> errors, often breaks as the program is being evolved/maintained, etc.  It is 
>> just as bad as printf/scanf!  
>> 
>> You should instead be able to directly bind subexpressions into local 
>> variables.  For example if you were trying to match something like “42: 
>> Chris”, you should be able to use straw man syntax like this:
>> 
>>    case /(let id: \d+): (let name: \w+)/: print(id); print(name)
>> 
>> Unless we were willing to dramatically expand how patterns work, this 
>> requires baking support into the language.
>> 
>> 5. Scanner/“Formatter" integration:  Taking the above one step farther, we 
>> could have default patterns for known types (and make it extensible to user 
>> defined types of course). For example, \d+ is the obvious pattern for 
>> integers, so you should be able to write the above like this (in principle):
>> 
>>    case /(let id: Int): (let name: \w+)/: print(id); print(name)
>> 
>> In addition to avoiding having to specify \d+ all the time, this eliminates 
>> the need for a “string to int” conversion after the pattern is matched, 
>> because id would be bound as type Int already.
>> 
>> 
>> Anyway, to summarize, I think that getting regex’s into the language is 
>> really important and expect them to be widely used.  As such, I think it is 
>> worth burning compiler/language complexity to make them be truly great in 
>> Swift.
> 
> Another good reason to bake them into pattern matching is that it would make 
> it easier to optimize when you want to match one of multiple patterns. Often, 
> you don't want just one grammar, but possibly one of many, and it'd be nice 
> if switch-ing over multiple string patterns led to a reasonably efficient 
> DFA/NFA/rec-descent machine based on the needs of the grammars being matched.


+1 to the comments by Chris and Joe.  We should do as much as we can in the 
library, but compile-time error detection, optimizability and syntactic 
convenience are important reasons to bake some support into the language itself.

> 
> -Joe
> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] Strings in Swift 4

Reply via email to