All,

I would like there to be a simple and terse way for Perl 6 identifiers or symbols, including variable and subroutine names, to be composed of any characters whatsoever, even whitespace, as is possible in some other languages like SQL, and as is possible when naming filesystem files.

I also want to emphasize that what I'm looking for is purely a compile-time feature; the delimited identifiers are always literal constants resolvable at compile time, so nothing is deferred to runtime, unlike symbolic references, which can come from variables.

This would assist in having a closer mapping when porting code from a language like PL/SQL to Perl, or when invoking code in such languages, while also gaining that native ability internally. Simply remapping characters, like spaces to underscores, won't work, partly because of clashes, such as when the source already has both a "the var" and a "the_var". Certain other workarounds, like hex-escaping all source identifiers, would cause obfuscation, which is bad for understanding the result.
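To make the clash concrete, here is a minimal sketch in Python rather than Perl 6. The identifier list and both helper functions are hypothetical, invented only for illustration; they show why underscore remapping loses information and what hex-escaping does to readability.

```python
import string

# Hypothetical source identifiers, e.g. column names from a PL/SQL schema.
source_names = ["the var", "the_var", "total"]

# Characters passed through unchanged by hex_escape (an assumption of this sketch).
SAFE = set(string.ascii_letters + string.digits)

def remap_spaces(name):
    """Lossy workaround: replace each space with an underscore."""
    return name.replace(" ", "_")

def hex_escape(name):
    """Escaping workaround: hex-encode every character outside SAFE."""
    return "".join(
        ch if ch in SAFE else "_x{:02X}_".format(ord(ch))
        for ch in name
    )

remapped = [remap_spaces(n) for n in source_names]
escaped = [hex_escape(n) for n in source_names]

# Two distinct source names collapse into one after remapping.
print(remapped, len(set(remapped)))
# The escaped names stay distinct, but are much harder to read.
print(escaped, len(set(escaped)))
```

With remapping, "the var" and "the_var" become the same symbol; with hex-escaping they stay distinct but turn into names like "the_x20_var", which is the obfuscation I want to avoid.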

In a way, this would be a wider application of the fact that hash keys can already contain any characters, or that named arguments can be string-quoted, though the latter are akin to identifiers in the method declarations.

Unless it's already done, I see support for this as something that only the tokenizer, and perhaps the wider parser, of Perl 6 code has to be concerned with; all other parts of the Perl 6 runtime don't have to care. Really, one main reason it isn't commonplace to have, say, space characters in variable names is that they could make the parser's job more difficult when determining the boundaries of a symbol name in code.

I propose that this can be accomplished with a simple and optional de-sugaring of the language that simply provides clues to the tokenizer in the form of special delimiters.

For example, if Perl 6 doesn't currently reserve back-tick (`) delimiters (I forget) the way Perl 5 does for invoking the Unix shell, we could use those; literal occurrences of the delimiter character in the identifier would be backslash-escaped as usual, as with single-quote (') delimited strings. Or, if you expect this to be used rarely, we could Huffman-code it with a longer delimiter like "qi()" or "qs()" or something.
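As a rough illustration of how little the tokenizer would need, here is a sketch in Python, not actual Perl 6 parser code. The function name and its return shape are my own assumptions; the escaping rule mirrors single-quoted strings, where a backslash only escapes the delimiter or another backslash.

```python
def read_delimited_identifier(src, start=0, delim="`"):
    """Read one delimited identifier that begins at src[start].

    Returns (name, next_index), where next_index is the position just
    past the closing delimiter.
    """
    if start >= len(src) or src[start] != delim:
        raise ValueError("expected opening delimiter")
    chars = []
    i = start + 1
    while i < len(src):
        ch = src[i]
        # A backslash escapes only the delimiter or another backslash;
        # before any other character it is kept as a literal backslash.
        if ch == "\\" and i + 1 < len(src) and src[i + 1] in (delim, "\\"):
            chars.append(src[i + 1])
            i += 2
        elif ch == delim:
            return "".join(chars), i + 1
        else:
            chars.append(ch)
            i += 1
    raise ValueError("unterminated delimited identifier")

# The sigil is consumed by the caller; the identifier starts at the back-tick.
print(read_delimited_identifier("`the bar` = 5"))   # ('the bar', 9)
print(read_delimited_identifier(r"`a\`b`"))
```

The delimiters give the tokenizer an unambiguous start and end for the symbol name, so whitespace inside the name never has to be guessed at.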

If the delimited identifier would also be valid as a non-delimited identifier (because it contains only alphanumerics, for example), which is the form Perl 6 code uses by default, then the delimited and non-delimited versions of the same name can be intermixed as equivalent; otherwise (e.g., if it contains whitespace), the name appears only in delimited form.
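The equivalence rule can be sketched as follows, again in Python and only as an illustration. The regular expression is a simplifying assumption (ASCII letters, digits and underscores); real Perl 6 identifier rules are broader. Both function names are hypothetical.

```python
import re

# Assumed, simplified definition of a "plain" identifier.
PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def needs_delimiters(name, pattern=PLAIN):
    """True if the symbol name can only be written in delimited form."""
    return pattern.fullmatch(name) is None

def render_identifier(name, delim="`"):
    """Write a symbol name back out as source text.

    Plain names are emitted bare; anything else is wrapped in delimiters,
    with backslashes and delimiter characters backslash-escaped.
    """
    if not needs_delimiters(name):
        return name
    escaped = name.replace("\\", "\\\\").replace(delim, "\\" + delim)
    return delim + escaped + delim

# "foo" and `foo` are the same symbol, so the bare form is enough.
print(render_identifier("foo"))
# A name with whitespace has no bare form and must stay delimited.
print(render_identifier("the bar"))
```

Under this reading, the symbol table only ever stores the parsed name, so the bare and delimited spellings of a plain name refer to the same entry.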

Using the back-ticks as an example, we could say:

  my $baz = 7; # parsed symbol is "baz"
  say $baz;    # parsed symbol is "baz"

  my $`foo` = 3; # parsed symbol is "foo"
  say $`foo`;    # parsed symbol is "foo"

  my $`the bar` = 5; # parsed symbol is "the bar"
  say $`the bar`;    # parsed symbol is "the bar"

Similarly, with subroutine or method names:

  method `do it` (:$`with this`) { ... }

  $myobj.`do it`( 'with this' => 17 );
  $myobj.`do it`( :`with this`<44> );

Note that named arguments can already have string-quoted key names, I think; this is sort of an extension of that.

Of course, the exact syntax can be different, but when working in Perl 6 I don't want to lose functionality that I have in other languages and environments.

Without this feature, I would have to resort to either storing all symbols in hashes or hex-escaping them all to ensure usable characters without name collisions, and that makes the resulting code obfuscated and hard to understand; I don't want to obfuscate.

Thank you. -- Darren Duncan
