On 08/02/15 23:07, Alfred Zett wrote:
Hi Jean-Francois Colson,
I hope this doesn't mess up the mailing list.
- Indentation codepoint, with no fixed defined graphical
representation. For indentation based programming languages.
That wouldn’t be compliant with existing languages and future
languages might use any existing character.
This was for new languages. Creators of future languages mostly orient
themselves on whatever is available and makes sense, so I might as well
make this proposal, so they don't have to choose the half-assed
workarounds they use now.
I need a few tens of characters for a conlang I’m developing. ☺
The problem is that Unicode only encodes characters which are
effectively used today or which have been used in the past. It doesn’t
encode characters which could perhaps be used in a hypothetical new
programming language in the future.
Also, as long as there is stuff like
https://github.com/sferik/active_emoji it still makes more sense.
Because:
-- specific clients may want to show it differently (for example as
arrows, lines etc., using another color):
Can’t good editors display tabs in a different color when required?
Not as reliable and customizable as a special codepoint. For example
--- browsers could let the web page creator decide the visual
representation (character and size) via CSS
can't be done and on-the-fly copy and paste conversion with JavaScript
is horrid and broken for security reasons.
But it's an issue even in good editors. You need a lexing plugin that
may or may not work. And the size and other factors are still fixed.
After all, tabs have whitespace semantics and may appear anywhere in
the text.
--- the same with editors, independent from the actual font
--- in case of visual impairment, the user could even change the
acoustic representation if the editor allows it
-- unlike a space symbol, it wouldn't need more than one character
per indentation
-- unlike tabs or space, it wouldn't be whitespace
-- unlike normal arrow characters, one could customize the length in
an editor and wouldn't have to insert extra spaces for a better
visual imagery
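
As a rough sketch of the idea in the list above (not part of the
proposal itself): the code below assumes a Private Use Area code point,
U+E000, as a stand-in for the indentation character, and the names
INDENT, split_indent and render are made up for illustration. It shows
one mark per level, and a client choosing its own visual form.

```python
# Hypothetical stand-in: a Private Use Area code point, since no real
# indentation character is encoded.
INDENT = "\uE000"

def split_indent(line):
    """Return (depth, rest); depth counts leading INDENT marks, one per level."""
    rest = line.lstrip(INDENT)
    return len(line) - len(rest), rest

def render(line, glyph="→ "):
    """Display a line with each indentation mark drawn as a client-chosen glyph."""
    depth, rest = split_indent(line)
    return glyph * depth + rest
```

An editor could call render(line, "| ") for guide lines, while another
client passes a wider arrow, without touching the stored text.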
- A codepoint for string literal quotes, that would spare one the
escaping.
I rarely escape quotes.
In a text, I use ’ (U+2019) as an apostrophe and «»“”‘’ as quotes, so
I don’t need to escape them.
When I use PHP to generate some HTML code, I try to alternate simple
and double quotes as much as possible. That way I rarely need to
escape them.
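
A minimal illustration of that alternation, written in Python rather
than PHP, with made-up markup: the two literals below are the same
string, but only the first needs escaping.

```python
# Same quote character inside and outside: the inner quotes must be escaped.
link_escaped = "<a href=\"index.html\">Home</a>"

# Alternating single and double quotes: no escaping needed.
link_alternated = '<a href="index.html">Home</a>'

assert link_escaped == link_alternated
```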
OK, but that's just your scenario. With a language design from the
past. Probably with an editor from the past that allows non-Unicode
encodings. In a better world, inserting code points manually would be a
last resort.
Imagine someone wants to make his text look as if it were written on a
typewriter. Or something else.
- A statement separator symbol.
To replace the semicolon in C and the languages based on its syntax?
Again, for future uses. To be honest, this might sound questionable,
but this could blur the line between visual line breaks and visual
characters like semicolons.
Line-break ended comments are separator ended comments.
Of course, that's the least necessary of the three proposed characters,
but I thought that for the sake of completeness it shouldn't be
missing.
Come to think of it, two sets of opening and closing block symbols
couldn't hurt either. And a continue-after-linebreak symbol as well.
- Other ideas?
Aren’t you trying to reinvent APL?
No. APL puts a lot of hard-to-input characters into your code that look
alien and annoying to anyone except mathematicians. In particular from
the context.
My proposal on the other hand - if implemented right - introduces some
really intuitive looking and easy to input characters, because a bold
arrow at the left doesn't need further explanation and your IDE of the
future can easily place them when pressing tab in the right position.
_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode