Re: Unicode, strings, and Show

2016-03-30 Thread Manuel M T Chakravarty
> Brandon Allbery:
> 
> On Wed, Mar 30, 2016 at 9:50 PM, Manuel M T Chakravarty wrote:
>> Firstly, we have
>> 
>>   isPrint :: Char -> Bool
>> 
>> Are you saying that this type is wrong?
>> 
>> Secondly, how often do you feed the output of ’show’ to ’read’ in another
>> locale versus how often is everybody whose whole life is outside of ASCII
>> (i.e., not anglo-centric people) bothered by this shortcoming? (*)
>> 
>> Moreover, the argument on the ticket was that changing the current
>> implementation would go against the standard. Now that I am saying, the
>> current implementation is not conforming to the standard, the standard
>> suddenly doesn’t seem to matter. Personally, I would say, when we wrote
>> that standard, we knew what we were doing.
> 
> The standard I am aware of is the Report, which deliberately limited the 
> output to the subset which is guaranteed to be usable in all locales. show 
> conforms to this; apparently people want it to *not* conform, and in a way 
> which requires some locale to become the One True Locale.

Where does it say that in the Report?

> isPrint is, as per the language Report, based on what Char is --- which is 
> Unicode codepoints. Using it for output --- or for input, for that matter --- 
> gets you into locale issues because nobody anywhere guarantees that Unicode 
> codepoints that pass isPrint are representable in every locale. isPrint is 
> not the place to verify that a character can actually be displayed in the 
> current locale.

Yet, this is apparently what the report requires.

IMHO, it also makes sense. We have seen that either setup (the current or 
using ’isPrint’) has imperfections. However, getting \ is rarely 
helpful, whereas using ’isPrint’ is going to be helpful most of the time.
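
For illustration only, here is a minimal sketch of what an ’isPrint’-based rendering could look like. It is not taken from the ticket or from GHC's sources; the name showUnescaped is hypothetical, and the only library functions assumed are isAscii, isPrint and showLitChar from Data.Char.

```haskell
import Data.Char (isAscii, isPrint, showLitChar)

-- Hypothetical helper (not part of base): renders a String like `show`,
-- except that printable non-ASCII characters are emitted unescaped.
showUnescaped :: String -> String
showUnescaped s = '"' : go s
  where
    go []       = "\""
    go (c : cs)
      -- Inside a string literal the double quote must be escaped.
      | c == '"'                     = '\\' : '"' : go cs
      -- Printable non-ASCII characters are passed through as-is.
      | not (isAscii c) && isPrint c = c : go cs
      -- Everything else uses the standard escaping; passing the rest of
      -- the output lets showLitChar insert "\&" where it is needed.
      | otherwise                    = showLitChar c (go cs)

main :: IO ()
main = do
  -- `print` goes through the standard Show, so the output is ASCII-only
  -- and does not depend on the terminal's encoding.
  print (showUnescaped "文字" == "\"文字\"")
  print (showUnescaped "a\nb\"c" == show "a\nb\"c")
  print ((read (showUnescaped "文字") :: String) == "文字")
```

The sketch keeps the result readable by ’read’, since the lexer accepts unescaped characters inside string literals; whether the terminal can display them is a separate question.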

Manuel

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Unicode, strings, and Show

2016-03-30 Thread Brandon Allbery
On Wed, Mar 30, 2016 at 11:03 PM, Carter Schonwald <
carter.schonw...@gmail.com> wrote:

> One point in the design space that the Swift language occupies, which seems
> interesting at least to me, is to have the notion of a character be backed
> by a Unicode grapheme cluster, which is a character-like sequence of
> Unicode code points.  Would library support for this at all help this
> discussion or problem?


That's also Perl 6's solution. But in this case it would not help because
it's still living in Unicode space and not the I/O locale that is the
destination for the character.

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad   http://sinenomine.net


Re: Unicode, strings, and Show

2016-03-30 Thread Carter Schonwald
One point in the design space that the Swift language occupies, which seems
interesting at least to me, is to have the notion of a character be backed
by a Unicode grapheme cluster, which is a character-like sequence of
Unicode code points.  Would library support for this at all help this
discussion or problem?
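
To make the code point versus grapheme cluster distinction concrete, here is a small sketch using only the Prelude. The names composed and decomposed are illustrative; both strings display as a single "é", but Haskell's Char counts code points.

```haskell
-- "é" as one precomposed code point (U+00E9, decimal 233) ...
composed :: String
composed = "\233"

-- ... and as 'e' followed by COMBINING ACUTE ACCENT (U+0301, decimal 769).
decomposed :: String
decomposed = "e\769"

main :: IO ()
main = do
  print (length composed, length decomposed)  -- (1,2)
  print (composed == decomposed)              -- False
  print decomposed                            -- "e\769"
```

A grapheme-cluster based character type would treat both as one character, but the escaping done by Show would still be a separate matter.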

On Wednesday, March 30, 2016, Brandon Allbery  wrote:

> On Wed, Mar 30, 2016 at 9:50 PM, Manuel M T Chakravarty <
> c...@justtesting.org
> > wrote:
>
>> Firstly, we have
>>
>>   isPrint :: Char -> Bool
>>
>> Are you saying that this type is wrong?
>>
>> Secondly, how often do you feed the output of ’show’ to ’read’ in another
>> locale versus how often is everybody whose whole life is outside of ASCII
>> (i.e., not anglo-centric people) bothered by this shortcoming? (*)
>>
>> Moreover, the argument on the ticket was that changing the current
>> implementation would go against the standard. Now that I am saying, the
>> current implementation is not conforming to the standard, the standard
>> suddenly doesn’t seem to matter. Personally, I would say, when we wrote
>> that standard, we knew what we were doing.
>>
>
> The standard I am aware of is the Report, which deliberately limited the
> output to the subset which is guaranteed to be usable in all locales. show
> conforms to this; apparently people want it to *not* conform, and in a way
> which requires some locale to become the One True Locale.
>
> isPrint is, as per the language Report, based on what Char is --- which is
> Unicode codepoints. Using it for output --- or for input, for that matter
> --- gets you into locale issues because nobody anywhere guarantees that
> Unicode codepoints that pass isPrint are representable in every locale.
> isPrint is not the place to verify that a character can actually be
> displayed in the current locale.
>
> Or have you decided that ghc should require Unicode locales and nothing
> but Unicode locales from now on? If so, what do you do when the next issue
> comes up, where Unix is UTF8 and Windows is UTF16?
>
> --
> brandon s allbery kf8nh   sine nomine associates
> allber...@gmail.com  ballb...@sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad   http://sinenomine.net
>


Re: Unicode, strings, and Show

2016-03-30 Thread Brandon Allbery
On Wed, Mar 30, 2016 at 9:50 PM, Manuel M T Chakravarty <
c...@justtesting.org> wrote:

> Firstly, we have
>
>   isPrint :: Char -> Bool
>
> Are you saying that this type is wrong?
>
> Secondly, how often do you feed the output of ’show’ to ’read’ in another
> locale versus how often is everybody whose whole life is outside of ASCII
> (i.e., not anglo-centric people) bothered by this shortcoming? (*)
>
> Moreover, the argument on the ticket was that changing the current
> implementation would go against the standard. Now that I am saying, the
> current implementation is not conforming to the standard, the standard
> suddenly doesn’t seem to matter. Personally, I would say, when we wrote
> that standard, we knew what we were doing.
>

The standard I am aware of is the Report, which deliberately limited the
output to the subset which is guaranteed to be usable in all locales. show
conforms to this; apparently people want it to *not* conform, and in a way
which requires some locale to become the One True Locale.

isPrint is, as per the language Report, based on what Char is --- which is
Unicode codepoints. Using it for output --- or for input, for that matter
--- gets you into locale issues because nobody anywhere guarantees that
Unicode codepoints that pass isPrint are representable in every locale.
isPrint is not the place to verify that a character can actually be
displayed in the current locale.

Or have you decided that ghc should require Unicode locales and nothing but
Unicode locales from now on? If so, what do you do when the next issue
comes up, where Unix is UTF8 and Windows is UTF16?

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad   http://sinenomine.net


Re: Unicode, strings, and Show

2016-03-30 Thread Manuel M T Chakravarty
Firstly, we have

  isPrint :: Char -> Bool

Are you saying that this type is wrong?

Secondly, how often do you feed the output of ’show’ to ’read’ in another 
locale versus how often is everybody whose whole life is outside of ASCII 
(i.e., not anglo-centric people) bothered by this shortcoming? (*)

Moreover, the argument on the ticket was that changing the current 
implementation would go against the standard. Now that I am saying, the current 
implementation is not conforming to the standard, the standard suddenly doesn’t 
seem to matter. Personally, I would say, when we wrote that standard, we knew 
what we were doing.

Manuel

(*) BTW, (read . show) is a pretty bad serialisation story anyway.

> Brandon Allbery:
> 
> On Wed, Mar 30, 2016 at 9:16 PM, Manuel M T Chakravarty wrote:
>> Thank you for all the replies and especially pointing to this ticket.
>> 
>> I think, the discussion on this ticket is actually misleading and there is a
>> simple solution, which I added as a comment.
> 
> That is in fact not simple at all: with that, the ostensibly pure `show` now 
> depends on the user's locale and therefore should be in IO (and you cannot 
> reliably feed it to `read` in a program running in a different locale)! This 
> is why the ticket was concentrating on ghci, where it's at least somewhat 
> reasonable to assume a UTF8 environment.
> 
> -- 
> brandon s allbery kf8nh   sine nomine associates
> allber...@gmail.com  ballb...@sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad   http://sinenomine.net


Re: Unicode, strings, and Show

2016-03-30 Thread Brandon Allbery
On Wed, Mar 30, 2016 at 9:16 PM, Manuel M T Chakravarty <
c...@justtesting.org> wrote:

> Thank you for all the replies and especially pointing to this ticket.
>
> I think, the discussion on this ticket is actually misleading and there is
> a simple solution, which I added as a comment.
>

That is in fact not simple at all: with that, the ostensibly pure `show`
now depends on the user's locale and therefore should be in IO (and you
cannot reliably feed it to `read` in a program running in a different
locale)! This is why the ticket was concentrating on ghci, where it's at
least somewhat reasonable to assume a UTF8 environment.
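
A rough sketch of the point about purity, under stated assumptions: getLocaleEncoding is the real function from GHC.IO.Encoding, while renderFor, quoteRaw and localeRendering are hypothetical names. Treating only an encoding named "UTF-8" as Unicode-capable, and leaving control characters unescaped in quoteRaw, are deliberate simplifications.

```haskell
import GHC.IO.Encoding (getLocaleEncoding)

-- Hypothetical: wrap a string in quotes, escaping only '"' and '\\'.
-- Simplification: control characters are left as they are.
quoteRaw :: String -> String
quoteRaw s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc c    = [c]

-- Hypothetical: pick a rendering from an encoding name.
-- Assumption: only "UTF-8" can represent every code point.
renderFor :: String -> String -> String
renderFor encName s
  | encName == "UTF-8" = quoteRaw s
  | otherwise          = show s

-- A locale-aware rendering has to ask the environment, so it lives in IO,
-- unlike `show :: String -> String`.
localeRendering :: String -> IO String
localeRendering s = do
  enc <- getLocaleEncoding
  pure (renderFor (show enc) s)

main :: IO ()
main = do
  putStrLn (show "文字")                -- identical in every locale
  putStrLn =<< localeRendering "文字"   -- depends on where the program runs
```

The second line of output can differ between two machines running the same program, which is exactly why its result cannot be fed reliably to `read` elsewhere.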

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad   http://sinenomine.net


Re: Unicode, strings, and Show

2016-03-30 Thread Manuel M T Chakravarty
Thank you for all the replies and especially pointing to this ticket.

I think, the discussion on this ticket is actually misleading and there is a 
simple solution, which I added as a comment.

Manuel

> Thomas Miedema:
> 
>> It would be great if someone could create a Trac ticket
> 
> Existing ticket: https://ghc.haskell.org/trac/ghc/ticket/11529 ("Show
> instance of Char should print literals for non-ascii printable charcters")



Re: Unicode, strings, and Show

2016-03-30 Thread Ben Gamari
Thomas Miedema  writes:

>> It would be great if someone could create a Trac ticket
>
>
> Existing ticket: https://ghc.haskell.org/trac/ghc/ticket/11529 ("Show
> instance of Char should print literals for non-ascii printable charcters")

Thanks Thomas! I've added a reference to this thread on the ticket.

Cheers,

- Ben




Re: Unicode, strings, and Show

2016-03-30 Thread Thomas Miedema
> It would be great if someone could create a Trac ticket


Existing ticket: https://ghc.haskell.org/trac/ghc/ticket/11529 ("Show
instance of Char should print literals for non-ascii printable charcters")


Re: Unicode, strings, and Show

2016-03-30 Thread Ben Gamari
Evan Laforge  writes:

> There was recently a discussion about it, search for subject "Can we
> improve Show instance for non-ascii charcters?"
>
> You can read for yourself but my impression was that people were
> generally favorable, but had some backward compatibility worries, and
> came up with some workarounds, but no one committed to following up on
> a ghci patch.
>
It would be great if someone could create a Trac ticket so we had
someplace persistent to track this discussion. Manuel, perhaps you could
handle this?

Cheers,

- Ben





Re: Unicode, strings, and Show

2016-03-29 Thread Evan Laforge
There was recently a discussion about it, search for subject "Can we
improve Show instance for non-ascii charcters?"

You can read for yourself but my impression was that people were
generally favorable, but had some backward compatibility worries, and
came up with some workarounds, but no one committed to following up on
a ghci patch.

On Tue, Mar 29, 2016 at 7:26 PM, Manuel M T Chakravarty
 wrote:
> Why are we doing this?
>
>   GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
>   Prelude> "文字"
>   "\25991\23383"
>   Prelude>
>
> After all, we don’t print ’a’ as ’\97’.
>
> Manuel


Re: Unicode, strings, and Show

2016-03-29 Thread Manuel Gómez
On Tue, Mar 29, 2016 at 9:56 PM, Manuel M T Chakravarty
 wrote:
> Why are we doing this?
>
>   GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
>   Prelude> "文字"
>   "\25991\23383"
>   Prelude>
>
> After all, we don’t print ’a’ as ’\97’.
>
> Manuel

Indeed:

• 2016: https://mail.haskell.org/pipermail/haskell-cafe/2016-February/122874.html
• 2012: http://stackoverflow.com/questions/14039726/how-to-make-haskell-or-ghci-able-to-show-chinese-characters-and-run-chinese-char
• 2012 again: https://mail.haskell.org/pipermail/haskell-cafe/2012-July/102569.html
• 2011: http://stackoverflow.com/questions/5535512/how-to-hack-ghci-or-hugs-so-that-it-prints-unicode-chars-unescaped
• 2010: https://mail.haskell.org/pipermail/haskell-cafe/2010-August/082823.html

This is a constant source of pain and should be relatively easy to fix.

Manuel


Unicode, strings, and Show

2016-03-29 Thread Manuel M T Chakravarty
Why are we doing this?

  GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
  Prelude> "文字"
  "\25991\23383"
  Prelude> 

After all, we don’t print ’a’ as ’\97’.
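
The same behaviour can be reproduced outside GHCi with a short program. This is only a sketch assuming a GHC whose Show instances behave as in the session above; the names escaped and codePoints are illustrative.

```haskell
import Data.Char (ord)

-- What Show produces for a non-ASCII string: decimal escapes, not glyphs.
escaped :: String
escaped = show "文字"

-- The decimal code points behind those escapes.
codePoints :: [Int]
codePoints = map ord "文字"

main :: IO ()
main = do
  putStrLn escaped      -- "\25991\23383", as in the GHCi session
  print codePoints      -- [25991,23383]
  putStrLn (show 'a')   -- 'a', not '\97'
```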

Manuel
