Re: Reducing the need for CPP (was: Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`)

2015-10-06 Thread Johan Tibell
It might be enough to just add a NOWARN  pragma that acts on
a single line/expression. I've seen it in both C++ and Python linters and
it works reasonably well and it's quite general.

On Tue, Oct 6, 2015 at 10:44 AM, Ben Gamari  wrote:

> Sven Panne  writes:
>
> > 2015-10-05 17:09 GMT+02:00 Gershom B :
> >
> >> On October 5, 2015 at 10:59:35 AM, Bryan O'Sullivan (b...@serpentine.com)
> >> wrote:
> >> [...] As for libraries, it has been pointed out, I believe, that without
> >> CPP one can write instances compatible with AMP, and also with AMP + MRP.
> >> One can also write code, sans CPP, compatible with pre- and post- AMP.
> >> [...]
> >>
> >
> > Nope, at least not if you care about -Wall: If you take e.g. (<$>) which is
> > now part of the Prelude, you can't simply import some compatibility module,
> > because GHC might tell you (rightfully) that that import is redundant,
> > because (<$>) is already visible through the Prelude. So you'll have to use
> > CPP to avoid that import on base >= 4.8, be it from Data.Functor,
> > Control.Applicative or some compat-* module. And you'll have to use CPP in
> > each and every module using <$> then, unless I miss something obvious.
> > AFAICT all transitioning guides ignore -Wall and friends...
> >
> This is a fair point that comes up fairly often. The fact that CPP is
> required to silence redundant import warnings is quite unfortunate.
> Other languages have better stories in this area. One example is Rust,
> which has a quite flexible `#[allow(...)]` pragma which can be used to
> acknowledge and silence a wide variety of warnings and lints [1].
>
> I can think of a few ways (some better than others) how we might
> introduce a similar idea for import redundancy checks in Haskell,
>
>  1. Attach a `{-# ALLOW redundant_import #-}` pragma to a definition,
>
>         -- in Control.Applicative
>         {-# ALLOW redundant_import (<$>) #-}
>         (<$>) :: Functor f => (a -> b) -> f a -> f b
>         (<$>) = fmap
>
>     asking the compiler to pretend that any import of the symbol did not
>     exist when looking for redundant imports. This would allow library
>     authors to appropriately mark definitions when they are moved,
>     saving downstream users from having to make any change whatsoever.
>
>  2. Or alternatively we could make this idea a bit more precise,
>
>         -- in Control.Applicative
>         {-# ALLOW redundant_import Prelude.(<$>) #-}
>         (<$>) :: Functor f => (a -> b) -> f a -> f b
>         (<$>) = fmap
>
>     Which would ignore imports of `Control.Applicative.(<$>)` only if
>     `Prelude.(<$>)` were also in scope.
>
>  3. Attach a `{-# ALLOW redundant_import #-}` pragma to an import,
>
>         import {-# ALLOW redundant_import #-} Control.Applicative
>
>         -- or perhaps
>         import Control.Applicative
>         {-# ALLOW redundant_import Control.Applicative #-}
>
>     allowing the user to explicitly state that they are aware that this
>     import may be redundant.
>
>  4. Attach a `{-# ALLOW redundant_import #-}` pragma to a name in an
>     import list,
>
>         import Control.Applicative ((<$>) {-# ALLOW redundant_import #-})
>
>     allowing the user to explicitly state that they are aware that this
>     imported function may be redundant.
>
> In general I'd like to reiterate that many of the comments in this
> thread describe genuine sharp edges in our language which have presented
> a real cost in developer time during the AMP and FTP transitions. I
> think it is worth thinking of ways to soften these edges; we may be
> surprised how easy it is to fix some of them.
>
> - Ben
>
>
> [1] https://doc.rust-lang.org/stable/reference.html#lint-check-attributes
>
> ___
> Libraries mailing list
> librar...@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
>
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-prime


Re: MRP, 3-year-support-window, and the non-requirement of CPP (was: [Haskell-cafe] Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`)

2015-10-06 Thread Johan Tibell
(Resending with smaller recipient list to avoid getting stuck in the
moderator queue.)

On Tue, Oct 6, 2015 at 9:10 AM, Herbert Valerio Riedel  wrote:

> On 2015-10-05 at 21:01:16 +0200, Johan Tibell wrote:
> > On the libraries I maintain and have a copy of on my computer right now: 329
>
>
> Although this was already pointed out to you in a response to a Tweet of
> yours, I'd like to expand on this here to clarify:
>
>
> You say that you stick to the 3-major-ghc-release support-window
> convention for your libraries. This is good, because then you don't need
> any CPP at all! Here's why:
>
> [...]
>

So what do I have to write today so that my Monad instances:

 * Are warning free - Warnings are useful. Turning them off or having spurious
   warnings both contribute to bugs.
 * Use imports that are either qualified or have explicit import lists -
   Unqualified imports make code more likely to break when dependencies add
   exports.
 * Don't use CPP.

Neither AMP nor MRP includes a recipe for this in its proposal. AMP got
one post-facto on the Wiki. It turns out that the workaround there didn't
work (we tried it in Cabal and it conflicted with one of the above
requirements.)
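
As a point of reference, here is a minimal sketch of the kind of instance
being discussed. The `State` type and the `tick` helper are made up for
illustration, they are not code from any of my libraries. The sketch compiles
as-is on base >= 4.8. On older versions of base the Prelude does not export
Applicative, so the commented-out import would be needed as well, and that
import is the one -Wall then reports as redundant on newer versions.

```haskell
import Control.Monad (ap, liftM)

-- Needed only on base < 4.8, where the Prelude does not export Applicative:
-- import Control.Applicative (Applicative (..))

-- Hypothetical example type: a plain state monad.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap = liftM

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  (<*>) = ap

instance Monad (State s) where
  return = pure
  State m >>= k = State $ \s ->
    let (a, s') = m s
    in runState (k a) s'

-- Hypothetical helper: return the current counter and increment it.
tick :: State Int Int
tick = State (\n -> (n, n + 1))
```

With this loaded in GHCi, `runState (tick >> tick) 0` evaluates to `(1, 2)`.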

> PS: I'm a bit disappointed you seem to dismiss this proposal right away
> categorically without giving us a chance to address your
> concerns. The proposal is not a rigid all-or-nothing thing that
> can't be tweaked and revised.  That's why we're having these
> proposal-discussions in the first place (rather than doing blind
> +1/-1 polls), so we can hear everyone out and try to maximise the
> agreement (even if we will never reach 100% consensus on any
> proposal).
>
> So please, keep on discussing!
>

The problem with these discussions is that they are held between two groups
with quite a difference in experience. On one hand you have people like Bryan,
who have made considerable contributions to the Haskell ecosystem and have much
experience in large scale software development (e.g. from Facebook). On the
other hand you have people who don't. That's okay. We've all been in the
latter group at some point in our careers.

What's frustrating is that people don't take a step back and realize that
they might be in the latter group and should perhaps listen to those in the
former. This doesn't happen; instead we get lots of "C++ and Java are so bad
and we don't want to be like them." Haskell is not at risk of becoming C++
or Java (which are a large improvement compared to the languages that came
before them). We're at risk of missing our window of opportunity. I think
that would be a shame, as I think Haskell is a step forward compared to
those languages and I would like to see more of the software that is used be
written in Haskell.

We've been through this many times before on the libraries list. I'm not
going to win an argument on this mailing list. Between maintaining
libraries you all use and managing a largish team at Google, I don't have
much time for a discussion which approaches a hundred emails and is won by
virtue of having lots of time to write emails.

-- Johan


Re: [Haskell-cafe] Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`

2015-10-05 Thread Johan Tibell
On Mon, Oct 5, 2015 at 9:02 PM, Erik Hesselink  wrote:

> On 5 October 2015 at 20:58, Sven Panne  wrote:
> > 2015-10-05 17:09 GMT+02:00 Gershom B :
> >>
> >> [...] As for libraries, it has been pointed out, I believe, that without
> >> CPP one can write instances compatible with AMP, and also with AMP + MRP.
> >> One can also write code, sans CPP, compatible with pre- and post- AMP.
> >> [...]
> >
> > Nope, at least not if you care about -Wall: If you take e.g. (<$>) which is
> > now part of the Prelude, you can't simply import some compatibility module,
> > because GHC might tell you (rightfully) that that import is redundant,
> > because (<$>) is already visible through the Prelude. So you'll have to use
> > CPP to avoid that import on base >= 4.8, be it from Data.Functor,
> > Control.Applicative or some compat-* module. And you'll have to use CPP in
> > each and every module using <$> then, unless I miss something obvious.
> > AFAICT all transitioning guides ignore -Wall and friends...
>
> Does the hack mentioned on the GHC trac [1] work for this? It seems a
> bit fragile but that page says it works and it avoids CPP.
>

No it doesn't, if you also want explicit import lists (which you
should).
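
For context, a sketch of the two styles as I understand them. The `double`
function is made up for illustration. The live code uses the CPP guard Sven
describes, with an explicit import list. The commented-out lines show the
import-ordering trick, which, as far as I know, depends on an open import of
Control.Applicative followed by an explicit import of the Prelude.
`MIN_VERSION_base` is normally defined by Cabal, so this assumes the module is
built as part of a Cabal package (or with a compiler that defines the macro
itself).

```haskell
{-# LANGUAGE CPP #-}

-- Style A: guard the explicit import with CPP so that it is only present
-- where the Prelude does not already export (<$>).
#if !MIN_VERSION_base(4,8,0)
import Control.Applicative ((<$>))
#endif

-- Style B (no CPP, but no explicit import list either):
--
--   import Control.Applicative
--   import Prelude

-- Hypothetical function that needs (<$>) in scope.
double :: Maybe Int -> Maybe Int
double m = (* 2) <$> m
```

Style B cannot be combined with an import list such as `((<$>))`, which is
the conflict with the requirements I listed.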


Re: [Haskell-cafe] Monad of no `return` Proposal (MRP): Moving `return` out of `Monad`

2015-10-05 Thread Johan Tibell
On Mon, Oct 5, 2015 at 8:34 PM, Gregory Collins 
wrote:

>
> On Mon, Oct 5, 2015 at 8:09 AM, Gershom B  wrote:
>
>> My understanding of the argument here, which seems to make sense to me,
>> is that the AMP already introduced a significant breaking change with
>> regards to monads. Books and lecture notes have already not caught up to
>> this, by and large. Hence, by introducing a further change, which
>> _completes_ the general AMP project, then by the time books and lecture
>> notes are all updated, they will be able to tell a much nicer story than
>> the current one?
>
>
> This is a multi-year, "boil the ocean"-style project, affecting literally
> every Haskell user, and the end result after all of this labor is going to
> be... a slightly spiffier bike shed?
>
> Strongly -1 from me also. My experience over the last couple of years is
> that every GHC release breaks my libraries in annoying ways that require
> CPP to fix:
>
> ~/personal/src/snap λ  find . -name '*.hs' | xargs egrep
> '#if.*(MIN_VERSION)|(GLASGOW_HASKELL)' | wc -l
> 64
>
>
> As a user this is another bikeshedding change that is not going to benefit
> me at all. Maintaining a Haskell library can be an exasperating exercise of
> running on a treadmill to keep up with busywork caused by changes to the
> core language and libraries. My feeling is starting to become that the
> libraries committee is doing as much (if not more) to *cause* problems
> and work for me than it is doing to improve the common infrastructure.
>

On the libraries I maintain and have a copy of on my computer right now: 329


Re: Remove Enum from Float and Double

2013-06-11 Thread Johan Tibell
On Tue, Jun 11, 2013 at 3:00 PM, Roman Cheplyaka  wrote:

> Does such thing as a deprecation pragma for an instance exist?
> What triggers it?
>

I don't know. We'll need one if we're going to deprecate core instances.
Just deleting them is not an option (as it gives users with large code
bases no time to migrate).


Re: Remove Enum from Float and Double

2013-06-11 Thread Johan Tibell
If we truly believe that the instance is dangerous for users (and not
merely for people who don't understand floating point arithmetic on
computers), then we should add a deprecation pragma to the instance and
discourage its use. But what would the deprecation message encourage
instead, for users to write an explicit loop that tests against some
lower/upper bound? It would have the same problem as enumFromTo. I think
the issue here is really that floating point math on computers is hard to
think about.
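
To illustrate the behaviour behind those questions, a small sketch.
`evenlySpaced` is a hypothetical helper, not something from base. It computes
each element from an integer index instead of testing a floating point value
against the upper bound. Whether that is what a deprecation message should
recommend is a separate question.

```haskell
-- The Enum instance for Double accepts elements up to half a step past the
-- upper bound, so this list is [1.0,2.0,3.0].
overshoot :: [Double]
overshoot = [1 .. 2.5]

-- Here the default step of 1 takes the second element past the bound:
-- the list has two elements, 0.1 and (roughly) 1.1.
pastTheEnd :: [Double]
pastTheEnd = [0.1 .. 1.0]

-- Hypothetical helper: n equal intervals from lo to hi (n must be positive).
-- The number of elements is fixed by the integer range [0 .. n].
evenlySpaced :: Int -> Double -> Double -> [Double]
evenlySpaced n lo hi =
  [ lo + (hi - lo) * fromIntegral i / fromIntegral n | i <- [0 .. n] ]
```

For example, `evenlySpaced 10 0 1` always has 11 elements and ends at exactly
1.0.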


On Tue, Jun 11, 2013 at 11:18 AM, harry  wrote:

> Johan Tibell  writes:
>
> > I don't see much gain. It will break previously working code and the
> > workaround to the breakage will likely be manually reimplementing
> > enumFromTo in each instance.
>
> I forgot the main point of my post :-)
>
> The primary motivation for removing these instances is that they cause
> endless confusion for beginners, e.g.
>
> http://stackoverflow.com/questions/13203471/the-math-behind-1-0999-in-haskell,
> http://stackoverflow.com/questions/9810002/floating-point-list-generator,
> http://stackoverflow.com/questions/7290438/haskell-ranges-and-floats,
> http://stackoverflow.com/questions/10328435/how-to-solve-floating-point-number-getting-wrong-in-list-haskell,
> and many more.
>
> On the other hand, how much working code is there "correctly" using these
> instances?
>
>


Re: Remove Enum from Float and Double

2013-06-11 Thread Johan Tibell
Hi Harry,

On Tue, Jun 11, 2013 at 3:51 AM, harry  wrote:

> There have been several discussions over the years regarding Enum instances
> for Float and Double. The conclusion each time appears to have been that
> there are no instances which are both sane and practical.
>

Do you have a link to some of those discussions? I have a vague memory of
them but can no longer remember the specifics.


> I would like to propose that these instances be removed from Haskell 2014.
> This may be covered by the various alternative prelude and number class
> proposals, but they are much more ambitious and less likely to make it into
> the standard in the short term.
>

I don't see much gain. It will break previously working code and the
workaround to the breakage will likely be manually reimplementing
enumFromTo in each instance.

Cheers,
Johan


Re: Proposal: NoImplicitPreludeImport

2013-05-28 Thread Johan Tibell
On Tue, May 28, 2013 at 8:23 AM, Ian Lynagh  wrote:

>
> Dear Haskellers,
>
> I have made a wiki page describing a new proposal,
> NoImplicitPreludeImport, which I intend to propose for Haskell 2014:
>
> http://hackage.haskell.org/trac/haskell-prime/wiki/NoImplicitPreludeImport
>
> What do you think?


-1 for me.

Breaking every single Haskell module for some namespace reorganization
doesn't seem worth it. I don't think alternative Preludes (one of the
justifications) are a good idea to begin with, as programmers will have to
first understand which particular version of e.g. map this module uses,
instead of knowing it's the same one as every other module uses.

Changes like this will likely cause years' worth of pain, e.g. see the Python
2/Python 3 failure.

The likely practical result of this is that every module will now read:

module M where

#if MIN_VERSION_base(x,y,z)
import Prelude
#else
import Data.Num
import Control.Monad
...
#endif

for the next 3 years or so.

-- Johan


Re: proposal for trailing comma and semicolon

2013-05-17 Thread Johan Tibell
On Fri, May 17, 2013 at 9:17 AM, Garrett Mitchener <
garrett.mitche...@gmail.com> wrote:

> Anyway, this is a "paper cut" in the language that has been bugging me for
> a while, and since there's now a call for suggestions for Haskell 2014, I
> thought I'd ask about it.
>

I've also thought about this issue and I agree with Garrett, allowing that
trailing comma (or semicolon) would help readability*. If it doesn't work
with tuples, perhaps we could at least do it with lists and records?

* The current syntax also hurts source control diffs a bit, as appending an
element means adding a comma to the previous line, so the diff suggests that
one additional line was changed.

-- Johan


Re: Bang patterns

2013-02-04 Thread Johan Tibell
On Sun, Feb 3, 2013 at 4:44 PM, Ben Millwood  wrote:
> I have two proposals, I suppose:
> - make bang patterns in let altogether invalid

I would prefer it to be valid. It's the most syntactically lightweight
option we have for forcing some thunks before using the resulting values
in a constructor. Example:

    let !x = ...
        !y = ...
    in C x y

The alternative would be

    let x = ...
        y = ...
    in x `seq` y `seq` C x y

which obscures the code much more.
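
A self-contained sketch of the comparison, for anyone who wants to try it.
The `Pair` type, the three constructors functions and the `throwsOnWhnf`
helper are made up for illustration. Passing `undefined` as an argument shows
which versions force the fields before the constructor is returned.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Exception (SomeException, evaluate, try)

-- Hypothetical constructor with two lazy fields.
data Pair = Pair Int Int

-- Bang patterns in let: both bindings are forced before Pair is built.
strictPair :: Int -> Int -> Pair
strictPair a b =
  let !x = a + 1
      !y = b * 2
  in Pair x y

-- The same strictness spelled out with seq.
seqPair :: Int -> Int -> Pair
seqPair a b =
  let x = a + 1
      y = b * 2
  in x `seq` y `seq` Pair x y

-- No forcing: the fields stay as thunks.
lazyPair :: Int -> Int -> Pair
lazyPair a b =
  let x = a + 1
      y = b * 2
  in Pair x y

-- True if evaluating the pair to weak head normal form throws an exception.
throwsOnWhnf :: Pair -> IO Bool
throwsOnWhnf p = do
  r <- try (evaluate p) :: IO (Either SomeException Pair)
  pure (either (const True) (const False) r)
```

`throwsOnWhnf (strictPair undefined 1)` and `throwsOnWhnf (seqPair undefined 1)`
both return True, while `throwsOnWhnf (lazyPair undefined 1)` returns False.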

My 2 cents.

-- Johan



Re: [Haskell] Status of Haskell'?

2012-11-30 Thread Johan Tibell
On Fri, Nov 30, 2012 at 1:42 PM, Jason Dusek  wrote:
> It would be nice for there to be a new standard so that many
> features in GHC -- such as overloaded strings, rank n types,
> MPTCs, &c. -- were enabled by default without any pragmas.

I think this is one of these nice gains for day-to-day Haskell
programming. Less typing and fewer things to explain to beginners
("You see, this might seem like an experimental feature I'm asking you
to use, but it really isn't.")
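
For readers who haven't met them, a small sketch of what the pragmas in
question look like today. All the names below (`Name`, `measureBoth`,
`Convert`) are made up for illustration.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RankNTypes #-}

import Data.String (IsString (..))

-- OverloadedStrings: a string literal can denote any IsString instance.
newtype Name = Name String deriving (Eq, Show)

instance IsString Name where
  fromString = Name

defaultName :: Name
defaultName = "anonymous"

-- RankNTypes: the function argument must itself be polymorphic, so it can
-- be applied to both a list of Ints and a String.
measureBoth :: (forall a. [a] -> Int) -> [Int] -> String -> (Int, Int)
measureBoth f xs s = (f xs, f s)

-- MultiParamTypeClasses: a class that relates two types.
class Convert a b where
  convert :: a -> b

instance Convert Int Integer where
  convert = toInteger
```

Each of the three pragmas at the top is one more thing to type and one more
thing to explain.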

-- Johan



Re: String != [Char]

2012-03-26 Thread Johan Tibell
On Mon, Mar 26, 2012 at 10:12 AM, Ian Lynagh  wrote:
> I am very unicode-ignorant, so apologies if I have misunderstood
> something, but doesn't Text do the same thing?
>
> Prelude T> import Data.Text.IO as T
> Prelude T T> T.putStrLn (T.take 5 (T.pack "Fro\x0308hßen"))
> Fröh
>
> Maybe your point is that neither "take" function should be used with
> unicode strings, but I don't see how advocating the Text type is going
> to help with that.

We already covered this. Text inherited a list-based API, even if that
sometimes doesn't make sense.

To work with Unicode you need more specific functions for different
tasks. Text only implements a few so far, like case conversion and
case-less comparison, and asks you to use text-icu for the rest.

-- Johan



Re: String != [Char]

2012-03-26 Thread Johan Tibell
On Mon, Mar 26, 2012 at 9:59 AM, Henrik Nilsson  wrote:
> So, is the argument to deprecate Char, then? As long as Haskell
> allows Chars to be handled in isolation, it would seem impossible
> to prevent naive users from accidentally stumbling over the
> complexities of Unicode?

I haven't proposed anything at all. Someone asked why one should
prefer Text to String. I showed that the former is more correct (given
the currently available APIs) and much faster.

> There are canonical equivalence and compatibility, and each
> has two normal forms (fully composed and fully decomposed),
> and "each of these four normal forms can be used in text processing".
>
> As an example of the difference between "equivalent" and "compatible",
> the ligature "ff" is "compatible - but not canonically equivalent"
> to a sequence of two characters latin "f", meaning they "may be treated the
> same way in some applications (such as sorting and indexing), but not in
> others; and may be substituted for each other in some situations, but not in
> others".
>
> Is it realistic to think that if only Haskell used Text and not
> String = [Char], a naive user/beginner would be able to write
> correct code for all manner of text processing tasks without
> needing to understand a great deal about Unicode?
>
> I'm sorry, but I'm rather sceptical.

Why? We can hide most of these details behind the Text API. We can
pick which encoding and normal form is used internally and then have
the externally provided API for e.g. sorting do the right thing.

> So I reiterate that I see little if any gain, be it in terms of making
> life simpler for beginners, making Haskell more "multi cultural", or
> giving Haskell applications in general a performance boost, in
> deprecating String = [Char] and mandating the use of Text.
> But the costs would be massive.

I agree and thus I don't propose we do something like that.

The way this will go down is that part of the Prelude and other base
modules will eventually be replaced by more modern packages (e.g. see
system-fileio) and the use of String will decline. Unfortunately it's
a bit of a painful transition as today we need to convert back and
forth between the two string types quite a lot.

-- Johan



Re: String != [Char]

2012-03-26 Thread Johan Tibell
On Mon, Mar 26, 2012 at 9:42 AM, Christian Siefkes
 wrote:
> On 03/26/2012 05:50 PM, Johan Tibell wrote:
>> Normalization isn't quite enough unfortunately, as it doesn't solve e.g.
>>
>>     upcase = map toUpper
>>
>> You need all-at-once functions on strings (which we could add.) I'm
>> just pointing out that most (all?) list functions do the wrong thing
>> when used on Strings.
>
> Hm, do you have any other examples besides toUpper/toLower?

length, cons, head, tail, filter, folds, anything that works on an
element-by-element basis.

> Also, that example is not really an argument against using list functions on
> strings (which, by any reasonable definition, seem to be "sequences of
> characters" -- whether that sequence is represented as a list, an array, or
> something else, seems more like an implementation detail to me).

I agree on the second part. As someone pointed out earlier, we should
be careful in using the word character as the Unicode code point
doesn't correspond well to the commonly used concept of a character.
What we have today is really:

type String = [CodePoint]

What you would normally think of as a character might consist of
several code points.

> Rather, it
> indicates the fact that Char.toUpper may have the wrong type. If its type was
> Char -> String instead of Char -> Char, it could handle things like toUpper
> 'ß' == "SS" correctly. Then stuff like
>
>        upcase = concatMap toUpper
>
> would work fine.

Yes.

> As it is, the problem seems to be with Char, not with [Char].

[Char] is a semantically OK representation of a Unicode string; using
an array like text does is simply an optimization. However, using the
list functions defined by the Prelude is not a good idea if you want to
process a Unicode string correctly.

-- Johan



Re: String != [Char]

2012-03-26 Thread Johan Tibell
On Mon, Mar 26, 2012 at 8:34 AM, Malcolm Wallace  wrote:
> Yes indeed.  And I think it would be perfectly reasonable for the String (= 
> [Char]) API to have a function "normalise :: String -> String" which would 
> let the user deal with this issue as they see fit.  After all, if you are 
> aware of the difference between combining characters and normalised 
> characters, then you will want to make your own decision about what semantics 
> you want from operations like "take".

Normalization isn't quite enough unfortunately, as it doesn't solve e.g.

    upcase = map toUpper

You need all-at-once functions on strings (which we could add.) I'm
just pointing out that most (all?) list functions do the wrong thing
when used on Strings.



Re: String != [Char]

2012-03-26 Thread Johan Tibell
On Mon, Mar 26, 2012 at 7:48 AM, Malcolm Wallace  wrote:
>> In the region of this side of the Atlantic Ocean where I teach, the
>> student population is very diverse
>
> Prelude> putStrLn (take 5 "Fröhßen")
> Fröhß

ghci> putStrLn "Fro\x0308hßen"
Fröhßen
ghci> putStrLn (take 5 "Fro\x0308hßen")
Fröh

Your example works because your input happens to be in a normal form.
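
The same comparison as a sketch that doesn't depend on the terminal's
encoding. The names are made up for illustration. Both strings render as the
same seven user-visible characters, but they differ in length as lists of
code points.

```haskell
-- "Fröhßen" with a precomposed ö (U+00F6). The \& escape is an empty string
-- that stops the hex escape \xDF from swallowing the following 'e'.
precomposed :: String
precomposed = "Fr\xF6h\xDF\&en"

-- The same text with ö written as 'o' followed by U+0308, the combining
-- diaeresis.
decomposed :: String
decomposed = "Fro\x308h\xDF\&en"

-- take counts code points, not user-visible characters.
firstFive :: String -> String
firstFive = take 5
```

`length precomposed` is 7 and `length decomposed` is 8, so `firstFive`
keeps the ß in the first string and drops it from the second.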

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 5:54 PM, Gabriel Dos Reis
 wrote:
> I think there is a confusion here.  A Unicode character is an abstract
> entity.  For it to exist in some concrete form in a program, you need
> an encoding.  The fact that char16_t is 16-bit wide is irrelevant to
> whether it can be used in a representation of a Unicode text, just like
> uint8_t (e.g. 'unsigned char') can be used to encode Unicode string
> despite it being only 8-bit wide.   You do not need to make the
> character type exactly equal to the type of the individual element
> in the text representation.

Well, if you have a type of 21 bits or more you can declare its value to be a
Unicode code point (which are numbered.) Using a char* that you claim
contains UTF-8 encoded data is bad for safety, as there is no guarantee
that that's indeed the case.

> Note also that an encoding itself (whether UTF-8, UTF-16, etc.) is
> insufficient as far as text processing goes; you also need a localization
> at the minimum.  It is the combination of the two that gives some meaning
> to text representation and operations.

text does that via ICU. Some operations would be possible without
using the locale, if it wasn't for those Turkish i:s. :/

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 4:42 PM, Gabriel Dos Reis
 wrote:
> Hmm, std::u16string, std::u32string, and std::wstring are C++ standard
> types to process Unicode texts.

Note that at least u16string's elements are too small to hold every Unicode
code point, and wstring's might be as well, as 16 bits is not enough to
number all of Unicode.
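
To make the size argument concrete, a sketch of how one code point maps to
16-bit units in UTF-16. `utf16Units` is a made-up name, and the function
doesn't reject lone surrogates, which a real encoder would have to handle.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical helper: the UTF-16 code units for a single code point.
-- Code points below U+10000 fit in one unit; the rest need a surrogate pair.
utf16Units :: Char -> [Word16]
utf16Units c
  | cp < 0x10000 = [fromIntegral cp]
  | otherwise =
      [ fromIntegral (0xD800 + (v `shiftR` 10)) -- high surrogate: top 10 bits
      , fromIntegral (0xDC00 + (v .&. 0x3FF))   -- low surrogate: bottom 10 bits
      ]
  where
    cp = ord c
    v = cp - 0x10000
```

`utf16Units 'A'` is `[0x41]`, while `utf16Units '\x1F600'` is
`[0xD83D, 0xDE00]`, so indexing by 16-bit element is not indexing by code
point.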

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 3:33 PM, Freddie Manners  wrote:
> To add my tuppence-worth on this, addressed to no-one in particular:
>
> (1) I think getting hung up on UTF-8 correctness is a distraction here.  I
> can't imagine anyone suggesting that the C/C++ standards removed support for
> (char*) because it wasn't UTF-8 correct: sure, you'd recommend people use a
> different type when it matters, but the language standard itself shouldn't
> be driven by technical issues that don't affect most people most of the
> time.  I'm sure it's good engineering practice to worry about these things,
> but the standard isn't there to encourage good engineering practice.

(I assume you mean Unicode correctness. UTF-8 is only one possible
encoding. Also I'm not arguing for removing type String = [Char], I'm
arguing that Text is better than String.)

C++'s char* is the moral equivalent of our ByteString, not Text. There's
no standardized C++ Unicode string type, ICU's UnicodeString is
perhaps the closest to one.

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 3:45 PM, Isaac Dupree
 wrote:
> How is Text for small strings currently (e.g. one English word, if not one
> character)?  Can we reasonably recommend it for that?
> This recent question suggests it's still not great:
> http://stackoverflow.com/questions/9398572/memory-efficient-strings-in-haskell

It's definitely not as good as it could be with the common case being
2 bytes per code point and then some fixed overhead.

The UTF-8 GSoC project last summer was an attempt to see if we could
do better, but unfortunately GHC does a worse job streaming out of a
byte array containing utf-8 than out of a byte array containing utf-16
(due to bad branch layout.)

This resulted in some performance gains and some performance losses.
As there are other engineering
benefits in favor of utf-16 (e.g. being able to use ICU efficiently)
we opted for not switching the encoding. If we can get GHC to the
point where it compiles a utf-8 based Text really well, we could
reconsider this decision.

There's also a design trade-off in Text that favors better asymptotic
complexity for some operations (e.g. taking substrings) that adds 2
words of overhead to every string.

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 2:31 PM, Brandon Allbery  wrote:
> I was under the impression they have been very carefully designed to do the
> right thing with characters represented by multiple codepoints, which is
> something the String version *cannot* do.  It would help if Bryan were
> involved with this discussion, though.  (I'm cc:ing him on this.)  Since the
> whole point of Data.Text is to handle stuff like this properly I would be
> surprised if your assertion that
>
>> >     upcase :: String -> String
>> >     upcase = map toUpper
>>
>> This is no more incorrect than
>>    upcase = Data.Text.map toUpper
>
>
> is correct.

This is simply not possible given the Unicode specification. There's
no code point that corresponds to the two characters used to represent
an upcased version of the eszett. I think the list based API predates
Bryan.

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
On Sat, Mar 24, 2012 at 1:16 PM, Ian Lynagh  wrote:
> Data.Text seems to think that many of them are worth reimplementing for
> Text. It looks like someone's systematically gone through Data.List.
> And in fact, very few functions there /don't/ look like they are
> directly equivalent to list functions.

I'm not sure why the list-inspired functions are there. It doesn't
really matter. It doesn't change the fact that from a Unicode
perspective they give the wrong result in most situations.

> This is no more incorrect than
>    upcase = Data.Text.map toUpper

No, and that's why Bryan added correct case modification, case
folding, etc. to text.

> There's no reason that there couldn't be a Data.String.toUpper
> corresponding to Data.Text.toUpper.

That's true. But this isn't the point we were discussing. We were
discussing whether the simplification of treating strings as a list is
a good thing (from an educational perspective.) I pointed out that
from a correctness perspective it's wrong.

> I think Heinrich meant 20% performance in a useful program, not a
> micro-benchmark.

I assume that's what he meant, but given that "useful program" isn't
defined, the 20% number is completely arbitrary.

-- Johan



Re: String != [Char]

2012-03-24 Thread Johan Tibell
Hi all,

On Sat, Mar 24, 2012 at 12:39 AM, Heinrich Apfelmus
 wrote:
> Which brings me to the fundamental question behind this proposal: Why do we
> need Text at all? What are its virtues and how do they compare? What is the
> trade-off? (I'm not familiar enough with the Text library to answer these.)
>
> To put it very pointedly: is a 20% performance increase on the current
> generation of computers worth the cost in terms of ease-of-use, when the
> performance can equally be gained by buying a faster computer or more RAM?
> I'm not sure whether I even agree with this statement, but this is the
> trade-off we are deciding on.

Correctness
===========

Using list-based operations on Strings is almost always wrong, as
soon as you move away from English text. You almost always have to
deal with Unicode strings as blobs, considering several code points at
once. For example,

    upcase :: String -> String
    upcase = map toUpper

is terse, beautiful, and wrong, as several languages map a single
lowercase character to two uppercase characters (as I'm sure you're
aware.)

Perhaps this is OK to ignore when teaching students Haskell, but it
really hurts those who want to use Haskell as an engineering language.
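
A sketch of the difference, assuming the text package is available. The two
function names are made up for illustration, and the String round trip in
`upcaseText` is only there so that the results can be compared directly.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Element by element: each Char maps to exactly one Char, so a character
-- whose uppercase form is two letters cannot be handled.
upcaseList :: String -> String
upcaseList = map toUpper

-- Whole string at once: text's toUpper may return a longer string.
upcaseText :: String -> String
upcaseText = T.unpack . T.toUpper . T.pack

-- "straße"; the \& escape stops \xDF from swallowing the following 'e'.
example :: String
example = "stra\xDF\&e"
```

`upcaseList example` leaves the ß untouched, while `upcaseText example` is
"STRASSE".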

Performance
===========

Depending on the benchmark, the difference can be much bigger than
20%. For example, here's a comparison of decoding UTF-8 byte data into
a String vs a Text value:

benchmarking Pure/decode/Text
mean: 50.22202 us, lb 50.08306 us, ub 50.37669 us, ci 0.950
std dev: 751.1139 ns, lb 666.2243 ns, ub 865.8246 ns, ci 0.950
variance introduced by outliers: 7.553%
variance is slightly inflated by outliers

benchmarking Pure/decode/String
mean: 188.0507 us, lb 187.4970 us, ub 188.6955 us, ci 0.950
std dev: 3.053076 us, lb 2.647318 us, ub 3.606262 us, ci 0.950
variance introduced by outliers: 9.407%
variance is slightly inflated by outliers

A difference of almost 4x.

Many of the Text vs String benchmarks measure the performance of
operations ignoring both decoding and encoding, while any real
application would have to do both.

On top of that, String is more or less as optimized as it can be;
benchmarks are almost completely memory bound. Text, on the other hand,
still has potential for (large) improvements, as GHC doesn't generate
optimal code for tight loops over arrays. For example, we know that
GHC generates bad code for decodeUtf8 as used by Text's stream fusion,
hurting any code that uses fusion.

Furthermore, the memory overhead of Text is smaller, which means that
applications that hold on to many string values will use less heap and
thus experience smaller "freezes" due to major GCs, which are
linear in the heap size.

Cheers,
Johan



Re: String != [Char]

2012-03-20 Thread Johan Tibell
On Tue, Mar 20, 2012 at 2:25 AM, Simon Marlow  wrote:
> Is there a reason not to put all these methods in the IsString class, with 
> appropriate default definitions?  You would need a UTF-8 encoder (& decoder) 
> of course, but it would reduce the burden on clients and improve backwards 
> compatibility.

That sounds fine to me. I'm leaning towards only having
unpackUTF8String (in addition to the existing method), as in the
absence of proper byte literals we would have literals which change
types, depending on which bytes they contain*. Ugh!

* Is it even possible to create non-UTF8 literals without using
escape sequences?

-- Johan



Re: String != [Char]

2012-03-19 Thread Johan Tibell
On Mon, Mar 19, 2012 at 2:55 PM, Daniel Peebles  wrote:
> If the input is specified to be UTF-8, wouldn't it be better to call the
> method unpackUTF8 or something like that?

Sure.

-- Johan



Re: String != [Char]

2012-03-19 Thread Johan Tibell
On Mon, Mar 19, 2012 at 9:02 AM, Christian Siefkes
 wrote:
> On 03/19/2012 04:53 PM, Johan Tibell wrote:
>> I've been thinking about this question as well. How about
>>
>> class IsString s where
>>     unpackCString :: Ptr Word8 -> CSize -> s
>
> What's the Ptr Word8 supposed to contain? A UTF-8 encoded string?

Yes.

We could make a distinction between byte and Unicode literals and have:

class IsBytes a where
    unpackBytes :: Ptr Word8 -> Int -> a

class IsText a where
    unpackText :: Ptr Word8 -> Int -> a

In the latter, the caller guarantees that the passed-in pointer points
to well-formed UTF-8 data.
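
As a rough sketch of how instances of these two classes might look, assuming the bytestring and text packages are available: the instances below, the use of unsafePerformIO, and the withLiteral helper (which only simulates the buffer a compiler would pass in for a literal) are illustrative assumptions, not part of the proposal.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray, withArrayLen)
import Foreign.Ptr (Ptr, castPtr)
import System.IO.Unsafe (unsafePerformIO)

class IsBytes a where
    unpackBytes :: Ptr Word8 -> Int -> a

class IsText a where
    unpackText :: Ptr Word8 -> Int -> a

-- Byte literals: copy the raw bytes out, no interpretation.
instance IsBytes [Word8] where
    unpackBytes ptr len = unsafePerformIO (peekArray len ptr)

-- Text literals: the caller promises well-formed UTF-8, so decoding
-- is expected to succeed (decodeUtf8 throws on malformed input).
instance IsText T.Text where
    unpackText ptr len =
        unsafePerformIO (TE.decodeUtf8 <$> B.packCStringLen (castPtr ptr, len))

-- Hypothetical helper standing in for a compiler-provided literal:
-- it exposes a temporary buffer and forces the result before the
-- buffer is freed.
withLiteral :: [Word8] -> (Ptr Word8 -> Int -> a) -> IO a
withLiteral bytes unpack = withArrayLen bytes $ \len ptr -> do
    let v = unpack ptr len
    v `seq` return v
```

A real literal would live in static memory, so the forcing done in withLiteral is only needed because the buffer here is temporary.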

-- Johan



Re: String != [Char]

2012-03-19 Thread Johan Tibell
On Mon, Mar 19, 2012 at 8:45 AM, Thomas Schilling
 wrote:
> Regarding the type class for converting to and from that type, there
> is a perhaps more complicated question: The current fromString method
> uses String as the source type which causes unnecessary overhead. This
> is unfortunate since GHC's built-in mechanism actually uses
> unpackCString[Utf8]# which constructs the inefficient String
> representation from a compact memory representation.  I think it would
> be best if the new fromString/fromText class allowed an efficient
> mechanism like that.  unpackCString# has type Addr# -> [Char] which is
> obviously GHC-specific.

I've been thinking about this question as well. How about

class IsString s where
    unpackCString :: Ptr Word8 -> CSize -> s

It's the moral equivalent of unpackCString#, but uses standard Haskell types.

-- Johan



Re: String != [Char]

2012-03-19 Thread Johan Tibell
Hi Greg,

There are a few blog posts on Bryan's blog. Here are two of them:


http://www.serpentine.com/blog/2009/10/09/announcing-a-major-revision-of-the-haskell-text-library/
http://www.serpentine.com/blog/2009/12/10/the-performance-of-data-text/

Unfortunately the blog seems partly broken. Images are missing and
some articles are missing altogether (i.e. the article is there but
the actual body text is gone).

-- Johan



Re: New libraries process

2011-05-26 Thread Johan Tibell
On Thu, May 26, 2011 at 9:37 AM, Simon Peyton-Jones
 wrote:
> Friends
>
> Thanks to those who responded to the message below, about improving the 
> process for developing the core Haskell libraries.
>        http://www.haskell.org/haskellwiki/Library_submissions/NewDraft
>
> Feedback has been broadly positive, with constructive suggestions that we've 
> incorporated in the text.  I suggest that we leave another week for debate 
> and refinement, and (unless there are some substantial new points) adopt the 
> new process from 9 June.
>
> I hope that's agreeable. (We don't have a process for modifying the process 
> :-)

Sounds good to me.

Cheers,
Johan



Re: Outlaw tabs

2009-01-26 Thread Johan Tibell
On Mon, Jan 26, 2009 at 9:57 AM, Peter Hercek  wrote:
> I personally do not mind them in my source code if the leading isSpace
> characters on lines are everywhere of the same kind in one source file. If
> all indentations would be done only with tabs then one can easily change the
> indent size for whole file. If monitor is big enough a bigger indent size is
> nicer (easier to see what is indented together). If monitor is smaller then
> smaller indent size is better (lines will not wrap so early).

I do, for the following reason: if you use only tabs for leading
whitespace you lose the ability to align things. Here's an example
using a list (view using a fixed-width font):

lst = [1, 2, 3,
       4, 5, 6]

This definition uses alignment to align the first element on the first
line with the first element of the second line. You can't do this kind
of alignment using tabs.

Cheers,

Johan


Re: patch applied (haskell-prime-status): add "Make $ left associative, like application"

2008-04-28 Thread Johan Tibell
On Mon, Apr 28, 2008 at 6:56 PM, Simon Marlow <[EMAIL PROTECTED]> wrote:
>  So I suggest we reject the proposal, and move any further discussion to
> haskell-cafe.  Ok?

Sounds good to me.


Re: patch applied (haskell-prime-status): add "Make $ left associative, like application"

2008-04-24 Thread Johan Tibell
On Thu, Apr 24, 2008 at 3:41 PM, Wolfgang Jeltsch
<[EMAIL PROTECTED]> wrote:
> Am Donnerstag, 24. April 2008 09:30 schrieb Lennart Augustsson:
>
> > Haskell has now reached the point where backwards compatibility is something
>  > that must be taken very seriously.
>
>  Would you be opposed to a Haskell 2 which would break lots of things?

I would! No new language standard should break lots of things! It
could break some things and should provide easy rewrite rules for code
(or better yet a tool like Python's 2to3) to move from standard A to
standard B for most of the things that break.


Re: Meta-point: backward compatibility

2008-04-23 Thread Johan Tibell
On Wed, Apr 23, 2008 at 4:52 PM, Niklas Broberg
<[EMAIL PROTECTED]> wrote:
>  I would hope it is both. Some changes simply cannot become current
>  practice since they would not be compatible with existing code, and
>  the only place that such changes *could* be made is in a new language
>  version. Like you say, fail in the Monad class is one such issue that
>  would not be backwards compatible, and couldn't become a current
>  practice without some help. Chicken or egg first?

You're of course right. Haskell' could be both. It probably should be
as the next Haskell standard (after Haskell') will probably be several
years in the future. It would be a shame to wake up the day after GHC
fully implements Haskell' and notice that nothing has changed and my
old annoyances are still there.

-- Johan


Re: Meta-point: backward compatibility

2008-04-23 Thread Johan Tibell
On Wed, Apr 23, 2008 at 2:53 PM, Philippa Cowderoy <[EMAIL PROTECTED]> wrote:
>  Current practice often involves removing certain warts anyway - the MR
>  being a great example.

I was more thinking of things like removing fail from the Monad class
and fixing the I/O libraries to use Unicode, etc.


Re: Meta-point: backward compatibility

2008-04-23 Thread Johan Tibell
An interesting question. What is the goal of Haskell'? Is it, like
Python 3000, to fix warts in the language in a (somewhat) incompatible
way, or is it just to standardize current practice? I think we need
both, I just don't know which of the two Haskell' is.

-- Johan

On Wed, Apr 23, 2008 at 2:16 PM, Chris Smith <[EMAIL PROTECTED]> wrote:
> There appears to be some question as to the backward compatibility goals
>  of Haskell'.  Perhaps it's worth bringing out into the open.
>
>  From conversations I've had and things I've read, I've always gathered
>  that the main goal of Haskell' is to address the slightly embarrassing
>  fact that practically no one actually writes code in Haskell, if by
>  Haskell we mean the most recent completed language specification.  This
>  obviously argues strongly for a high degree of backward compatibility.
>
>  On the other hand, I am assuming everyone agrees that we don't want to
>  replicate Java, which (in my view, anyway) is rapidly becoming obsolete
>  because of an eagerness to make the language complex, inconsistent, and
>  generally outright flawed in order to avoid even the most unlikely of
>  broken code.
>
>  --
>  Chris


Re: [Haskell-cafe] Has character changed in GHC 6.8?

2008-01-23 Thread Johan Tibell
On Jan 23, 2008 2:11 PM, Magnus Therning <[EMAIL PROTECTED]> wrote:
> Yes, this reflects my recent experience, Char is not a good representation
> for an 8-bit byte.  This thread came out of my attempt to add a module to
> dataenc[1] that would make base64-string[2] obsolete.  As you probably can
> guess I came to the conclusion that a function for data encoding with type
> 'String -> String' is plain wrong. :-)

Yes. Functions that deal with bytes shouldn't use Char. Char should be
seen as an ADT representing Unicode code points. It has nothing to do
with bytes.
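
A small sketch of why squeezing a Char into a byte is lossy. The function names truncateChar and charToByte are made up for this illustration.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Keeps only the low 8 bits of the code point; silently loses data
-- for anything above U+00FF.
truncateChar :: Char -> Word8
truncateChar = fromIntegral . ord

-- Refuses code points that do not fit in one byte instead of
-- truncating them.
charToByte :: Char -> Maybe Word8
charToByte c
    | n <= 0xFF = Just (fromIntegral n)
    | otherwise = Nothing
  where
    n = ord c
```

For the euro sign (U+20AC), truncateChar yields the byte 0xAC, which is unrelated to the original character, whereas charToByte reports the problem by returning Nothing.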

-- Johan


Re: [Haskell-cafe] Has character changed in GHC 6.8?

2008-01-23 Thread Johan Tibell
> > The benefit would be that if the input is not in latin-1 an exception
> > could be thrown rather than returning a Char representing the wrong
> > Unicode code point.
>
> I'm not sure what you mean here. All 256 possible values have a meaning.

You're of course right. So we don't have a problem here. Maybe I was
thinking of an encoding (7-bit ASCII?) where some of the 256 values
are invalid.

> > My proposal is for I/O functions to specify the encoding they use if
> > they accept or return Chars (and Strings). If they deal in terms of
> > bytes (e.g. socket functions) they should accept and return Word8s.
>
> I would be more inclined to suggest they default to a particular well
> understand encoding, almost certainly UTF8. Another interface could give
> access to other encodings.

That might be a good option. However, it would be nice if beginners
could write simple console programs using System.IO and have them work
correctly even if their system's encoding is not byte compatible with
UTF-8. People who do I/O over the network etc. need to be more careful
and should specify the encoding used. How would a UTF-8 default work
on different Windows versions?

> > Optionally, text I/O functions could default to the system locale
> > setting.
>
> That is a disastrous idea.

I'm not sure about that as long as decode is called on the input to
make sure that it's a valid encoding given the input bytes. Same point
as above. What I would like to avoid is having to write:

main = do
  putStrLn systemLocalEncoding "What's your name?"
  name <- getLine systemLocalEncoding
  putStrLn systemLocalEncoding  $ "Hi " ++ name ++ "!"

I guess we could solve this by putting the functions in different modules:

System.IO  -- requires explicit encoding
System.IO.DefaultEncoding  -- implicit use of system locale setting

And have the modules export the same functions. Another option would
be to include the fact that encoding is implied in the name of the
function. Maybe we should start by giving some type signatures and
function names. That often helps my thinking. I'll try to write
something down when I get home from work.
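
As a starting point, here is one possible shape for the two flavours, written against the TextEncoding support that later versions of GHC's System.IO provide (hSetEncoding, utf8, localeEncoding). The function names are hypothetical, both flavours are shown in one file rather than two modules for brevity, and setting the encoding on the handle before every call is a simplification, not a recommended design.

```haskell
import System.IO

-- Explicit flavour: every call names the encoding it uses.
hPutStrLnWith :: TextEncoding -> Handle -> String -> IO ()
hPutStrLnWith enc h s = hSetEncoding h enc >> hPutStrLn h s

hGetLineWith :: TextEncoding -> Handle -> IO String
hGetLineWith enc h = hSetEncoding h enc >> hGetLine h

-- Implicit flavour: same operations, with the system locale's
-- encoding filled in for console use.
putStrLnDefault :: String -> IO ()
putStrLnDefault = hPutStrLnWith localeEncoding stdout

getLineDefault :: IO String
getLineDefault = hGetLineWith localeEncoding stdin
```

A beginner's console program would use only putStrLnDefault and getLineDefault, while network code would call the explicit variants with, say, utf8.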

-- Johan


Re: [Haskell-cafe] Has character changed in GHC 6.8?

2008-01-23 Thread Johan Tibell
> > > > What *does* matter to the programmer is what encodings putStr and
> > > > getLine use. AFAIK, they use "lower 8 bits of unicode code point" which
> > > > is almost functionally equivalent to latin-1.
> > >
> > > Which is terrible! You should have to be explicit about what encoding
> > > you expect. Python 3000 does it right.
> >
> > Presumably there wasn't a sufficiently good answer available in time for
> > haskell98.
>
> Will there be one for haskell prime ?

The I/O library needs an overhaul but I'm not sure how to do this in a
backwards compatible manner which probably would be required for
inclusion in Haskell'. One could, like Python 3000, break backwards
compatibility. I'm not sure about the implications of doing this.
Maybe introducing a new System.IO.Unicode module would be an option.

If one wants to keep the interface but change the semantics slightly
one could define e.g. getChar as:

getChar :: IO Char
getChar = getWord8 >>= decodeChar latin1

Assuming latin-1 is what's used now.

The benefit would be that if the input is not in latin-1 an exception
could be thrown rather than returning a Char representing the wrong
Unicode code point.
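
A minimal sketch of what such a decodeChar could look like for single-byte encodings. The Encoding type, its decodeByte field, and the ascii value are assumptions made for this illustration. Note that latin-1 assigns a character to all 256 byte values, so the exception can only fire for stricter encodings such as 7-bit ASCII.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical single-byte encoding: maps a byte to a Char, or
-- Nothing if the byte is invalid in that encoding.
newtype Encoding = Encoding { decodeByte :: Word8 -> Maybe Char }

-- Every byte value is a valid latin-1 character.
latin1 :: Encoding
latin1 = Encoding (Just . chr . fromIntegral)

-- Only the lower 128 values are valid 7-bit ASCII.
ascii :: Encoding
ascii = Encoding $ \w ->
    if w < 0x80 then Just (chr (fromIntegral w)) else Nothing

-- Throws an IOError instead of returning a wrong code point.
decodeChar :: Encoding -> Word8 -> IO Char
decodeChar enc w =
    maybe (ioError (userError ("invalid byte: " ++ show w)))
          return
          (decodeByte enc w)
```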

I recommend reading about the Python I/O system overhaul for Python
3000 which is outlined in PEP 3116
http://www.python.org/dev/peps/pep-3116/

My proposal is for I/O functions to specify the encoding they use if
they accept or return Chars (and Strings). If they deal in terms of
bytes (e.g. socket functions) they should accept and return Word8s.
Optionally, text I/O functions could default to the system locale
setting.

-- Johan


Re: Proposal: hands off the base! :)

2008-01-21 Thread Johan Tibell
Hi Bulat,

On Jan 17, 2008 6:18 PM, Bulat Ziganshin <[EMAIL PROTECTED]> wrote:
> step 1: create library NewArray with modules Data.NewArray.* copied
> one-to-one from Data.Array.* and publish it as version 1

Having words like "new" for the purpose of versioning is quite
confusing because a library which is new at some point will eventually
become old and then the name is misleading. Versioning doesn't belong
in module/function names IMHO.

-- Johan