Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread Kevin Quick



On Tue, 31 Jan 2012 23:10:34 -0700, Anthony Clayden  
 wrote:

I'm proposing x.f is _exactly_ f x. That is, the x.f gets
desugared at an early phase in compilation.


Anthony,

I think part of the concern people are expressing here is that the above  
would imply the ability to use point-free style.  But this orthogonality  
is disavowed by your exception:



A 'one-sided' dot doesn't mean anything.


I haven't read the underlying proposals, so I apologize if the following  
is covered, but my understanding of the discussion is that the x.f  
notation is intended to disambiguate f to be a field name of the type of x  
and therefore be advantageous over "f x" notation where f is presently in  
the global namespace.


With your exception, I still cannot disambiguate the following:

data Rec = Rec { foo :: String }

foo :: Rec -> String
foo = show

rs :: [Rec]
rs = [ ... ]

bar = map foo rs

If the exception didn't exist, then I could write one of the following to  
clarify my intent:


bar = map foo rs
baz = map .foo rs


--
-KQ

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: combinatorics

2012-01-31 Thread wren ng thornton

On 1/31/12 8:58 AM, Jean-Marie Gaillourdet wrote:

A slight variation on that approach is to use implicit parameters to 
parameterize your functions by the primes. Something along the following lines:


That is another option. However, implicit parameters are GHC-only and 
seldom used even in GHC. The RTS-hacking I mentioned in the announcement 
would also be GHC-only, which is part of the reason I'd prefer to find a 
non-cumbersome way of dealing with the issue purely. As it stands the 
library is Haskell98 (with some trivial CPP to make the Haddocks pretty) 
and it'd be nice to stay that way.
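[For readers unfamiliar with the suggestion being discussed, an implicit-parameter version might look something like the following sketch. This is a hypothetical API, not the library's code, and as noted it is GHC-only; `smallestPrimeFactor` and the naive sieve are made up for illustration.]

```haskell
{-# LANGUAGE ImplicitParams #-}

-- Naive trial-division sieve, purely for illustration.
primes :: [Int]
primes = sieve [2 ..]
  where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

-- A function parameterized by the prime table through an implicit
-- parameter, so callers never thread the list around explicitly.
smallestPrimeFactor :: (?primes :: [Int]) => Int -> Int
smallestPrimeFactor n = head [p | p <- ?primes, n `mod` p == 0]

main :: IO ()
main = let ?primes = primes
        in print (smallestPrimeFactor 91)  -- prints 7
```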


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: combinatorics

2012-01-31 Thread wren ng thornton

On 1/30/12 3:54 PM, Roman Cheplyaka wrote:

Makes sense; but doesn't making the monad abstract and putting all
functions in the monad address the fragility issue?


The primary issue with monads is that the syntax is extremely cumbersome 
for the expected use case. It'd be like paranoid C where, since order of 
evaluation is unspecified, all subexpressions are floated out into 
let-bindings. At that point (a) the verbosity is ugly, (b) the code is 
much harder to follow, and (c) it's all too easy to introduce errors 
where you use x instead of x' or the like.
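[To make the verbosity point concrete, here is a sketch: the abstract monad the library would export is stood in for by an ordinary `Monad m` constraint, and `factorialM`/`chooseM` are hypothetical names, not the package's API.]

```haskell
-- Hypothetical monadic API: every subexpression must be named and
-- sequenced by hand, inviting the x-versus-x' mistakes described above.
factorialM :: Monad m => Integer -> m Integer
factorialM n = pure (product [1 .. n])

chooseM :: Monad m => Integer -> Integer -> m Integer
chooseM n k = do
  nf  <- factorialM n
  kf  <- factorialM k
  nkf <- factorialM (n - k)
  pure (nf `div` (kf * nkf))

-- The direct pure style the library actually wants to offer:
choose :: Integer -> Integer -> Integer
choose n k = product [1 .. n] `div` (product [1 .. k] * product [1 .. n - k])
```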


The semantic model of monads just isn't a good fit for this domain. 
There really aren't any side effects going on, there's no sequencing of 
actions, there's no "little language" that's being implemented,... I 
love me some monads and all, but they just don't fit here.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: combinatorics

2012-01-31 Thread wren ng thornton

On 1/30/12 12:55 PM, Balazs Komuves wrote:


-- combinatorics 0.1.0


The combinatorics package offers efficient *exact* computation of common
combinatorial functions like the binomial coefficients and factorial.
(For fast *approximations*, see the math-functions package instead.)


Shameless self-promotion: The combinat package (which deliberately
does not try to own the valuable namespace Math.Combinatorics) is
a more extensive combinatorics library:

http://hackage.haskell.org/package/combinat

While the main focus is the generation of combinatorial objects themselves,
counting functions and common number sequences like the above are
also offered.


I came across that package when looking around, but I only noticed the 
generation of combinatorial objects rather than the counting functions.


As for namespacing, I'd be happy to move things further down to 
Math.Combinatorics.Exact (or similar)[1]; it's just that 
Math.Combinatorics.* seemed not to be used by the numerous packages 
floating around this area with different purposes[2], and it seems the 
natural place for this sort of work. I think it'd be nice to get a bit 
more collaboration among folks doing this stuff so we can (a) clean up 
the namespace for this topic, and (b) reduce the amount of duplicated 
effort.



[1] I'll probably end up doing that anyway, if I follow through with 
the proposed pure solution to the space-leak issue about storing the primes.


[2] HaskellForMaths, gamma, statistics, erf, math-functions, combinat,...



Even though the binomial and factorial definitions in this package are the
naive ones, a quick experiment implies that the differences start to show
themselves around 100,000 factorial, or choosing 50,000 elements out
of 100,000, which is probably a rather specialized use case.


In my experiments the threshold was a bit lower, but yes it's special 
purpose for people who need exact answers for big problems.




The primes function in the combinat package is based on an old Cafe
thread, and actually seems to be faster than the one in the combinatorics
package.


The primes generator was some old code I had laying around for one of 
those online programming challenges; fast enough for the task. I'll 
probably trade it in for your algorithm though. One of the things I'm 
disappointed by about the current implementation is the memory overhead 
for storing the primes. It'd be nice to use chunked arrays of unboxed 
integers in order to remove all the pointers; but my attempt at doing so 
had catastrophic performance...
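[One shape such a pointer-free representation could take is sketched below, using the array package that ships with GHC. This is only an illustration of the idea, not the attempt described above; the naive sieve and the chunk size are placeholders.]

```haskell
import Data.Array.Unboxed (UArray, elems, listArray)

-- Illustrative naive sieve; the real generator would be faster.
primes :: [Int]
primes = sieve [2 ..]
  where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

chunkSize :: Int
chunkSize = 1024

-- Chunk an infinite stream into flat, unboxed, pointer-free arrays.
chunked :: [Int] -> [UArray Int Int]
chunked xs = case splitAt chunkSize xs of
  (c, rest) -> listArray (0, length c - 1) c : chunked rest

-- Consumers see an ordinary list again, materialized chunk by chunk.
primes' :: [Int]
primes' = concatMap elems (chunked primes)
```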


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread Anthony Clayden

> On 1/02/2012, at 11:38 AM, AntC wrote:
> > As soon as you decide to make 'virtual record selectors'
> > just ordinary  functions (so they select but not update)
> > , then you can see that field names  are also just
> ordinary functions (for selection purposes). So the
> > semantics  for field 'selection' (whether or not you use
> > dot notation) is just function  application. So
> Type-Directed Name resolution is just instance resolution.
> > So  it all gets much easier.
> 
> 
> Richard O'Keefe wrote:
> ...  Making f x
> and x.f the same is pretty appealing, but it is imaginable
> that the former might require importing the name of f from
> a module and the latter might not. That is to say, it lets
> f and .f have completely different meanings. Oh the joy! 
> Oh the improved readability!  -- on some other planet,
> maybe.
> 
Hi Richard, I'm not sure I understand what you're saying.

I'm proposing x.f is _exactly_ f x. That is, the x.f gets
desugared at an early phase in compilation.
If the one needs some name imported from a module, then so
does the other.

A 'one-sided' dot doesn't mean anything. (Also, I feel
vaguely nauseous even seeing it written down.)
Under my proposal, the only thing .f could mean is:
 \z -> z.f
which desugars to
 \z -> f z
which means (by eta-reduction)
  f

And to complete the story: the only thing (x.) could mean
is:
 \g -> x.g
So a use like:
 (x.) f    -- or z f, where z = (x.)
would desugar to
  f x
which is the same as x.f
A use like (x.)f (no spaces around the parens) would amount
to the same thing.


This is all so weird I'm inclined to say that one-sided dot
is probably a syntax error, and reject it. It's too
dangerously ambiguous between the syntax for 'proper' dot
notation and function composition.

Or is there something I'm not understanding?
[Good to see another NZ'er on the list, by the way.]

AntC

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Contributing to http-conduit

2012-01-31 Thread Myles C. Maxfield
Well, this is embarrassing. Please disregard my previous email. I should
learn to read the RFC *before* submitting proposals.

--Myles


Re: [Haskell-cafe] Contributing to http-conduit

2012-01-31 Thread Myles C. Maxfield
Here are my initial ideas about supporting cookies. Note that I'm using
Chrome for ideas since it's open source.

   - Network/HTTP/Conduit/Cookies.hs file
   - Exporting the following symbols:
  - type StuffedCookie = SetCookie
 - A regular SetCookie can have Nothing for its Domain and Path
 attributes. A StuffedCookie has to have these fields set.
  - type CookieJar = [StuffedCookie]
 - Chrome's cookie jar is implemented as (the C++ equivalent of)
 Map W.Ascii StuffedCookie. The key is the "eTLD+1" of the domain, so
 lookups for all cookies for a given domain are fast.
 - I think I'll stay with just a list of StuffedCookies just to
 keep it simple. Perhaps a later revision can implement the faster map.
  - getRelevantCookies :: Request m -> CookieJar -> UTCTime ->
  (CookieJar, Cookies)
 - Gets all the cookies from the cookie jar that should be set for
 the given Request.
 - The time argument is whatever "now" is (it's pulled out of the
 function so the function can remain pure and easily testable)
 - The function will also remove expired cookies from the cookie
 jar (given what "now" is) and return the filtered cookie jar
  - putRelevantCookies :: Request m -> CookieJar -> [StuffedCookie] ->
  CookieJar
 - Insert cookies from a server response into the cookie jar.
 - The first argument is only used for checking to see which
 cookies are valid (which cookies match the requested domain, etc, so
 site1.com can't set a cookie for site2.com)
  - stuffCookie :: Request m -> SetCookie -> StuffedCookie
 - If the SetCookie's fields are Nothing, fill them in given the
 Request from which it originated
  - getCookies :: Response a -> ([SetCookie], Response a)
 - Pull cookies out of a server response. Return the response with
 the Set-Cookie headers filtered out
  - putCookies :: Request a -> Cookies -> Request a
 - A wrapper around renderCookies. Inserts some cookies into a
 request.
 - Doesn't overwrite cookies that are already set in the request
  - These functions will be exported from Network.HTTP.Conduit as well,
   so callers can use them to re-implement redirection chains
   - I won't implement a cookie filtering function (like what
   Network.Browser has)
   - If you want to have arbitrary handling of cookies, re-implement
   redirection following. It's not very difficult if you use the API
   provided, and the 'http' function is open source so you can use that
   as a reference.
   - I will implement the functions according to RFC 6265
   - I will also need to write the following functions. Should they also be
   exported?
  - canonicalizeDomain :: W.Ascii -> W.Ascii
 - turns "..a.b.c..d.com..." to "a.b.c.d.com"
 - Technically necessary for domain matching (Chrome does it)
 - Perhaps unnecessary for a first pass? Perhaps we can trust users
 for now?
  - domainMatches :: W.Ascii -> W.Ascii -> Maybe W.Ascii
 - Does the first domain match against the second domain?
 - If so, return the prefix of the first that isn't in the second
  - pathMatches :: W.Ascii -> W.Ascii -> Bool
 - Do the paths match?
  - In order to implement domain matching, I have to have knowledge of
   the Public Suffix List, so I know that sub1.sub2.pvt.k12.wy.us can set
   a cookie for sub2.pvt.k12.wy.us but not for k12.wy.us (because
   pvt.k12.wy.us is a "suffix"). There are a variety of ways to implement
   this.
   - As far as I can tell, Chrome does it by using a script (which a
   human periodically runs) which parses the list and creates a .cc file
   that is included in the build.
  - I might be wrong about the execution of the script; it might be
  a build step. If it is a build step, however, it is suspicious that a
  build target would try to download a file...
  - Any more elegant ideas?
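[For the domainMatches function proposed above, a minimal sketch following RFC 6265 section 5.1.3 might read as follows. It assumes W.Ascii is a strict ByteString, and it ignores the host-is-an-IP-address check the RFC also requires; the body is my illustration, not the eventual implementation.]

```haskell
import qualified Data.ByteString.Char8 as B

-- Does the request domain match the cookie domain? If so, return the
-- extra prefix of the request domain (empty when they are identical).
domainMatches :: B.ByteString -> B.ByteString -> Maybe B.ByteString
domainMatches requestDomain cookieDomain
  | requestDomain == cookieDomain = Just B.empty
  | B.cons '.' cookieDomain `B.isSuffixOf` requestDomain =
      Just (B.take (B.length requestDomain - B.length cookieDomain - 1)
                   requestDomain)
  | otherwise = Nothing
```

For example, matching "sub2.pvt.k12.wy.us" against "pvt.k12.wy.us" yields the prefix "sub2", while "k12.wy.us" against "pvt.k12.wy.us" yields Nothing.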

Feedback on any/all of the above would be very helpful before I go off into
the weeds on this project.

Thanks,
Myles C. Maxfield

On Sat, Jan 28, 2012 at 8:17 PM, Michael Snoyman wrote:

> Thanks, looks great! I've merged it into the Github tree.
>
> On Sat, Jan 28, 2012 at 8:36 PM, Myles C. Maxfield
>  wrote:
> > Ah, yes, you're completely right. I completely agree that moving the
> > function into the Maybe monad increases readability. This kind of
> function
> > is what the Maybe monad was designed for.
> >
> > Here is a revised patch.
> >
> >
> > On Sat, Jan 28, 2012 at 8:28 AM, Michael Snoyman 
> > wrote:
> >>
> >> On Sat, Jan 28, 2012 at 1:20 AM, Myles C. Maxfield
> >>  wrote:
> >> > the fromJust should never fail, beceause of the guard statement:
> >> >
> >> > | 300 <= code && code < 400 && isJust l'' && isJust l'

Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread Richard O'Keefe

On 1/02/2012, at 11:38 AM, AntC wrote:
> As soon as you decide to make 'virtual record selectors' just ordinary 
> functions (so they select but not update), then you can see that field names 
> are also just ordinary functions (for selection purposes). So the semantics 
> for field 'selection' (whether or not you use dot notation) is just function 
> application. So Type-Directed Name resolution is just instance resolution. So 
> it all gets much easier.

I'm reminded of Pop-2, where f(x) and x.f meant exactly the same thing.
Overloading was a (dynamic) property of f, not a property of dot.

Ada had two reasons for adding dot syntax, and much as I admire Ada,
I'm not sure that I agree with either of them.
One was to be more familiar to programmers from other languages, but
since there remain interesting differences between x.f in Ada and x.f
in other languages, it's not clear to me how much of a kindness that
really is.  The other is that x.f means basically what f(x) would have,
*had f(x) been legal*; the aim was to be able to use methods without
having to import everything from a module.

Now that might have some relevance to Haskell.  Making f x and x.f the
same is pretty appealing, but it is imaginable that the former might
require importing the name of f from a module and the latter might not.
That is to say, it lets f and .f have completely different meanings.
Oh the joy!  Oh the improved readability!  -- on some other planet, maybe.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANN] Crypto-API 0.9 Release

2012-01-31 Thread Thomas DuBuisson
Oh, sorry for the omission!  I've worked out of HEAD for long enough
that I thought that was in 0.8.

On Tue, Jan 31, 2012 at 5:36 PM, Felipe Almeida Lessa
 wrote:
> Also:
>
>  * MacKey has phantom types.
>
> This seems to be the only breaking change [1].

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Stuck on HXT basics

2012-01-31 Thread Albert Y. C. Lai

On 12-01-30 08:06 AM, Pēteris Paikens wrote:

import Text.XML.HXT.Core
import Text.XML.HXT.DOM.XmlTreeFilter
selectAllText   :: ArrowXml a =>  a XmlTree XmlTree
selectAllText  = deep isXText


Delete "import Text.XML.HXT.DOM.XmlTreeFilter". Change "isXText" to 
"isText". That is,


import Text.XML.HXT.Core
selectAllText :: ArrowXml a =>  a XmlTree XmlTree
selectAllText = deep isText

I am going to change that on the Haskell Wiki.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANN] Crypto-API 0.9 Release

2012-01-31 Thread Felipe Almeida Lessa
On Tue, Jan 31, 2012 at 9:36 PM, Thomas DuBuisson
 wrote:
> Release 0.9 Changes:
> * Crypto.Classes now exports 'Data.Serialize.encode'
> * AsymCipher now has proper fundeps
> * cpolysArr is no longer one big line

Also:

 * MacKey has phantom types.

This seems to be the only breaking change [1].

Cheers,

[1] http://hdiff.luite.com/cgit/crypto-api/commit?id=0.9

-- 
Felipe.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [ANN] Crypto-API 0.9 Release

2012-01-31 Thread Thomas DuBuisson
Crypto-API is a generic interface for cryptographic operations,
defining classes for hashes, ciphers, and random number generation
while also providing convenience functions such as block cipher modes
and padding. Maintainers of hash and cipher implementations are
encouraged to add instances for the classes defined in Crypto.Classes.
Crypto users are similarly encouraged to use the interfaces defined in
the Classes module.

Release 0.9 Changes:
* Crypto.Classes now exports 'Data.Serialize.encode'
* AsymCipher now has proper fundeps
* cpolysArr is no longer one big line

Cheers,
Thomas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread AntC
Donn Cave  avvanta.com> writes:

> 
> Quoth AntC  clear.net.nz>,
> ...
> > My proposal is that field selection functions be just ordinary functions, and
> > dot notation be just function application (tight-binding). Then:
> >   object.fieldfuncmethod   ==> fieldfuncmethod object
> > (Subject to the tight binding for the dot.)
> > And one-sided dot application is pointless (! errk I mean 'without purpose',
> > no different to writing the bare object or bare fieldfuncmethod).
> 
> That's interesting!  The wiki page on SORF (Simple Overloaded Record Fields,
> http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields)
> has some language that, as I interpreted it, meant that Has/Get syntactic
> sugar depended on the dot, so it's indispensable. 

Yes it does, and that's one of the things I didn't like - hence my counter-
proposal. In particular in SORF, the dot notation got tied into 'virtual 
record selectors'. Now 'virtual record selectors' is a good idea, but SORF 
tied it to the field selection approach, so had to go via a Has instance, 
which introduced a `set' method as well as the get, which didn't make sense, 
so SPJ ran into trouble.
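[Roughly, the Has machinery being referred to looks like the sketch below. SORF's real definition uses kind-level string labels; here a phantom proxy type stands in, and all names are illustrative rather than the proposal's actual definitions.]

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

data NameL = NameL  -- proxy standing in for SORF's type-level field label

class Has r l t | r l -> t where
  get :: l -> r -> t
  set :: l -> t -> r -> r  -- the 'set' that makes no sense for a virtual
                           -- selector, since there is no field to update

data Person = Person { _name :: String }

instance Has Person NameL String where
  get _ = _name
  set _ v p = p { _name = v }
```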

Actually the TDNR proposal was better re the "power of the dot": "works with 
any function".

As soon as you decide to make 'virtual record selectors' just ordinary 
functions (so they select but not update), then you can see that field names 
are also just ordinary functions (for selection purposes). So the semantics 
for field 'selection' (whether or not you use dot notation) is just function 
application. So Type-Directed Name resolution is just instance resolution. So 
it all gets much easier.

>  Your proposal actually
> has some similar language but, I see you don't mean it that way.  That's
> great news for anyone who's really dying to get somewhere with records,
> if it means that the functionality could in principle be introduced
> independently of ...

Yes. Actually, (IMHO) the biggest block to making some progress with 
the 'cottage industry' for records (and there are heaps of ideas out there) is 
that currently the field name appearing in data decls grabs so much of the 
namespace real estate. It creates a global name (selector function) that can't 
be overloaded.

You'll see in my other posts last night (NZ time) that the first thing I think 
should happen is to switch off auto-creation of field selection functions. 
(This should have come along as an option with DisambiguateRecordFields, I 
think. http://www.haskell.org/pipermail/glasgow-haskell-users/2012-January/021750.html)

> ... changes to the interpretation of "." that would break
> a lot of code.
> 

Yes, in principle we could introduce the semantics for field-selectors-as-
overloaded-functions without introducing any special syntax for field 
selection (dot notation or whatever). But the 'Records in Haskell' thread 
started with a Reddit/Yesod discussion about records, and the lack of dot 
notation being the last major wart in Haskell. "A sentiment open to doubt" in 
the words of the poet. It stung SPJ enough to open up the discussion (and I 
guess now is timely as 7.4.1 gets put to bed).

For me, the record/field namespacing is the major wart, polymorphism only 
slightly less, and the notation is a side-issue. But I don't want to lose the 
initiative that's built up, so I'm trying to address both at the same time.

AntC





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Haskell User Group Hamburg - Meeting on 9th Feb.

2012-01-31 Thread Andreas Baldeau
Hello cafe,

a new Haskell User Group has formed in Hamburg this month. We are trying
to establish a monthly meetup with, hopefully, interesting talks.

To start, we are having our first regular meetup on the 9th of
February, 19:00, in the betahaus Hamburg. On this day I will give a talk
about data structures in Haskell, with some introduction to
amortized O(1) queues and tries.

So if you are interested to come have a look at
http://www.doodle.com/da9tr6zeynmcq3z6 (german).

Further announcements will be made via Twitter, so you may want to follow
https://twitter.com/hug_hh.

Andreas Baldeau


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Johan Tibell
On Tue, Jan 31, 2012 at 12:19 PM, Steve Severance
 wrote:
> The webpage data was split out across tens of thousands of compressed
> binary files. I used enumerator to load these files and select the appropriate
> columns. This step was performed in parallel using parMap and worked fine
> once I figured out how to add the appropriate !s.

Even though they're advertised as parallel programming tools, parMap and
friends work in parallel over *sequential*-access data structures
(i.e. linked lists). We want flat, strict, unpacked data structures to
get good performance out of parallel algorithms. DPH, repa, and even
vector show the way.

-- Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Johan Tibell
On Tue, Jan 31, 2012 at 1:22 PM, Gregory Collins
 wrote:
> I completely agree on the first part, but deepseq is not a panacea either.
> It's a big hammer and overuse can sometimes cause wasteful O(n) no-op
> traversals of already-forced data structures. I also definitely wouldn't go
> so far as to say that you can't do serious parallel development without it!

I agree. The only time I ever use deepseq is in Criterion benchmarks,
as it's a convenient way to make sure that the input data is evaluated
before the benchmark starts. If you want a data structure to be fully
evaluated, evaluate it as it's created, not after the fact.
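[A minimal sketch of that advice: force values while building, rather than sweeping afterwards with deepseq.]

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- The accumulator is forced at every step, so the result is already
-- fully evaluated when the fold finishes; no deepseq pass is needed.
sumStrict :: [Int] -> Int
sumStrict = foldl' (\ !acc x -> acc + x) 0
```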

> The only real solution to problems like these is a thorough understanding of
> Haskell's evaluation order, and how and why call-by-need is different than
> call-by-value. This is both a pedagogical problem and genuinely hard -- even
> Haskell experts like the guys at GHC HQ sometimes spend a lot of time
> chasing down space leaks. Haskell makes a trade-off here; reasoning about
> denotational semantics is much easier than in most other languages because
> of purity, but non-strict evaluation makes reasoning about operational
> semantics a little bit harder.

+1

We can do a much better job at teaching how to reason about
performance. A few rules of thumb gets you a long way. I'm (slowly)
working on improving the state of affairs here.

-- Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Gregory Collins
On Tue, Jan 31, 2012 at 9:19 PM, Steve Severance
wrote:

> The other thing is that deepseq is very important. IMHO this needs to be
> a first class language feature with all major libraries shipping with
> deepseq instances. There seems to have been some movement on this front but
> you can't do serious parallel development without it.
>

I completely agree on the first part, but deepseq is not a panacea either.
It's a big hammer and overuse can sometimes cause wasteful O(n) no-op
traversals of already-forced data structures. I also definitely wouldn't go
so far as to say that you can't do serious parallel development without it!

The only real solution to problems like these is a thorough understanding
of Haskell's evaluation order, and how and why call-by-need is different
than call-by-value. This is both a pedagogical problem and genuinely hard
-- even Haskell experts like the guys at GHC HQ sometimes spend a lot of
time chasing down space leaks. Haskell makes a trade-off here; reasoning
about denotational semantics is much easier than in most other languages
because of purity, but non-strict evaluation makes reasoning about
operational semantics a little bit harder.

In domains where you care a lot about operational semantics (like parallel
and concurrent programming, where it's absolutely critical), programmers
necessarily require a lot more experience and knowledge in order to be
effective in Haskell.

G
-- 
Gregory Collins 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Steve Severance
Hi Everyone,
  I had a similar experience with a similar type of problem. The
application was analyzing web pages that our web crawler had collected;
not the pages themselves, but metadata about when each page was
collected.

The basic query was:

SELECT
  Domain, Date, COUNT(*)
FROM
  Pages
GROUP BY
  Domain, Date
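[In Haskell terms, that query is a strict fold into a map of counters. The sketch below uses assumed types and is not the original pipeline's code.]

```haskell
import qualified Data.Map.Strict as M

type Domain = String
type Date   = String

-- GROUP BY (Domain, Date) with COUNT(*): fromListWith (+) in a strict
-- map evaluates each counter as rows are folded in.
countPages :: [(Domain, Date)] -> M.Map (Domain, Date) Int
countPages = M.fromListWith (+) . map (\key -> (key, 1))
```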

The webpage data was split out across tens of thousands of compressed binary
files. I used enumerator to load these files and select the appropriate
columns. This step was performed in parallel using parMap and worked fine
once I figured out how to add the appropriate !s.

The second step was the group by. I built some tools across monad-par that
had the normal higher level operators like map, groupBy, filter, etc... The
typical pattern I followed was the map-reduce style pattern used in
monad-par. I was hoping to someday share this work, although I have since
abandoned work on it.

It took me a couple of weeks to get the strictness mostly right. I say
mostly because it still randomly blows up, meaning if I feed in a single
40kb file maybe 1 time in 10 it consumes all the memory on the machine in a
few seconds. There is obviously a laziness bug in there somewhere although
after working on it for a few days and failing to come up with a solid
repro case, I eventually built all the web page analysis tools in Scala,
in large part because I did not see a way forward and needed to tie off
that work and move on.

My observations:
Combining laziness and parallelism made it very difficult to reason about
what was going on. Test cases became non-deterministic, not in terms of
output in the success case, but in whether they ran at all.

The tooling around laziness does not give enough information for
debugging complex problems. Because of this, when people ask "Is Haskell
good for parallel development?" I tell them the answer is complicated.
Haskell has excellent primitives for parallel development, like STM,
which I love, but it lacks a PLINQ-like toolkit that is fully built out to
enable flexible parallel data processing.

The other thing is that deepseq is very important. IMHO this needs to be
a first-class language feature, with all major libraries shipping
deepseq instances. There seems to have been some movement on this front,
but you can't do serious parallel development without it.
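
To make the point concrete (a toy sketch; the Page type is made up): seq
only forces the outermost constructor, while deepseq walks the whole
structure through an NFData instance, which is what you want before
results cross a parallel boundary:

```haskell
import Control.DeepSeq (NFData (..), deepseq)

data Page = Page String Int

-- Hand-written instance; libraries shipping these is exactly
-- what the paragraph above asks for.
instance NFData Page where
  rnf (Page d n) = rnf d `seq` rnf n

main :: IO ()
main =
  -- deepseq fully evaluates both fields before continuing.
  Page "example.com" (1 + 1) `deepseq` putStrLn "fully evaluated"
```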

Some ideas for things that might help: a plugin for vim that showed the
level of strictness of operations and data. I am going to take another
crack at a PLINQ-like library with GHC 7.4.1 in the next couple of
months, using the debug symbols that Peter has been working on.

Conclusion:
Haskell was the wrong platform for doing webpage analysis anyhow, not
because anything is wrong with the language, but simply because it does
not have the tooling that the JVM does. I moved all my work into Hadoop
to take advantage of multi-machine parallelism and higher-level tools
like Hive. There might be a future in building Haskell code that could
be translated into a Hive query.

With better tools I think Haskell can become the go-to language for
developing highly parallel software. We just need the tools to help
developers better understand the laziness of their software. There also
seems to be a documentation gap on developing data analysis or data
transformation pipelines in Haskell.

Sorry for the length. I hope my experience is useful to someone.

Steve

On Tue, Jan 31, 2012 at 7:57 AM, Marc Weber  wrote:

> Excerpts from Felipe Almeida Lessa's message of Tue Jan 31 16:49:52 +0100
> 2012:
> > Just out of curiosity: did you use conduit 0.1 or 0.2?
> I updated to 0.2 today because I was looking for a monad instance for
> SequenceSink - but didn't find it because I tried using it the wrong way
> (\state -> see last mail)
>
> I also tried json' vs json (strict and non strict versions) - didn't
> seem to make a big difference.
>
> Marc Weber
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Marc Weber
Excerpts from Felipe Almeida Lessa's message of Tue Jan 31 16:49:52 +0100 2012:
> Just out of curiosity: did you use conduit 0.1 or 0.2?
I updated to 0.2 today because I was looking for a monad instance for
SequenceSink - but didn't find it because I tried using it the wrong way
(\state -> see last mail)

I also tried json' vs json (strict and non strict versions) - didn't
seem to make a big difference.

Marc Weber



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Felipe Almeida Lessa
On Tue, Jan 31, 2012 at 1:36 PM, Marc Weber  wrote:
> Adding a \state -> (the way Felipe Lessa told me) makes it work, and
> it runs in about 20 secs, even though some conduit overhead likely
> takes place.

Just out of curiosity: did you use conduit 0.1 or 0.2?

Cheers! =)

-- 
Felipe.



Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread Donn Cave
Quoth AntC ,
...
> My proposal is that field selection functions be just ordinary functions, and 
> dot notation be just function application (tight-binding). Then:
>   object.fieldfuncmethod   ==> fieldfuncmethod object
> (Subject to the tight binding for the dot.)
> And one-sided dot application is pointless (! errk I mean 'without purpose', 
> no different to writing the bare object or bare fieldfuncmethod).

That's interesting!  The wiki page on SORF (Simple Overloaded Record Fields,
http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields)
has some language that, as I interpreted it, meant that Has/Get syntactic
sugar depended on the dot, so it's indispensable.  Your proposal actually
has some similar language, but I see you don't mean it that way.  That's
great news for anyone who's really dying to get somewhere with records,
if it means that the functionality could in principle be introduced
independently of changes to the interpretation of "." that would break
a lot of code.

Donn



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Marc Weber
>  jsonLines :: C.Resource m => C.Conduit B.ByteString m Value
>  jsonLines = C.sequenceSink () $ do
>val <- CA.sinkParser json'
>CB.dropWhile isSpace_w8
>return $ C.Emit () [val]

Adding a \state -> (the way Felipe Lessa told me) makes it work, and
it runs in about 20 secs, even though some conduit overhead likely
takes place.

Omitting my custom data type using bytestrings and operating on Aeson's
Value directly reduces the running time to 16 secs.

PHP/C++ still wins: less than 12 secs.

Now I can imagine again that even a desktop multi-core system could be
faster than a single-threaded C application.

Thanks for your help. Maybe I can set up profiling again to understand
why it's still taking a little bit more time.

Marc Weber



Re: [Haskell-cafe] help with safecopy + acid-state

2012-01-31 Thread Antoine Latter
On Tue, Jan 31, 2012 at 8:27 AM, Johannes Waldmann
 wrote:
>
>> > Can I really rename  old.T => new.T_orig ?
>> > It looks as if it then tries to load the wrong acid-state snapshot.
>>
>> The name of your data type doesn't matter as acid-state doesn't store
>> that on the disk.
>
> I think it does - because file names are  state/T/*.log   and so on?
>

The function 'openLocalState' in acid-state uses the name of the
passed-in state type to locate the log files on disk.

So as long as you always call 'openLocalState' with types of the same
name to represent the same state you'll be fine - this is why it is
safe to rename your old type, because you call 'openLocalState' with
the new type.

Alternatively, you can call 'openLocalStateFrom', which doesn't base
anything on names of types (you can tell because there is no
'Typeable' constraint on its arguments).

Antoine

> J.W.


Re: [Haskell-cafe] help with safecopy + acid-state

2012-01-31 Thread Johannes Waldmann

> > Can I really rename  old.T => new.T_orig ?
> > It looks as if it then tries to load the wrong acid-state snapshot.
> 
> The name of your data type doesn't matter as acid-state doesn't store
> that on the disk.

I think it does - because file names are  state/T/*.log   and so on?

J.W.





Re: [Haskell-cafe] [Haskell] ANNOUNCE multiarg - parser combinators for command line parsing

2012-01-31 Thread Antoine Latter
On Mon, Jan 30, 2012 at 8:19 AM, Henning Thielemann
 wrote:
>
> On Sun, 29 Jan 2012, Simon Meier wrote:
>
>> I'm currently using Neil Mitchell's cmdargs package [1]. How does your
>> package compare to that?
>
>
> Last time I checked cmdargs it was not referentially transparent. Is multiarg
> better in this respect?
>

It has since been re-architected as an impure library around a pure
core library:

http://hackage.haskell.org/packages/archive/cmdargs/0.9.2/doc/html/System-Console-CmdArgs-Explicit.html



Re: [Haskell-cafe] ANN: combinatorics

2012-01-31 Thread Jean-Marie Gaillourdet
Hi, 
On 29.01.2012, at 23:30, wren ng thornton wrote:

> On 1/29/12 5:48 AM, Roman Cheplyaka wrote:
>> * wren ng thornton  [2012-01-28 23:06:08-0500]
>> 
>> Why not make it more pure? That is, return a lazy list of Ints (but
>> not a CAF), which the user can throw away by the usual GC means?
>> 
>> The functions from the other modules that use primes would have to be
>> put in a Reader monad. That would make it a little bit more awkward to
>> use, but personally I'd prefer that over memory leaks.
> 
> I'd also prefer a more pure solution, but I don't think that the Reader monad 
> is the right tool here. I played around with that approach, but it requires 
> extremely invasive changes to client code, obfuscating what should be simple 
> mathematical formulae. And, it's too fragile, exposing the implementation in 
> a way that breaks client code should I change a non-prime-using algorithm to 
> a prime-using one, or vice versa. The fragility could be partially avoided by 
> providing both prime-using and non-prime-using algorithms, but then that 
> forces users to decide which one they want--- and if their only concern is 
> performance, then they'd have to go through the code-breaking refactoring 
> anyways, just to determine which is faster for their application.
> 
> One alternative I'm exploring is, rather than (only) making the primes not a 
> CAF, instead make the prime-using functions non-CAFs as well. That is, 
> provide a makePrimeUsingFunctions function which returns a record/tuple of 
> all the functions, sharing a stream of primes. This way, after allocating the 
> functions, they can be used purely just as in the current model; and when the 
> client wants the primes to be GCed, they can drop their references to the 
> allocated functions which use those primes (allocating new functions later, 
> if necessary).

A slight variation on that approach is to use implicit parameters to 
parameterize your functions by the primes. Something along the following lines:

> {-# LANGUAGE ImplicitParams, Rank2Types #-}
>  
>  
> data PrimeTable -- abstract
>  
> withPrimes :: ((?primes :: PrimeTable) => a) -> a
> withPrimes e = e
>   where
> ?primes = ...
>  
>  
> factorial :: (?primes :: PrimeTable) => Integer -> Integer
> factorial = ...
>  
> example = withPrimes $ ... factorial n ….

This has the advantage that the user doesn't have to bring all the elements of 
your tuple/record into scope. 
And you can have two modules with almost identical content: one uses the 
implicit primes argument and the other uses a global CAF for its primes. A user 
would only have to change their type signatures and strategically add/remove a 
call to withPrimes when switching.
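
To make the CAF issue at the heart of this thread concrete (a toy
trial-division sieve, not the library's actual prime generator): a
top-level list like the one below is a constant applicative form, so
whatever prefix gets evaluated is retained for the life of the program;
the implicit-parameter and record-of-functions proposals above are both
ways of avoiding that retention:

```haskell
-- As a top-level CAF, the evaluated prefix of 'primes' is never
-- garbage collected.
primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

main :: IO ()
main = print (take 10 primes)  -- prints [2,3,5,7,11,13,17,19,23,29]
```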

Cheers,
  Jean





Re: [Haskell-cafe] Monad-control rant

2012-01-31 Thread Mikhail Vorozhtsov

On 01/29/2012 11:55 PM, Edward Z. Yang wrote:

Excerpts from Mikhail Vorozhtsov's message of Sun Jan 29 05:34:17 -0500 2012:

[snip]

I think it is one of the simplest layouts one can come up with. I'll try
to explain the motivation behind each inclusion.

ABORTS(μ) ⊆ RECOVERABLE_ZEROS(μ)


I'm sorry, I cannot understand the discussion below because you haven't
defined precisely what ABORTS means.  (See also below; I think it's
time to write something up.)

ABORTS(μ) = { abort e | e ∷ e }


Why are they not equal? After all, we can always write `recover weird $
\e → abort e`, right? But zeros from `RECOVERABLE_ZEROS \ ABORTS` may
have additional effects. For example, recoverable interruptions could
permanently disable blocking operations (you can close a socket but you
can't read/write from it). Why is the inclusion not the other way
around? Well, I find the possibility of `abort e1` and `abort e2` having
different semantics (vs `recover` or `finally`) terrifying. If you can
throw unrecoverable exceptions, you should have a different function for
that.

RECOVERABLE_ZEROS(μ) ⊆ FINALIZABLE_ZEROS(μ)

If a zero is recoverable, we can always "finalize" it (by
catch-and-rethrow).

FINALIZABLE_ZEROS(μ) ⊆ ZEROS(μ)

This one is pretty obvious. One example of non-finalizable zeros is
bottoms in a non-MonadUnbottom monad (e.g. my X monad). Another would be
`System.Posix.Process.exitImmediately`.


Ugh, don't talk to me about the exit() syscall ;-)


[snip]

Yes, I think for some this is the crux of the issue. Indeed, it is why
monad-control is so appealing: it dangles in front of us the hope that
we do, in fact, only need one API.

But, if some functions in fact do need to be different between the two
cases, there's not much we can do, is there?

Yes, but on the other hand I don't want to reimplement ones that are the
same. I want to have a modular solution precisely because it allows both
sharing and extensibility.


The cardinal sin of object-oriented programming is building abstractions in
deference to code reuse, not the other way around.

Stepping back a moment, I think the most useful step for you is to write up
a description of your system, incorporating the information from this
discussion, and once we have the actual article in hand we can iterate
from there.

I'll probably release an updated (and documented) version of
monad-abort-fd when I have enough time. At the moment I'm just
overloaded with work.




Re: [Haskell-cafe] TCP Server

2012-01-31 Thread Michael Snoyman
On Sat, Jan 28, 2012 at 12:51 PM, Jean-Marie Gaillourdet
 wrote:
> Hello,
>
> On 27.01.2012, at 00:47, Alexander V Vershilov wrote:
>> Recently I asked about tcp server libraries [1] and there was only one
>> answer haskell-scallable-server [2], but in that package there was some
>> dependencies and server logic that are not good for my task.
>
> A simple search for "server" on Hackage turned up the following packages for 
> somewhat generic server infrastructure:
>
> http://hackage.haskell.org/package/iterio-server
> http://hackage.haskell.org/package/generic-server
> http://hackage.haskell.org/package/c10k
> http://hackage.haskell.org/package/network-server
>
> In issue 19 of The Monad Reader is an article discussing the design of the 
> following web server:
> http://hackage.haskell.org/package/mighttpd2
>
> These links might be relevant to your original question.

I just pushed a new version of network-conduit[1] that adds a
light-weight TCP server/client interface. It's very similar to how
Warp is structured (which is the underlying engine for mighttpd2). I
put together a simple example of a server[2] that simply echoes back
whatever you send it, and a client[3] that sends a Fibonacci number
every second. I hope this helps; let me know if you have any questions.

Michael

[1] http://hackage.haskell.org/package/network-conduit . Sorry,
Haddocks haven't generated yet
[2] 
https://github.com/snoyberg/conduit/blob/master/network-conduit/echo-server.hs
[3] https://github.com/snoyberg/conduit/blob/master/network-conduit/fibclient.hs



Re: [Haskell-cafe] Some thoughts on Type-Directed Name Resolution

2012-01-31 Thread AntC
Donn Cave  avvanta.com> writes:

> 
> On 28/01/2012 13:00, Paul R wrote:
> ...
> > All this dot syntax magic frankly frightens me. Haskell, as a pure
> > functional language, requires (and allows!) a programming style that
> > just does not mix well with object-oriented practices.
> 
> In the glasgow-haskell-users discussion, it has been pointed out (to 
> little apparent effect) that the current notation for access by field
> name, `field record', is naturally functional and is easier to read
> for a functionally trained eye than a postfix `record.field' alternative.
> [snip]
>   Donn
> 
Donn, I can see the argument "Haskell has never been afraid to be different. 
Just because OO does it like that, so what?"

But if you read SPJ's discussion in the TDNR proposal, there's "a cultural 
connection to OO". My post at the head of this thread put it as "focus on the 
object -> look for the action".

Of course it's easy to 'fake' postfix function application:
(.$) = flip ($)

But the trouble is that .$ binds weakly. What we want is for the dot to
bind tighter than even function application. So:
 crack egg.largeEnd   ==> crack (largeEnd egg)
(Where ==> means 'is syntactic sugar for'.)
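
A small sketch of the precedence problem (Egg, largeEnd, and crack are
illustrative names only): even if the operator is declared infixl 9,
ordinary function application still binds tighter than any operator, so
the postfix form needs parentheses:

```haskell
infixl 9 .$
(.$) :: a -> (a -> b) -> b
(.$) = flip ($)

data Egg = Egg

largeEnd :: Egg -> String
largeEnd _ = "large end"

crack :: String -> String
crack s = "cracked " ++ s

main :: IO ()
main = putStrLn (crack (Egg .$ largeEnd))  -- prints: cracked large end
-- whereas  crack Egg .$ largeEnd  parses as  (crack Egg) .$ largeEnd,
-- a type error, because application binds tighter than (.$)
```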

We're already familiar with the tight-binding dot for qualified names. I 
suppose we're coping with the visual confusion with space-surrounded dot as 
function composition.

But I can see that "one more petit bonbon" could tip confusion over the edge.

To my eye, one-sided dot application is a bonbon too far.

My proposal is that field selection functions be just ordinary functions, and 
dot notation be just function application (tight-binding). Then:
  object.fieldfuncmethod   ==> fieldfuncmethod object
(Subject to the tight binding for the dot.)
And one-sided dot application is pointless (! errk I mean 'without purpose', 
no different to writing the bare object or bare fieldfuncmethod).

Then you can write in your familiar style, and can use polymorphic field 
selectors as plain functions (same syntax as presently).

Those under the influence of OO can write dot notation, until they discover 
the joys of pointless style.

AntC




Re: [Haskell-cafe] [ANNOUNCE] biostockholm 0.2

2012-01-31 Thread Felipe Almeida Lessa
On Thu, Jan 26, 2012 at 11:42 PM, Felipe Almeida Lessa
 wrote:
>  - Fast enough: the streaming interface achieves 12 MiB/s for parsing,
> which is pretty nice considering that there are some known overheads
> on its implementation.

I've just released biostockholm 0.2.1, which uses conduit 0.2.  Now the
streaming interface achieves 31 MiB/s when parsing Rfam 9.1's full
data on my computer, which is a 2.6x increase in performance!  Kudos
to Michael Snoyman, who squashed the biggest "known overhead" that I
mentioned above, in this new conduit 0.2 release.

Cheers! =)

-- 
Felipe.



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Felipe Almeida Lessa
On Tue, Jan 31, 2012 at 6:05 AM, Marc Weber  wrote:
> I didn't say that I tried your code. I gave the enumerator package a try,
> counting lines, which I expected to behave similarly to conduit
> because both serve a similar purpose.
> Then I hit the "sourceFile returns chunked lines" issue (reported
> it, got fixed) -
>
> Anyway: My log files are a json dictionary on each line:
>
>  { id : "foo", ... }
>  { id : "bar", ... }
>
> Now how do I use the conduit package to split a "chunked" file into lines?
> Or should I create a new parser "many json >> newline" ?

Currently there are two solutions.  The first one is what I wrote
earlier on this thread:

 jsonLines :: C.Resource m => C.Conduit B.ByteString m Value
 jsonLines = C.sequenceSink () $ do
   val <- CA.sinkParser json'
   CB.dropWhile isSpace_w8
   return $ C.Emit () [val]

This conduit will run the json' parser (from aeson) and then drop any
whitespace after that.  Note that it will correctly parse all of your
files but will also parse some files that don't conform to your
specification.  I assume that's fine.



The other solution is going to be released with conduit 0.2, probably
today.  There's a lines conduit that splits the file into lines, so
you could write jsonLines above as:

 mapJson :: C.Resource m => C.Conduit B.ByteString m Value
 mapJson = C.sequenceSink () $ do
   val <- CA.sinkParser json'
   return $ C.Emit () [val]

which doesn't need to care about newlines, and then change main to

 main = do
   ...
   ret <- forM_ fileList $ \fp -> do
 C.runResourceT $
   CB.sourceFile fp C.$=
   CB.lines C.$=  -- new line is here
   mapJson C.$=
   CL.mapM processJson C.$$
   CL.consume
   print ret


I don't know which solution would be faster.  Either way, both
solutions will probably be faster with the new conduit 0.2.


> Except that I think my processJson for this test should look like this,
> because I want to count how often the clients queried the server.
> Probably I should also be using CL.fold, as shown in the test cases of
> conduit. If you tell me how you'd cope with the "one json dict on each
> line" issue, I'll try to benchmark this solution as well.

This issue was already being coped with in my previous e-mail =).

> -- probably existing library functions can be used here ..
> processJson :: (M.Map T.Text Int) -> Value -> (M.Map T.Text Int)
> processJson m value = case value of
>                          Ae.Object hash_map ->
>                            case HMS.lookup (T.pack "id") hash_map of
>                              Just id_o ->
>                                case id_o of
>                                  Ae.String id -> M.insertWith' (+) id 1 m
>                                  _ -> m
>                              _ -> m
>                          _ -> m

Looks like the perfect job for CL.fold.  Just change those three last
lines in main from

  ... C.$=
  CL.mapM processJson C.$$
  CL.consume

into

  ... C.$$
  CL.fold processJson

and you should be ready to go.

Cheers!

-- 
Felipe.



Re: [Haskell-cafe] [Haskell Cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Gábor Lehel
On Tue, Jan 31, 2012 at 2:51 AM, Richard O'Keefe  wrote:
 > On the other hand, a designed-to-be-strict language-and-libraries
 > with close-to-Haskell *syntax* would be nice.  I recently
 > described F# as combining the beauty of Caml with the functional
 > purity of C# -- both of course are like the snakes of Ireland.

It's been mentioned, but: http://disciple.ouroborus.net/

(sorry, forgot reply-all)



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Marc Weber

Using insertWith' gets the time down to 30-40 secs (thus only being 3-4
times slower than PHP).
PHP is still at 13 secs; it does not require installing libraries, does
not require compilation, and is trivial to write.

A trivial C++ application takes 11-12 secs and, even with some googling,
was trivial to write.

Excerpts from Felipe Almeida Lessa's message of Mon Jan 30 17:36:46 +0100 2012:
> Then please take a deeper look into my code.  What you said that
> you've tried is something else.
I didn't say that I tried your code. I gave the enumerator package a try,
counting lines, which I expected to behave similarly to conduit
because both serve a similar purpose.
Then I hit the "sourceFile returns chunked lines" issue (reported
it, got fixed) -

Anyway: My log files are a json dictionary on each line:

  { id : "foo", ... }
  { id : "bar", ... }

Now how do I use the conduit package to split a "chunked" file into lines?
Or should I create a new parser "many json >> newline" ?

Except that I think my processJson for this test should look like this,
because I want to count how often the clients queried the server.
Probably I should also be using CL.fold, as shown in the test cases of
conduit. If you tell me how you'd cope with the "one json dict on each
line" issue, I'll try to benchmark this solution as well.


-- probably existing library functions can be used here ..
processJson :: (M.Map T.Text Int) -> Value -> (M.Map T.Text Int)
processJson m value = case value of
  Ae.Object hash_map ->
case HMS.lookup (T.pack "id") hash_map of
  Just id_o ->
case id_o of
  Ae.String id -> M.insertWith' (+) id 1 m
  _ -> m
  _ -> m
  _ -> m

Marc Weber
