Re: [Haskell-cafe] Haddock GSOC project progress

2013-08-05 Thread S. Doaitse Swierstra
Why not use uu-parsinglib, which will tell you what is wrong and nevertheless 
will continue parsing? 

Currently Jacco Krijnen is working on an extensible version of Pandoc, based on 
the AspectAG and the Murder packages, so you can define your own plugins for 
syntax and semantics.

  Doaitse Swierstra


On Aug 1, 2013, at 1:14 , Richard A. O'Keefe o...@cs.otago.ac.nz wrote:

 
> On 31/07/2013, at 8:16 PM, Simon Hengel wrote:
>
>> * There is no such thing as a parse error in Markdown, and I think we
>>   should try to make this true for Haddock markup, too
>
> It is very far from clear that this is a virtue in Markdown.
> In trying to learn Markdown, I found it an excessively tiresome
> defect.  Whenever I was trying to learn how to produce some
> combination of effects, instead of Markdown telling me
> "at THIS point you had something I wasn't expecting", it would
> just produce incorrect output, defined as "anything other than
> what I intended".  It also meant that two different Markdown
> processors would accept the same text silently but do different
> things with it.
>
> This is one of the reasons I won't use Markdown.
 
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-31 Thread Mateusz Kowalczyk
On 31/07/13 06:37, Roman Cheplyaka wrote:
> Hi Mateusz,
>
> This looks great. I'm especially excited that list entries no longer
> have to be separated by empty lines!
Glad to hear that.


> However, the decision to use Attoparsec (instead of Parsec, say) strikes
> me as a bit odd,
Parsec has a dependency on Data.Text that you can't easily get rid of.
With Attoparsec, I was able to simply drop the modules I was not
interested in (anything with Text) and keep only the ByteString part.

> as it wasn't intended for parsing source code.
We're not parsing source code. As I mention, we get the comment content
out of GHC and parse the markup there.

> In particular, I'm concerned with error messages this parser would produce.
Currently there are only two error messages: one for when module header
parsing fails and another for when parsing of anything else fails. The
parsing functions have the type ‘DynFlags -> String -> Maybe (Doc RdrName)’,
and if we get Nothing back, you get a generic error message and no
guidance. This is the same as the current behaviour.

Now, I agree that this sounds horrible, BUT in actuality there's not
much information we could ever give. This isn't a case of inability to
do so: I could simply add a (<|> fail "error message") to the relevant
parts and it would get propagated up. The reason I said that this isn't
much of a problem is that there are very few cases where parsing can
actually fail: in most cases, if your markup isn't valid semantically,
it's probably still valid syntactically as a regular string. I mention
in my post that we will now accept a somewhat wider range of syntax.
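
As an illustration of how such a failure message gets propagated up, here
is a minimal sketch of a backtracking parser written against base only.
The Parser type and the names failWith, string and moduleHeader are
hypothetical and made up for this example; this is neither Attoparsec's
implementation nor Haddock's parser.

```haskell
import Control.Applicative (Alternative (..))
import Data.List (stripPrefix)

-- Hypothetical toy parser: either an error message, or a result together
-- with the remaining input.
newtype Parser a = Parser { runParser :: String -> Either String (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Left err        -> Left err
    Right (a, rest) -> Right (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Right (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Left err        -> Left err
    Right (f, rest) -> case pa rest of
      Left err         -> Left err
      Right (a, rest') -> Right (f a, rest')

instance Alternative Parser where
  empty = Parser $ \_ -> Left "no parse"
  -- If the left parser fails, the right one restarts from the original input.
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Left _ -> q s
    ok     -> ok

-- Always fail with the given message.
failWith :: String -> Parser a
failWith msg = Parser $ \_ -> Left msg

-- Match a literal prefix of the input.
string :: String -> Parser String
string pre = Parser $ \s -> case stripPrefix pre s of
  Just rest -> Right (pre, rest)
  Nothing   -> Left ("expected " ++ show pre)

-- The specific message replaces the generic one when the header is missing.
moduleHeader :: Parser String
moduleHeader = string "Module" <|> failWith "could not parse module header"

main :: IO ()
main = do
  print (runParser moduleHeader "Module      : Foo")
  print (runParser moduleHeader "no header here")
```

The second call returns the custom message in a Left, which is the kind
of information a caller could show instead of a generic error.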

In the past, this:

  some text
  >>> exampleExpression
  result

would fail and you would get the unhelpful error messages. With the new
parser this will simply be accepted as a regular string. In fact, I
can't think of a comment that would result in a parse error with the
new parser.
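
The difference between the two behaviours can be sketched roughly as
follows, assuming the middle line of the example above was introduced by
Haddock's >>> example marker. The Doc type here is a simplified,
hypothetical stand-in, and parseStrict and parseLenient are invented
names; the real parsers are considerably more involved.

```haskell
import Data.List (isPrefixOf)
import Data.Maybe (fromMaybe)

-- Hypothetical, simplified stand-in for Haddock's Doc type.
data Doc
  = DocString String            -- plain text
  | DocExample String [String]  -- expression and its expected result lines
  deriving (Eq, Show)

isExample :: String -> Bool
isExample = (">>> " `isPrefixOf`)

-- Strict variant: an example marker is only allowed on the first line of
-- a paragraph; anywhere else it is a parse failure.
parseStrict :: [String] -> Maybe Doc
parseStrict [] = Just (DocString "")
parseStrict (l : rest)
  | isExample l        = Just (DocExample (drop 4 l) rest)
  | any isExample rest = Nothing
  | otherwise          = Just (DocString (unlines (l : rest)))

-- Lenient variant: whatever the strict parser rejects is kept as a
-- regular string, so this function never fails.
parseLenient :: [String] -> Doc
parseLenient ls = fromMaybe (DocString (unlines ls)) (parseStrict ls)

main :: IO ()
main = do
  let comment = ["some text", ">>> exampleExpression", "result"]
  print (parseStrict comment)
  print (parseLenient comment)
```

The strict variant rejects the comment outright, while the lenient one
returns it unchanged as plain text, leaving it to the author to notice
that no example was rendered.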

Just to check, I just ran 500 randomly generated strings using
QuickCheck through each of the two parsing functions exposed to the rest
of the program and none of them caused a parse error. It's up to the
developer to visually inspect that their markup produced what they
wanted – we can't read minds (and frankly, the rules are fairly simple).

> Roman
>
> * Mateusz Kowalczyk fuuze...@fuuzetsu.co.uk [2013-07-30 23:35:45+0100]
>> Greetings cafe,
>>
>> As some of you might know, I'm hacking on Haddock as part of Google
>> Summer of Code. I was recently advised to create a blog and document
>> some of what I have been doing recently. You can find the blog at [1] if
>> you're interested. The first post goes over the work from the last month
>> or so. Future posts should be shorter and on more specific topics.
>> There's an overview of what has happened/changed/will change at the
>> bottom of the post if you're short on time.
>>
>> Thanks.
>>
>> [1] - http://fuuzetsu.co.uk/blog


I would also like to take this opportunity to mention one more change
that I forgot about. Obviously invalid strings between double quotes
will no longer be treated as module names and blindly linked. Only the
syntax of the string is checked, so hyperlinks will still be created for
syntactically valid module names that might not actually exist.
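
A purely syntactic check of this kind could look something like the
sketch below. The function names are hypothetical and the rule used
(dot-separated components, each starting with an upper-case letter) is
an approximation for illustration, not necessarily the exact rule
Haddock applies.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Split on '.', keeping empty components so that inputs such as "A..B"
-- or "A." are rejected by the check below.
splitOnDot :: String -> [String]
splitOnDot s = case break (== '.') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitOnDot rest

-- Syntax only: says nothing about whether the module actually exists.
isModuleName :: String -> Bool
isModuleName = all validPart . splitOnDot
  where
    validPart (c : cs) = isUpper c && all isIdentChar cs
    validPart []       = False
    isIdentChar c      = isAlphaNum c || c == '_' || c == '\''

main :: IO ()
main = do
  print (isModuleName "Data.Text")           -- syntactically valid
  print (isModuleName "No.Such.Module")      -- valid syntax, may not exist
  print (isModuleName "just a quoted phrase")
```

The second input shows the limitation described above: it passes the
check and would still be linked, even though no such module may exist.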

--
Mateusz K.




Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-31 Thread Mats Rauhala
Is Data.Text as an extra dependency really that bad? Remember that you
are parsing comments, prose, human produced text, where Data.Text is way
more useful than ByteString.

-- 
Mats Rauhala
MasseR



Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-31 Thread Mateusz Kowalczyk
On 31/07/13 08:21, Mats Rauhala wrote:
> Is Data.Text as an extra dependency really that bad? Remember that you
> are parsing comments, prose, human produced text, where Data.Text is way
> more useful than ByteString.
 
To be usable here, Text would have to come with the GHC boot packages,
and it currently doesn't. I have updated my post to mention this.


ByteString indeed has its problems (I have to be quite careful to make
sure Unicode doesn't get mangled) but that's just how it is at the
moment. If Text ever makes it in, the transition will be trivial. We're
not doing anything fancy to the text we get out anyway, so any
performance difference it might bring is negligible. The only difference
I can think of would be that we would no longer have to worry about
preserving Unicode by hand.
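
For readers who haven't run into it, the kind of mangling in question
can be reproduced with the Char8 interface of the bytestring package,
which truncates every character to 8 bits. This is only a sketch of the
general pitfall; roundTrip and survives are hypothetical names, and it
says nothing about how Haddock itself handles the conversion.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Pack a String into a ByteString via the Char8 API and unpack it again.
roundTrip :: String -> String
roundTrip = BC.unpack . BC.pack

-- True when the text comes back unchanged.
survives :: String -> Bool
survives s = roundTrip s == s

main :: IO ()
main = do
  print (survives "plain ASCII")
  -- '\955' is the Greek letter lambda (U+03BB); only its low byte is kept.
  print (survives "\955-calculus")
```

The first check prints True and the second prints False, because the
lambda comes back as the character with code 187.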

It's an inconvenience, but that's about it. Nothing mission-critical.
-- 
Mateusz K.




Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-31 Thread Simon Hengel
Hi Roman,

> However, the decision to use Attoparsec (instead of Parsec, say)
> strikes me as a bit odd, as it wasn't intended for parsing source
> code. In particular, I'm concerned with error messages this parser
> would produce.

In addition to what Mateusz already said, I want to briefly summarize my
justification for using Attoparsec:

 * Attoparsec's backtracking behavior is much easier to work with than
   Parsec's

 * There is no such thing as a parse error in Markdown, and I think we
   should try to make this true for Haddock markup, too

Cheers,
Simon



Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-31 Thread Richard A. O'Keefe

On 31/07/2013, at 8:16 PM, Simon Hengel wrote:
 
> * There is no such thing as a parse error in Markdown, and I think we
>   should try to make this true for Haddock markup, too

It is very far from clear that this is a virtue in Markdown.
In trying to learn Markdown, I found it an excessively tiresome
defect.  Whenever I was trying to learn how to produce some
combination of effects, instead of Markdown telling me
"at THIS point you had something I wasn't expecting", it would
just produce incorrect output, defined as "anything other than
what I intended".  It also meant that two different Markdown
processors would accept the same text silently but do different
things with it.

This is one of the reasons I won't use Markdown.




[Haskell-cafe] Haddock GSOC project progress

2013-07-30 Thread Mateusz Kowalczyk
Greetings cafe,

As some of you might know, I'm hacking on Haddock as part of Google
Summer of Code. I was recently advised to create a blog and document
some of what I have been doing recently. You can find the blog at [1] if
you're interested. The first post goes over the work from the last month
or so. Future posts should be shorter and on more specific topics.
There's an overview of what has happened/changed/will change at the
bottom of the post if you're short on time.

Thanks.

[1] - http://fuuzetsu.co.uk/blog

-- 
Mateusz K.




Re: [Haskell-cafe] Haddock GSOC project progress

2013-07-30 Thread Roman Cheplyaka
Hi Mateusz,

This looks great. I'm especially excited that list entries no longer
have to be separated by empty lines!

However, the decision to use Attoparsec (instead of Parsec, say) strikes
me as a bit odd, as it wasn't intended for parsing source code. In
particular, I'm concerned with error messages this parser would produce.

Roman

* Mateusz Kowalczyk fuuze...@fuuzetsu.co.uk [2013-07-30 23:35:45+0100]
> Greetings cafe,
>
> As some of you might know, I'm hacking on Haddock as part of Google
> Summer of Code. I was recently advised to create a blog and document
> some of what I have been doing recently. You can find the blog at [1] if
> you're interested. The first post goes over the work from the last month
> or so. Future posts should be shorter and on more specific topics.
> There's an overview of what has happened/changed/will change at the
> bottom of the post if you're short on time.
>
> Thanks.
>
> [1] - http://fuuzetsu.co.uk/blog
