Andrey, Thanks for your attempt, but it doesn't seem to work. The easy part is the headline, but the content makes trouble.
Let me write the code a bit more explicit, so you can copy and paste it: ------------------------------------------ {-# LANGUAGE FlexibleContexts #-} module Main where import Text.Parsec data Top = Top String deriving (Show) data Content = Content String deriving (Show) data Section = Section Top Content deriving (Show) headline :: Stream s m Char => ParsecT s u m Top headline = manyTill anyChar (char ':' >> newline) >>= return . Top content :: Stream s m Char => ParsecT s u m Content content = manyTill anyChar (try headline) >>= return . Content section :: Stream s m Char => ParsecT s u m Section section = do {h <- headline; c <- content; return (Section h c)} ------------------------------------------ Assume the following example text is stored in "/tmp/test.txt": --------------------------- top 1: some text ... bla top 2: more text ... bla bla --------------------------- Now I run the section parser in ghci against the above mentioned example text stored in "/tmp/test.txt": *Main> parseFromFile section "/tmp/test.txt" Right (Section (Top "top 1") (Content "")) I don't understand the behaviour of the content parser here. Why does it return ""? Or perhaps more generally, I don't understand the manyTill combinator (though I read the docs). Side remark: of cause for this little task it is probably to much effort to use parsec. However, my content in fact has an internal structure which I would like to parse further, but I deliberately abstracted from these internals as they don't effect my above stated problem. Immanuel 2013/3/3 Andrey Chudnov <achud...@gmail.com> > Immanuel, > Since a heading always starts with a new line (and ends with a colon > followed by a carriage return or just a colon?), I think it might be useful > to first separate the input into lines and then classify them depending on > whether it's a heading or not and reassemble them into the value you need. > You don't even need parsec for that. > > However, if you really want to use parsec, you can write something like > (warning, not tested): > many $ liftM2 Section headline content > where headline = anyChar `manyTill` (char ':' >> spaces >> newline) > content = anyChar `manyTill` (try $ newline >> headline) > > /Andrey > > > On 3/3/2013 10:44 AM, Immanuel Normann wrote: > >> I am trying to parse a semi structured text with parsec that basically >> should identify sections. Each section starts with a headline and has an >> unstructured content - that's all. For instance, consider the following >> example text (inside the dashed lines): >> >> --------------------------- >> >> top 1: >> >> some text ... bla >> >> top 2: >> >> more text ... bla bla >> >> >> --------------------------- >> >> This should be parsed into a structure like this: >> >> [Section (Top 1) (Content "some text ... bla"), Section (Top 1) (Content >> "more text ... bla")] >> >> Say, I have a parser "headline", but the content after a headline could >> be anything that is different from what "headline" parses. >> How could the "section" parser making use of "headline" look like? >> My idea would be to use the "manyTill" combinator, but I don"t find an >> easy solution. >> >
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe