Re: [Haskell-cafe] How to split this string.

2012-01-06 Thread Steve Horne

On 06/01/2012 10:39, Jon Fairbairn wrote:
groupBy is currently implemented using span. It strikes me that we 
ought to specify some properties for what we want. Start by defining: 
pairwiseInOrderBy p l = all (uncurry p) (l `zip` drop 1 l) giving all 
(pairwiseInOrderBy p) (groupCut p l) and we would want concat 
(groupCut p l) == l (all modulo nontermination side conditions). 
Anything else? 
To be honest, I've worked out what's going on in this case and I have an 
implementation or two of what I'd want in case I need it, plus I've 
posted it in case it was useful to the OP. There's nothing I really want 
to persue any further.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-06 Thread Jon Fairbairn
Steve Horne  writes:

> On 05/01/2012 11:09, Brandon Allbery wrote:
>> On Thu, Jan 5, 2012 at 05:57, Steve Horne
>> > > wrote:
>>
>>  --  groupCut - Similar to groupBy, but where groupBy assumes an
>> equivalence relation,
>>  --  groupCut takes a function that indicates where to cut. The
>> two parameters to this
>>  --  function are always adjacent items from the list, and if the
>> function returns True,
>>  --  a cut is done between the two items.
>>
>> span/break?

> Using those, the test function won't always be passed two
> *adjacent* elements from the list. After all, they're based
> on takeWhile and dropWhile, which take unary functions,
> meaning an element has already been curried in (the starting
> element of the group).
>
> That's probably how the current groupBy is implemented - the
> approach that assumes an equivalence relation, giving
> unexpected results when the By function isn't an equivalence
> relation.

groupBy is currently implemented using span.

It strikes me that we ought to specify some properties for what
we want. Start by defining:

   pairwiseInOrderBy p l = all (uncurry p) (l `zip` drop 1 l)

giving

   all (pairwiseInOrderBy p) (groupCut p l)

and we would want

   concat (groupCut p l) == l

(all modulo nontermination side conditions). Anything else?

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Christian Maeder

Am 05.01.2012 13:04, schrieb Steve Horne:
[...]


I was going to accuse you of cheating - who says there's a spare value
to use? - but you seem to be using Maybe, so well played.

You're also using unfoldr, which I really must play with a bit - I don't
really have a feel for how unfolding works ATM.


You may prefer another variant of unfoldr (without Maybe):

unfoldr' :: ([b] -> (a, [b])) -> [b] -> [a]
unfoldr' f l = if null l then [] else
  let (a, r) = f l in a : unfoldr' f r

split' :: (a -> Bool) -> (a -> Bool) -> [a] -> [[a]]
split' e p = unfoldr' $ break' e p

C.

P.S. my break' function fails for "\r\r\n" (as the first char escapes 
the second and the second no longer the third)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Steve Horne

On 05/01/2012 11:55, Christian Maeder wrote:

Am 05.01.2012 11:57, schrieb Steve Horne:
[...]

groupCut :: (x -> x -> Bool) -> [x] -> [[x]]

[...]

How about a break function that respects an escape character (1. arg) 
(and drops the delimiter - 2. arg) and use this function for unfolding?

Interesting.

I was going to accuse you of cheating - who says there's a spare value 
to use? - but you seem to be using Maybe, so well played.


You're also using unfoldr, which I really must play with a bit - I don't 
really have a feel for how unfolding works ATM.


Thanks.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Christian Maeder

Am 05.01.2012 11:57, schrieb Steve Horne:
[...]

groupCut :: (x -> x -> Bool) -> [x] -> [[x]]

[...]

How about a break function that respects an escape character (1. arg) 
(and drops the delimiter - 2. arg) and use this function for unfolding?


import Data.List

break' :: (a -> Bool) -> (a -> Bool) -> [a] -> ([a], [a])
break' e p l = case l of
  [] -> (l, [])
  c : r
| p c -> ([], r)
| e c -> case r of
  [] -> (l, [])
  d : t -> let (f, s) = break' e p t in
(c : d : f, s)
| otherwise -> let (f, s) = break' e p r in
(c : f, s)

split' :: (a -> Bool) -> (a -> Bool) -> [a] -> [[a]]
split' e p = unfoldr $ \ l -> if null l then Nothing else
  Just $ break' e p l

*Main> split' (== '\r') (== '\n') "string1\nstring2\r\nstring3\nstring4"
["string1","string2\r\nstring3","string4"]

C.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Steve Horne

On 05/01/2012 11:09, Brandon Allbery wrote:
On Thu, Jan 5, 2012 at 05:57, Steve Horne > wrote:


 --  groupCut - Similar to groupBy, but where groupBy assumes an
equivalence relation,
 --  groupCut takes a function that indicates where to cut. The
two parameters to this
 --  function are always adjacent items from the list, and if the
function returns True,
 --  a cut is done between the two items.


span/break?


Using those, the test function won't always be passed two *adjacent* 
elements from the list. After all, they're based on takeWhile and 
dropWhile, which take unary functions, meaning an element has already 
been curried in (the starting element of the group).


That's probably how the current groupBy is implemented - the approach 
that assumes an equivalence relation, giving unexpected results when the 
By function isn't an equivalence relation.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Brandon Allbery
On Thu, Jan 5, 2012 at 05:57, Steve Horne wrote:

>  --  groupCut - Similar to groupBy, but where groupBy assumes an
> equivalence relation,
>  --  groupCut takes a function that indicates where to cut. The two
> parameters to this
>  --  function are always adjacent items from the list, and if the function
> returns True,
>  --  a cut is done between the two items.
>

span/break?

-- 
brandon s allbery  allber...@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Steve Horne

On 05/01/2012 10:02, Jon Fairbairn wrote:

Steve Horne  writes:


Personally, I think this is a tad disappointing. Given that
groupBy cannot check or enforce that it's test respects
equivalence classes, it should ideally give results that
make as much sense as possible either way. That said, even
if the test was always given adjacent elements, there's
still room for a different order of processing the list
(left-to-right or right-to-left) to give different results -
and in any case, maybe it's more efficient the way it is.

Looking back at the libraries list, I get the impression that
there was a suggestion to change the behaviour of groupBy, but
it doesn’t seem to have happened.


I've realised that the left-to-right vs. right-to-left order thing makes 
no difference - I don't know why I thought that now.


I've written an implementation, only the predicate is inverse-logic - 
True means cut-between-these rather than keep-these-together.


I keep thinking there should be a tail-recursive implementation, but the 
usual trick would either mean using ++ or difference lists or similar, 
or would deliver the results in reverse order. If anyone can think of a 
way to get the correct result in one pass through the list (assuming 
tail recursion is optimised), I'm curious.


Or... does non-strict evaluation mean I shouldn't worry about it? Maybe 
it does a good job of evaluating the head quickly anyway, as the data 
dependencies are quite localized? I've been wondering how lazy 
evaluation interacts with recursion over lists in performance terms for 
a while.


  --  groupCut - Similar to groupBy, but where groupBy assumes an 
equivalence relation,
  --  groupCut takes a function that indicates where to cut. The two 
parameters to this
  --  function are always adjacent items from the list, and if the 
function returns True,

  --  a cut is done between the two items.

  groupCut :: (x -> x -> Bool) -> [x] -> [[x]]

  groupCut f [] = []
  groupCut f xs = let (y,ys,yss) = groupCut' f xs in  (y:ys):yss

  --  arg1   - cut here test function
  --  arg2   - input list
  --  result - triple of current (head char, head group excl. head 
char, tail groups)

  --
  --  the input list must not be empty - this is handled in the 
front-end function.

  groupCut' :: (x -> x -> Bool) -> [x] -> (x, [x], [[x]])

  groupCut' f (x:[]) = (x, [], [])

  groupCut' f (x:xs) = let (y,ys,yss) = groupCut' f xs
   in  if (f x y) then (x,   [], (y:ys):yss)
  else (x, y:ys,yss)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-05 Thread Jon Fairbairn
Steve Horne  writes:

> On 02/01/2012 11:12, Jon Fairbairn wrote:
>> max  writes:
>>
>>> I want to write a function whose behavior is as follows:
>>>
>>> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
>>> "string2\r\nstring3", "string4"]
>>>
>>> Note the sequence "\r\n", which is ignored. How can I do this?
>> cabal install split
>>
>> then do something like
>>
>> import Data.List (groupBy)
>> import Data.List.Split (splitOn)
>>
>> rn '\r' '\n' = True
>> rn _ _ = False
>>
>> required_function = fmap concat . splitOn ["\n"] . groupBy rn
>>
>> (though that might be an abuse of groupBy)
>>
> Sadly, it turns out that not only is this an abuse of
> groupBy, but it has (I think) a subtle bug as a result.

It does indeed. Thanks. That was pretty much what I feared.

> Explanation (best guess) - the function passed to groupBy,
> according to the docs, is meant to test whether two values
> are 'equal'. I'm guessing the assumption is that the
> function will effectively treat values as belonging to
> equivalence classes. That implies some rules such as...

Right.  This issue has come up from time to time since groupBy
was first written, and someone pops up to justify the present
behaviour, but I can never remember why.

> In the context of this \r\n test function, this behaviour
> will I guess result in \r\n\n being combined into one group.
> The second \n will therefore not be seen as a valid
> splitting point.

Correct. In my defence, I did say “do something like” :-)

> Personally, I think this is a tad disappointing. Given that
> groupBy cannot check or enforce that it's test respects
> equivalence classes, it should ideally give results that
> make as much sense as possible either way. That said, even
> if the test was always given adjacent elements, there's
> still room for a different order of processing the list
> (left-to-right or right-to-left) to give different results -
> and in any case, maybe it's more efficient the way it is.

Looking back at the libraries list, I get the impression that
there was a suggestion to change the behaviour of groupBy, but
it doesn’t seem to have happened.

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2010-09-14)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-04 Thread AUGER Cédric
Le Wed, 04 Jan 2012 17:49:15 +,
Steve Horne  a écrit :

> On 04/01/2012 16:47, Steve Horne wrote:
> >
> >   (a == a)
> >   reflexivity : (a == b) => (b == a)
> >   transitivity : (a == b) && (b == c) => (a == c)
> >
> Oops - that's...
> 
> reflexivity :  (a == a)
> symmetry : (a == b) => (b == a)
> transitivity : (a == b) && (b == c) => (a == c)
> 
> An equivalence relation is a relation that meets all these conditions.
> 
> 

I prefer to use "transymmetry" (although I guess it is not a regular
word):

reflexivity: a ≃ a
transymmetry: ∀ a b. b≃a ⇒ ∀ c. c≃a ⇒ b≃c

so I only have 2 rules.
transymmetry is trivially derived from transitivity and symmetry.
symmetry is trivially derived from reflexivity and transymmetry.
transitivity is trivially derived from symmetry and transymmetry
 (and thus from transymmetry and reflexivity)

> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-04 Thread Christian Maeder

Am 04.01.2012 17:47, schrieb Steve Horne:

On 02/01/2012 11:12, Jon Fairbairn wrote:

max writes:


I want to write a function whose behavior is as follows:

foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
"string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?


Why do you have these (unhealthy) different kinds of line breaks (Unix 
and Windows style) in your string in the first place?


I hope, not by something calling "unlines" (or intercalate "\n") earlier.

Cheers Christian

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-04 Thread Steve Horne

On 04/01/2012 16:47, Steve Horne wrote:


  (a == a)
  reflexivity : (a == b) => (b == a)
  transitivity : (a == b) && (b == c) => (a == c)


Oops - that's...

reflexivity :  (a == a)
symmetry : (a == b) => (b == a)
transitivity : (a == b) && (b == c) => (a == c)

An equivalence relation is a relation that meets all these conditions.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-04 Thread Steve Horne

On 02/01/2012 11:12, Jon Fairbairn wrote:

max  writes:


I want to write a function whose behavior is as follows:

foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
"string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?

cabal install split

then do something like

import Data.List (groupBy)
import Data.List.Split (splitOn)

rn '\r' '\n' = True
rn _ _ = False

required_function = fmap concat . splitOn ["\n"] . groupBy rn

(though that might be an abuse of groupBy)

Sadly, it turns out that not only is this an abuse of groupBy, but it 
has (I think) a subtle bug as a result.


I was inspired by this to try some other groupBy stuff, and it didn't 
work. After scratching my head a bit, I tried the following...


Prelude> import Data.List
Prelude Data.List> groupBy (<) [1,2,3,2,1,2,3,2,1]
[[1,2,3,2],[1,2,3,2],[1]]

That wasn't exactly the result I was expecting :-(

Explanation (best guess) - the function passed to groupBy, according to 
the docs, is meant to test whether two values are 'equal'. I'm guessing 
the assumption is that the function will effectively treat values as 
belonging to equivalence classes. That implies some rules such as...


  (a == a)
  reflexivity : (a == b) => (b == a)
  transitivity : (a == b) && (b == c) => (a == c)

I'm not quite certain I got those names right, and I can't remember the 
name of the first rule at all, sorry.


The third rule is probably to blame here. By the rules, groupBy doesn't 
need to compare adjacent items. When it starts a new group, it seems to 
always use the first item in that new group until it finds a mismatch. 
In my test, that means it's always comparing with 1 - the second 2 is 
included in each group because although (3 < 2) is False, groupBy isn't 
testing that - it's testing (1 < 2).


In the context of this \r\n test function, this behaviour will I guess 
result in \r\n\n being combined into one group. The second \n will 
therefore not be seen as a valid splitting point.



Personally, I think this is a tad disappointing. Given that groupBy 
cannot check or enforce that it's test respects equivalence classes, it 
should ideally give results that make as much sense as possible either 
way. That said, even if the test was always given adjacent elements, 
there's still room for a different order of processing the list 
(left-to-right or right-to-left) to give different results - and in any 
case, maybe it's more efficient the way it is.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-03 Thread Jonathan Frywater
If you're interested in learning parsec, RWH covered this topic in depth in
Chapter 16, Choices and Errors:
http://book.realworldhaskell.org/read/using-parsec.html.

On Mon, Jan 2, 2012 at 3:44 AM, max  wrote:

> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Markus Läll
String is really for small strings. Text is more efficent and also has
more functionality, including most, if not all, of the functions
defined for String.

On Mon, Jan 2, 2012 at 3:12 PM, Anupam Jain  wrote:
> On Mon, Jan 2, 2012 at 5:52 PM, Felipe Almeida Lessa
>  wrote:
>> On Mon, Jan 2, 2012 at 10:12 AM, max  wrote:
>>> This is the simplest solution of the proposed, in my opinion. Thank you
>>> very much.
>>
>> Better yet, don't use String and use Text.  Then you just need
>> T.splitOn "\r\n" [1].
>
> That is actually the opposite of what the OP wants, however it's
> interesting that Text has a function like that and not the String
> functions in the standard
> library.
>
> -- Anupam
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



-- 
Markus Läll

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Anupam Jain
On Mon, Jan 2, 2012 at 5:52 PM, Felipe Almeida Lessa
 wrote:
> On Mon, Jan 2, 2012 at 10:12 AM, max  wrote:
>> This is the simplest solution of the proposed, in my opinion. Thank you
>> very much.
>
> Better yet, don't use String and use Text.  Then you just need
> T.splitOn "\r\n" [1].

That is actually the opposite of what the OP wants, however it's
interesting that Text has a function like that and not the String
functions in the standard
library.

-- Anupam

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Felipe Almeida Lessa
On Mon, Jan 2, 2012 at 10:12 AM, max  wrote:
> This is the simplest solution of the proposed, in my opinion. Thank you
> very much.

Better yet, don't use String and use Text.  Then you just need
T.splitOn "\r\n" [1].

Cheers,

[1] 
http://hackage.haskell.org/packages/archive/text/0.11.1.12/doc/html/Data-Text.html#v:splitOn

-- 
Felipe.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread max
В Mon, 02 Jan 2012 11:12:49 +
Jon Fairbairn  пишет:

> max  writes:
> 
> > I want to write a function whose behavior is as follows:
> >
> > foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> > "string2\r\nstring3", "string4"]
> >
> > Note the sequence "\r\n", which is ignored. How can I do this?
> 
> cabal install split
> 
> then do something like
> 
>import Data.List (groupBy)
>import Data.List.Split (splitOn)
> 
>rn '\r' '\n' = True
>rn _ _ = False
> 
>required_function = fmap concat . splitOn ["\n"] . groupBy rn
> 
> (though that might be an abuse of groupBy)
> 

This is the simplest solution of the proposed, in my opinion. Thank you
very much.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread emacsray
On Mon, Jan 02, 2012 at 12:44:23PM +0300, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

A short yet requiring regex solution:

  > import Text.Regex.PCRE
  > match (makeRegex "(?:[^\r\n]+|\r\n)+" :: Regex) "b\nc\r\n\n\r\n\nd" :: 
[[String]]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Jon Fairbairn
max  writes:

> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

cabal install split

then do something like

   import Data.List (groupBy)
   import Data.List.Split (splitOn)

   rn '\r' '\n' = True
   rn _ _ = False

   required_function = fmap concat . splitOn ["\n"] . groupBy rn

(though that might be an abuse of groupBy)

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Yves Parès
Okay, so it doesn't handle different line-endings.

I have a more general solution (statefulSplit)
http://hpaste.org/55980

I cannot test it as I don't have an interpreter at hand, but if someone
has, I'd be glad to have comments.
(It might be more readable by using the State monad)

2012/1/2 max 

> В Mon, 2 Jan 2012 10:45:18 +0100
> Yves Parès  пишет:
>
> Prelude> lines "string1\nstring2\r\nstring3\nstring4"
> ["string1","string2\r","string3","string4"]
>
> > Doesn't the function "lines" handle different line-endings?
> > (In the Prelude and in Data.List)
> >
> > If not, doing this with parsec would be easy (yet maybe slightly
> > overkill...)
> >
> >
> > 2012/1/2 max 
> >
> > > I want to write a function whose behavior is as follows:
> > >
> > > foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> > > "string2\r\nstring3", "string4"]
> > >
> > > Note the sequence "\r\n", which is ignored. How can I do this?
> > >
> > > ___
> > > Haskell-Cafe mailing list
> > > Haskell-Cafe@haskell.org
> > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > >
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread emacsray
On Mon, Jan 02, 2012 at 12:44:23PM +0300, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

unixLines :: String -> [String]
unixLines xs = reverse . map reverse $ go xs "" []
  where
go [] l ls = l:ls
go ('\r':'\n':xs) l ls = go xs ('\n':'\r':l) ls
go ('\n':xs) l ls = go xs "" (l:ls)
go (x:xs) l ls = go xs (x:l) ls

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Anupam Jain
On Mon, Jan 2, 2012 at 3:14 PM, max  wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

Here's a simple way (may not be the most efficient) -

import Data.List (isSuffixOf)

split = reverse . foldl f [] . lines
  where
f [] w = [w]
f (x:xs) w = if "\r" `isSuffixOf` x then ((x++"\n"++w):xs) else (w:x:xs)

Testing -

ghci> split "ab\r\ncd\nefgh\nhijk"
["ab\r\ncd","efgh","hijk"]


-- Anupam

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Steve Horne

On 02/01/2012 09:44, max wrote:

I want to write a function whose behavior is as follows:

foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
"string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?
Doing it probably the hard way (and getting it wrong) looks like the 
following...


--  Function to accept (normally) a single character. Special-cases
--  \r\n. Refuses to accept \n. Result is either an empty list, or
--  an (accepted, remaining) pair.
parseTok :: String -> [(String, String)]

parseTok "" = []
parseTok (c1:c2:cs) | ((c1 == '\r') && (c2 == '\n')) = [(c1:c2:[], cs)]
parseTok (c:cs) | (c /= '\n')= [(c:[], cs)]
| True   = []

--  Accept a sequence of those (mostly single) characters
parseItem :: String -> [(String, String)]

parseItem "" = [("","")]
parseItem cs = [(j1s ++ j2s, k2s)
 | (j1s,k1s) <- parseTok  cs
 , (j2s,k2s) <- parseItem k1s
   ]

--  Accept a whole list of strings
parseAll :: String -> [([String], String)]

parseAll [] = [([],"")]
parseAll cs = [(j1s:j2s,k2s)
| (j1s,k1s) <- parseItem cs
, (j2s,k2s) <- parseAll  k1s
  ]

--  Get the first valid result, which should have consumed the
--  whole string but this isn't checked. No check for existence either.
parse :: String -> [String]
parse cs = fst (head (parseAll cs))

I got it wrong in that this never consumes the \n between items, so 
it'll all go horribly wrong. There's a good chance there's a typo or two 
as well. The basic idea should be clear, though - maybe I should fix it 
but I've got some other things to do at the moment. Think of the \n as a 
separator, or as a prefix to every "item" but the first. Alternatively, 
treat it as a prefix to *every* item, and artificially add an initial 
one to the string in the top-level parse function. The use tail etc to 
remove that from the first item.


See http://channel9.msdn.com/Tags/haskell - there's a series of 13 
videos by Dr. Erik Meijer. The eighth in the series covers this basic 
technique - it calls them monadic and uses the do notation and that 
confused me slightly at first, it's the *list* type which is monadic in 
this case and (as you can see) I prefer to use list comprehensions 
rather than do notation.


There may be a simpler way, though - there's still a fair bit of Haskell 
and its ecosystem I need to figure out. There's a tool called alex, for 
instance, but I've not used it.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Christian Maeder

Am 02.01.2012 10:44, schrieb max:

I want to write a function whose behavior is as follows:

foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
"string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?


replace the sequence by something unique first, i.e. a single "\r" (and 
revert this change later).


(Replacing a single character is easier using concatMap).

HTH Christian

-- | replace first (non-empty) sublist with second one in third
-- argument list
replace :: Eq a => [a] -> [a] -> [a] -> [a]
replace sl r = case sl of
  [] -> error "replace: empty list"
  _ -> concat . unfoldr (\ l -> case l of
[] -> Nothing
hd : tl -> Just $ case stripPrefix sl l of
  Nothing -> ([hd], tl)
  Just rt -> (r, rt))


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Simon Hengel
> Doesn't the function "lines" handle different line-endings?
> (In the Prelude and in Data.List)
It does not ignore "\r\n".

Cheers,
Simon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread max
В Mon, 2 Jan 2012 10:45:18 +0100
Yves Parès  пишет:

Prelude> lines "string1\nstring2\r\nstring3\nstring4"
["string1","string2\r","string3","string4"]

> Doesn't the function "lines" handle different line-endings?
> (In the Prelude and in Data.List)
> 
> If not, doing this with parsec would be easy (yet maybe slightly
> overkill...)
> 
> 
> 2012/1/2 max 
> 
> > I want to write a function whose behavior is as follows:
> >
> > foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> > "string2\r\nstring3", "string4"]
> >
> > Note the sequence "\r\n", which is ignored. How can I do this?
> >
> > ___
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to split this string.

2012-01-02 Thread Yves Parès
Doesn't the function "lines" handle different line-endings?
(In the Prelude and in Data.List)

If not, doing this with parsec would be easy (yet maybe slightly
overkill...)


2012/1/2 max 

> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] How to split this string.

2012-01-02 Thread max
I want to write a function whose behavior is as follows:

foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
"string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe