Re: [Haskell-cafe] Reading integers

2006-09-08 Thread Bertram Felgenhauer
Hi,

I've commited some small improvements to the readInteger code, which
speed up handling negative numbers and also the compatability wrappers
very slightly.

My last numbers comparing the reads wrappers to the basic readInteger
code turned out to be unfair; apparently ghc managed to share some
of the calculations of these functions somehow. I've changed the
benchmark code to wrap the new functions into NOINLINE helpers, so
that source of errors is eliminated.

Now the reads overhead looks much less scary then before, on the order
of 10% for small integers.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Reading integers

2006-09-07 Thread Bulat Ziganshin
Hello Neil,

Thursday, September 7, 2006, 1:45:03 AM, you wrote:

 Currently I have a single module that provides reading operations
 for Integers and Ints. I'm not quite sure what to do with it.

 Get it into base!  Where it is, or what its called is less relevant -
 perhaps entirely decoupled from the Read class, and I wouldn't have
 thought reading octal integers etc was useful - just simple bog
 standard integers. I'd really like just readInt/readInteger in
 Numeric, or Prelude, or possibly even a new Read style module.

i'd just defined the following function, mainly because standard
'read' machinery occupies whole 50 kb in exe-file

readI = foldl f 0
  where f m c | isDigit c  =  fromIntegral (ord c - ord '0') + (m * 10)
  | otherwise  =  error (Non-digit ++[c]++ in readI)



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell-cafe] Reading integers

2006-09-07 Thread Lennart Augustsson

What about negative numbers?

Also, don't use (ord c - ord '0'), it doesn't work with Unicode digits.
That's why there is a digitToInt function.

-- Lennart

On Sep 7, 2006, at 02:12 , Bulat Ziganshin wrote:


readI = foldl f 0
  where f m c | isDigit c  =  fromIntegral (ord c - ord '0') + (m *  
10)

  | otherwise  =  error (Non-digit ++[c]++ in readI)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell-cafe] Reading integers

2006-09-07 Thread Bertram Felgenhauer
Lennart Augustsson wrote:
 Also, don't use (ord c - ord '0'), it doesn't work with Unicode digits.
 That's why there is a digitToInt function.

FWIW, in Haskell 98, isDigit and digitToInt are defined as

isDigit c=  c = '0'  c = '9'

digitToInt :: Char - Int
digitToInt c
  | isDigit c=  fromEnum c - fromEnum '0'
  | c = 'a'  c = 'f' =  fromEnum c - fromEnum 'a' + 10
  | c = 'A'  c = 'F' =  fromEnum c - fromEnum 'A' + 10
  | otherwise=  error Char.digitToInt: not a digit

making this a bit of a red herring. I can't comment on why Bulat doesn't
like negative numbers.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[4]: [Haskell-cafe] Reading integers

2006-09-07 Thread Bulat Ziganshin
Hello Bertram,

Thursday, September 7, 2006, 6:01:17 PM, you wrote:

 Also, don't use (ord c - ord '0'), it doesn't work with Unicode digits.
 That's why there is a digitToInt function.

 making this a bit of a red herring. I can't comment on why Bulat doesn't
 like negative numbers.

because my program can't split files into chunks of -10 files or -1 mbytes ;)


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Reading integers

2006-09-07 Thread Bertram Felgenhauer
Hi,

I've implemented replacements for readsPrec for Int and Integer in
a module called Compat (in my darcs repository [1]) and measured
their performance.

There's some overhead when compared to the plain versions.
For Integer, it's about 30% for single digit numbers, going down
to about 10% at 10 digits, and about 1.3% at 100.

It's still a big win over the existing instances at a small loss
in compatibility to Haskell 98. As far as I can make out, the only
difference is that it fails to diverge in some cases where reads
would diverge (see my initial mail for an example.)


I've also received a question from Ketil Malde off list, asking
(in reply to Neil Mitchell) what the advantages would be of having
this in base.

Personally I think it should go into base if we can agree on replacing
existing code (the Read instances or something in Numeric or maybe both).
This would have the advantage of making quite a few programs more
efficient just by recompiling them.

If we can't agree on that, the code will be made available as an extra
module but I wouldn't see any advantage at having it in base. Having
it distributed with ghc would have some advantages (mainly convenience
for users and a slightly better chance of avoiding forks).


In other news, darcs send will now send the patches to me - in case
anyone has any.

regards,

Bertram

[1] I'll say again that browsing directly doesn't work, darcs is fine
though, get the code with
darcs get http://int-e.home.tlink.de/haskell/readinteger
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell-cafe] Reading integers

2006-09-07 Thread Brian Hulley

Bertram Felgenhauer wrote:

I can't comment on why Bulat doesn't like negative numbers.


Neither it seems, did the original Haskell committee - that's why we have to 
muddle along with a confusing unary minus operator instead of proper 
negative literals - see the thread beginning 
http://www.haskell.org/pipermail/haskell-cafe/2006-August/017410.html for 
the latest exciting attempt to get rid of unary minus... ;-)


Regards, Brian.
--
Man simply could not have the right interest in beauty
 if in his life of soul he were not entangled in a world
of hideously ugly spider-like beings. - Rudolf Steiner 16/12/1922

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Reading integers

2006-09-06 Thread Bertram Felgenhauer
Hi,

prompted by a discussion about http://www.spoj.pl/problems/MUL/ on
#haskell I implemented some faster routines for reading integers,
and managed to produce a program that passes this problem.

Currently I have a single module that provides reading operations
for Integers and Ints. I'm not quite sure what to do with it.

Ideally, read should use faster code for parsing numbers. This is hard
to do though, because read for numbers is specified in terms of lex,
which handles arbitrary tokens. One nasty side effect of this, besides
bad performance, is that for example

   read ('':reapeat 'a') :: Int

diverges, even though there's obviously no possible parse. Incorporating
faster integer reading code into read without breaking backward
compatibility is not trivial.

The next obvious place to put the functions is the Numeric module.
This sounds like a good idea until one realizes that readSigned (which
is the obvious thing to use for signed integers) again uses lex, with
all the complications above.

A third option is to put it all in a separate module. This would
probably be hard to find, so few people would use it.

I'm pondering breaking compatibility with Haskell 98 to some extent
and implementing  read instances for Int and Integer that don't diverge
on weird inputs, but handle all valid numbers that Haskell 98 supports.

This still requires some work to support octal and hexadecimal numbers,
whitespace and parentheses.

What do you think?


The code (quite unpolished) is available with

  darcs get http://int-e.home.tlink.de/haskell/readinteger

(browsing directly doesn't work)

It also includes a simple benchmark for comparing reads, Numeric.readDec
and my readInteger functions. Results for my computer are included as
well.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Reading integers

2006-09-06 Thread Neil Mitchell

Hi Bertram,


Currently I have a single module that provides reading operations
for Integers and Ints. I'm not quite sure what to do with it.


Get it into base! Where it is, or what its called is less relevant -
perhaps entirely decoupled from the Read class, and I wouldn't have
thought reading octal integers etc was useful - just simple bog
standard integers. I'd really like just readInt/readInteger in
Numeric, or Prelude, or possibly even a new Read style module.


I'm pondering breaking compatibility with Haskell 98 to some extent
and implementing  read instances for Int and Integer that don't diverge
on weird inputs, but handle all valid numbers that Haskell 98 supports.


Thats nice, but a stand alone readInt that is not tied to reads, and
the whole leftover parse thing can be dropped - probably for better
performance. Of course, making an existing read faster is also good.


It also includes a simple benchmark for comparing reads, Numeric.readDec
and my readInteger functions. Results for my computer are included as
well.


Any chance of a very quick summary of the results? i.e. on average x%
faster, but y% faster for integers over size z? Saves everyone darcs's
in. As far as I am concerned, the code is a black box, but the
statistics are something everyone will want.

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Reading integers

2006-09-06 Thread Bertram Felgenhauer
Neil Mitchell wrote:
 Currently I have a single module that provides reading operations
 for Integers and Ints. I'm not quite sure what to do with it.

 Get it into base! Where it is, or what its called is less relevant -

Ok, one vote for base.

 I'm pondering breaking compatibility with Haskell 98 to some extent
 and implementing  read instances for Int and Integer that don't diverge
 on weird inputs, but handle all valid numbers that Haskell 98 supports.
 
 Thats nice, but a stand alone readInt that is not tied to reads, and
 the whole leftover parse thing can be dropped - probably for better
 performance.

Performance won't be much better I think. A first attempt at this was
actually slower (I have not yet investigated why), and the functions are
more useful with leftover parsing, in my opinion (in the MUL problem it
saved a call to words - which bought a little bit of extra performance).

 Of course, making an existing read faster is also good.

I think it's very desirable because that's what people use most often.

 Any chance of a very quick summary of the results? i.e. on average x%
 faster, but y% faster for integers over size z? Saves everyone darcs's
 in. As far as I am concerned, the code is a black box, but the
 statistics are something everyone will want.

For small integers (absolute value less than 1000), Numeric.readDec
is 5 to 10 times faster than reads and my readInteger is 5 to 8
times faster than readDec.

For medium integers (about 100 digits), readDec and reads are about
the same speed while readInteger is about 10 times faster.

For larger integers the speedup becomes larger, too. At 100k digits,
it reaches 100 (timed using ghci - but hopefully the number is correct
anyway.)

All measurements were done with ghc 6.5, compiling with -O.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe