Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Don Stewart
ketil:
> 
> Hi,
> 
> I'm currently working on a program that parses a large binary file and
> produces various textual outputs extracted from it.  Simple enough.
> 
> But: since we're talking large amounts of data, I'd like to have
> reasonable performance.  
> 
> Reading the binary file is very efficient thanks to Data.Binary.
> However, output is a different matter.  Currently, my code looks
> something like:
> 
>   summarize :: Foo -> ByteString
>   summarize f = let f1 = accessor f
>                     f2 = expression f
>                     :
>                 in B.concat [f1,pack "\t",pack (show f2),...]
> 
> which isn't particularly elegant, and builds a temporary ByteString
> that usually only gets passed to B.putStrLn.  I can suffer the
> inelegance were it only fast - but this ends up taking the better part
> of the execution time.

Why not use Data.Binary for output too? It is rather efficient at
output -- using a continuation-like system to fill buffers gradually.

--   Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Ketil Malde
Duncan Coutts writes:

> Have you considered using Data.Binary to output the data too? It has a
> pretty efficient underlying monoid for accumulating output data in a
> buffer. You'd want some wrapper functions over the top to make it a bit
> nicer for your use case, but it should work and should be quick.

I've used Data.Binary.Builder to generate the output, which is quite
nice as an interface.  Currently, I've managed to shave a few percent
off the time - nothing radical yet, but there's a lot of room for
tuning various convenience functions in there.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Johan Tibell
On Mon, Feb 9, 2009 at 1:22 PM, Ketil Malde wrote:
> Johan Tibell writes:
>> If so, you might want to use `writev` to avoid extra copying.
>
> Is there a Haskell binding somewhere, or do I need to FFI the system
> call?  Googling 'writev haskell' didn't turn up anything useful.

To my knowledge there's no binding out there. We will include one for
sockets in the next release of network-bytestring. You might find the
code here useful if you want to write your own:

http://github.com/tibbe/network-bytestring/blob/c13d8fab5179e6afbcdebac95d4993ac57f04689/Network/Socket/ByteString/Internal.hs

Cheers,

Johan


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Ketil Malde
Bulat Ziganshin writes:

>> in B.concat [f1,pack "\t",pack (show f2),...]

> I'm not a ByteString expert, but it seems that you produce Strings
> using show and then convert them to ByteStrings. Of course this is
> inefficient - you need to replace show with a ByteString analog.

Do these analogous functions exist, or must I roll my own?

I've also looked a bit at Data.Binary.Builder, perhaps this is the way
to go?  Will look more closely.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Duncan Coutts
On Mon, 2009-02-09 at 12:49 +0100, Ketil Malde wrote:
> Hi,
> 
> I'm currently working on a program that parses a large binary file and
> produces various textual outputs extracted from it.  Simple enough.
> 
> But: since we're talking large amounts of data, I'd like to have
> reasonable performance.  
> 
> Reading the binary file is very efficient thanks to Data.Binary.
> However, output is a different matter.  Currently, my code looks
> something like:

Have you considered using Data.Binary to output the data too? It has a
pretty efficient underlying monoid for accumulating output data in a
buffer. You'd want some wrapper functions over the top to make it a bit
nicer for your use case, but it should work and should be quick.

It generates a lazy bytestring, but does so with a few large chunks so
the IO will still be quick.
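
A minimal sketch of what such wrapper functions might look like, assuming
the Data.Binary.Builder module from the binary package. The Foo record,
its field names, and the helpers bytes, shown, tab and newline are all
hypothetical stand-ins for the code in the original post, not part of any
library.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Binary.Builder (Builder, fromByteString, singleton, toLazyByteString)

-- Hypothetical stand-in for the Foo type in the original post.
data Foo = Foo { fooName :: B.ByteString, fooCount :: Int }

-- Thin wrappers so that call sites read like text output.
bytes :: B.ByteString -> Builder
bytes = fromByteString

tab, newline :: Builder
tab     = singleton 9   -- ASCII '\t'
newline = singleton 10  -- ASCII '\n'

-- Still goes through show; only the accumulation strategy changes here.
shown :: Show a => a -> Builder
shown = fromByteString . B.pack . show

-- One output line per record, built by appending in the Builder monoid.
summarize :: Foo -> Builder
summarize f = mconcat [bytes (fooName f), tab, shown (fooCount f), newline]

-- The whole report as a single lazy ByteString.
report :: [Foo] -> L.ByteString
report = toLazyByteString . mconcat . map summarize
```

The result of report can be handed straight to L.putStr, so no strict
ByteString is built per record.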

Duncan



Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Eugene Kirpichov
+1; it's obviously the packing that causes the slowness.
Memoize the `pack "\t"` etc. stuff, and write ByteString replacements
for show for your data.
I guess you can use the Put monad instead of B.concat for that, by the way.
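
A rough sketch of the Put variant, assuming Data.Binary.Put from the
binary package. The Foo record and the names putFoo and render are
hypothetical; the separators are top-level constants so that each is
packed once and shared.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Binary.Put (Put, putByteString, runPut)

-- Hypothetical stand-in for the Foo type in the original post.
data Foo = Foo { fooName :: B.ByteString, fooCount :: Int }

-- Top-level constants: packed once, then shared by every call.
tab, newline :: B.ByteString
tab     = B.pack "\t"
newline = B.pack "\n"

-- Write one record as a tab-separated line.
putFoo :: Foo -> Put
putFoo f = do
  putByteString (fooName f)
  putByteString tab
  putByteString (B.pack (show (fooCount f)))  -- show is still used here
  putByteString newline

-- Run the Put action for every record into one lazy ByteString.
render :: [Foo] -> L.ByteString
render = runPut . mapM_ putFoo
```

This only replaces B.concat; the show call would still need a ByteString
replacement to remove the String round trip.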

2009/2/9 Bulat Ziganshin:
> Hello Ketil,
>
> Monday, February 9, 2009, 2:49:05 PM, you wrote:
>
>> in B.concat [f1,pack "\t",pack (show f2),...]
>
>> inelegance were it only fast - but this ends up taking the better part
>> of the execution time.
>
> I'm not a ByteString expert, but it seems that you produce Strings
> using show and then convert them to ByteStrings. Of course this is
> inefficient - you need to replace show with a ByteString analog.
>
> --
> Best regards,
>  Bulat    mailto:bulat.zigans...@gmail.com


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Ketil Malde
Johan Tibell writes:

> Is building the strict ByteString what takes the most time? 

Yes.

> If so, you might want to use `writev` to avoid extra copying. 

Is there a Haskell binding somewhere, or do I need to FFI the system
call?  Googling 'writev haskell' didn't turn up anything useful.

> Does your data support incremental processing so that you could
> produce output before all input has been parsed?

Typically, yes.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Bulat Ziganshin
Hello Ketil,

Monday, February 9, 2009, 2:49:05 PM, you wrote:

> in B.concat [f1,pack "\t",pack (show f2),...]

> inelegance were it only fast - but this ends up taking the better part
> of the execution time.

I'm not a ByteString expert, but it seems that you produce Strings
using show and then convert them to ByteStrings. Of course this is
inefficient - you need to replace show with a ByteString analog.
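
One possible shape for such an analog, as a sketch only: it assumes a
recent bytestring (0.10 or later), whose Data.ByteString.Builder module
provides intDec for rendering an Int as decimal digits without an
intermediate String. That module postdates this thread. The Foo record
and its fields are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L
import Data.ByteString.Builder (Builder, byteString, char7, intDec, toLazyByteString)

-- Hypothetical stand-in for the Foo type in the original post.
data Foo = Foo { fooName :: B.ByteString, fooCount :: Int }

-- intDec takes the place of (pack . show) for the Int field.
summarize :: Foo -> Builder
summarize f = mconcat
  [ byteString (fooName f)
  , char7 '\t'
  , intDec (fooCount f)
  , char7 '\n'
  ]

-- Render a single record to a lazy ByteString.
renderFoo :: Foo -> L.ByteString
renderFoo = toLazyByteString . summarize
```

Other field types would need their own renderers; intDec only covers Int.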

-- 
Best regards,
 Bulat    mailto:bulat.zigans...@gmail.com



Re: [Haskell-cafe] Efficient string output

2009-02-09 Thread Johan Tibell
On Mon, Feb 9, 2009 at 12:49 PM, Ketil Malde wrote:
> Reading the binary file is very efficient thanks to Data.Binary.
> However, output is a different matter.  Currently, my code looks
> something like:
>
>  summarize :: Foo -> ByteString
>  summarize f = let f1 = accessor f
>                    f2 = expression f
>                    :
>                in B.concat [f1,pack "\t",pack (show f2),...]
>
> which isn't particularly elegant, and builds a temporary ByteString
> that usually only gets passed to B.putStrLn.  I can suffer the
> inelegance were it only fast - but this ends up taking the better part
> of the execution time.

Is building the strict ByteString what takes the most time? If so, you
might want to use `writev` to avoid extra copying. Does your data
support incremental processing so that you could produce output before
all input has been parsed?
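
Short of a real `writev` binding, a simpler way to skip the temporary
strict ByteString is to write each piece to a buffered Handle directly.
The sketch below is not `writev`: every piece is still copied into the
Handle's buffer, but no concatenated ByteString is allocated per record.
The Foo record and the names pieces, emit and emitAll are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B
import System.IO (Handle)

-- Hypothetical stand-in for the Foo type in the original post.
data Foo = Foo { fooName :: B.ByteString, fooCount :: Int }

-- The fragments of one output line, left unconcatenated.
pieces :: Foo -> [B.ByteString]
pieces f = [fooName f, B.pack "\t", B.pack (show (fooCount f)), B.pack "\n"]

-- Write the fragments one by one; the Handle's buffer does the batching.
emit :: Handle -> Foo -> IO ()
emit h = mapM_ (B.hPut h) . pieces

-- Records are written in order as the list is consumed, so output can
-- begin before all input is parsed if the list is produced lazily.
emitAll :: Handle -> [Foo] -> IO ()
emitAll h = mapM_ (emit h)
```

A call such as emitAll stdout foos (with stdout from System.IO) would
stream the records to standard output.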

Cheers,

Johan