Re: [Haskell] Re: state of HaXml?

2007-01-11 Thread Sven Panne
On Thursday, 11 January 2007 at 06:05, Samuel Bronson wrote:
 Yeah, what I mean is that the garbage collector does not *look* for
 unreachable filehandles to close, or get run when many filehandles
 have been allocated. It only runs finalizers when it happens upon
 things with finalizers, it doesn't have any idea what they are for.

What could actually be done for open(2) and similar OS calls is that, in case
of an EMFILE/ENFILE error condition, enough GC is triggered that unused file
handles are closed, and then the open(2) is retried. This does not solve all
problems mentioned, but it is a step in the right direction. Memory should not
be the only resource GC cares about...
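
A minimal sketch of that retry idea at the library level, not in the RTS. The names retryAfterGC, openFileRetry and demo are hypothetical. It assumes that GHC reports EMFILE/ENFILE as a "resource exhausted" IOError (which isFullError recognises), and that the handle finalizers get a chance to run after performGC; since finalizers run asynchronously, a single retry is not guaranteed to succeed.

```haskell
import Control.Exception (catch, throwIO)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO (Handle, IOMode, openFile)
import System.IO.Error (fullErrorType, isFullError, mkIOError)
import System.Mem (performGC)

-- | Hypothetical helper: run an action; if it fails with a
-- "resource exhausted" IOError, trigger a major GC and try once more.
-- Any other IOError is rethrown unchanged.
retryAfterGC :: IO a -> IO a
retryAfterGC act = act `catch` \e ->
  if isFullError e
    then performGC >> act
    else throwIO e

-- | openFile with one GC-and-retry on resource exhaustion.
openFileRetry :: FilePath -> IOMode -> IO Handle
openFileRetry path mode = retryAfterGC (openFile path mode)

-- | Simulated usage: the first attempt fails as if the process had hit
-- its open-file limit, the second attempt succeeds.
demo :: IO String
demo = do
  attempts <- newIORef (0 :: Int)
  retryAfterGC $ do
    n <- readIORef attempts
    writeIORef attempts (n + 1)
    if n == 0
      then ioError (mkIOError fullErrorType "open" Nothing (Just "demo.txt"))
      else return "opened on retry"
```

The retry is deliberately limited to one attempt so that a genuine descriptor leak still surfaces as an error instead of looping.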

Cheers,
   S.
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] Re: state of HaXml?

2007-01-10 Thread Malcolm Wallace
Samuel Bronson wrote:

  Can I just leave it hanging and rely on the garbage collector to
  close it in the fullness of time?
 
 Actually, hGetContents closes the handle when it gets an EOF.
 
 If it never does get EOF (because you never use all of the data), the
 garbage collector *might* close the handle, but I haven't heard of a
 garbage collector that was aware of the value of resources other than RAM

Actually, I'm pretty sure that most Haskell RTS implementations have a
finalizer attached to all file handles.  Once the file handle is no
longer reachable from the program graph (even if its data has not been
fully consumed), the GC will close the file (via the finalizer) before
reaping the memory associated with the handle.

Regards,
Malcolm


Re: [Haskell] Re: state of HaXml?

2007-01-10 Thread Samuel Bronson

On 1/10/07, Malcolm Wallace wrote:

Samuel Bronson wrote:

  Can I just leave it hanging and rely on the garbage collector to
  close it in the fullness of time?

 Actually, hGetContents closes the handle when it gets an EOF.

 If it never does get EOF (because you never use all of the data), the
 garbage collector *might* close the handle, but I haven't heard of a
 garbage collector that was aware of the value of resources other than RAM

Actually, I'm pretty sure that most Haskell RTS implementations have a
finalizer attached to all file handles.  Once the file handle is no
longer reachable from the program graph (even if its data has not been
fully consumed), the GC will close the file (via the finalizer) before
reaping the memory associated with the handle.


Yeah, what I mean is that the garbage collector does not *look* for
unreachable filehandles to close, or get run when many filehandles
have been allocated. It only runs finalizers when it happens upon
things with finalizers, it doesn't have any idea what they are for.


[Haskell-cafe] Re: [Haskell] Re: state of HaXml?

2007-01-10 Thread Taral

On 1/10/07, Malcolm Wallace wrote:

Actually, I'm pretty sure that most Haskell RTS implementations have a
finalizer attached to all file handles.  Once the file handle is no
longer reachable from the program graph (even if its data has not been
fully consumed), the GC will close the file (via the finalizer) before
reaping the memory associated with the handle.


That's not the point. The GC will only close the file when the heap is
under pressure. It does not aggressively close the file, so the file
may stay open for longer than the user likes. For a read-only
operation, this shouldn't matter; however, on some platforms an open
file handle can prevent deletion of the file.
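
When the timing of the close matters, one option is to not involve the GC at all. The following is only a sketch (firstLine is a hypothetical name): it uses bracket from Control.Exception so that the handle is closed as soon as the read finishes, even though the rest of the file is never consumed. It assumes the file has at least one line; hGetLine raises an EOF error otherwise.

```haskell
import Control.Exception (bracket)
import System.IO

-- | Read just the first line of a file. The handle is closed before this
-- action returns, whether the read succeeds or throws.
firstLine :: FilePath -> IO String
firstLine path = bracket (openFile path ReadMode) hClose hGetLine
```

System.IO.withFile packages the same open/close pairing, so `withFile path ReadMode hGetLine` should behave the same way.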

--
Taral
You can't prove anything.
   -- Gödel's Incompetence Theorem
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: [Haskell] Re: state of HaXml?

2007-01-10 Thread Bryan O'Sullivan

Taral wrote:


For a read-only
operation, this shouldn't matter; however, on some platforms an open
file handle can prevent deletion of the file.


You'd be referring to Windows, then, where you can't rename or remove a 
file if someone has opened it.


A partial defence against this is to pass FILE_SHARE_DELETE to 
CreateFile, if your Haskell runtime is using the win32 file API.  The 
semantics are a bit strange, but it's less hostile than the default 
behaviour.


If your favourite runtime is going through stdio, you're stuck (the 
strong-stomached can use CreateFile, then turn the handle into a FILE*, 
but this behaves peculiarly).


b



Re: [Haskell] Re: state of HaXml?

2007-01-09 Thread Samuel Bronson

On 1/4/07, Norman Ramsey wrote:

  There seems to be a misunderstanding here: readFile in itself is not the
  solution.  readFile is defined thus:
 
  readFile name = openFile name ReadMode >>= hGetContents

  and the original code was this:

  load fn = do handle <- IO.openFile fn IO.ReadMode
               contents <- IO.hGetContents handle
               IO.hClose handle
               return $ XP.xmlParse fn contents
 
  Sure, you can replace the openFile/hGetContents pair by readFile, but the
  real problem is the presence of the hClose.  Removing that will solve your
  problem (but note that you now have no control over when the file is
  actually closed).

Can I just leave it hanging and rely on the garbage collector to close
it in the fullness of time?


Actually, hGetContents closes the handle when it gets an EOF.

If it never does get EOF (because you never use all of the data), the
garbage collector *might* close the handle, but I haven't heard of a
garbage collector that was aware of the value of resources other than
RAM (that is, they don't go out of their way to run finalizers and
free up handles to OS resources). Java has the same problem, though
I'm not sure if its file handles *have* finalizers, and Python does
too, except the refcounting in CPython right now hides it.
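
The close-at-EOF behaviour can be observed directly. This is a small sketch (closedAtEOF is a hypothetical name), assuming GHC's implementation of hGetContents, where the handle is "semi-closed" until the lazy read reaches the end of the file and hIsClosed only reports True once it is fully closed.

```haskell
import Control.Exception (evaluate)
import System.IO

-- | Returns whether the handle reports closed before and after the whole
-- of the lazily read contents has been demanded.
closedAtEOF :: FilePath -> IO (Bool, Bool)
closedAtEOF path = do
  h <- openFile path ReadMode
  contents <- hGetContents h        -- nothing has been read yet
  before <- hIsClosed h
  _ <- evaluate (length contents)   -- demand every character, up to EOF
  after <- hIsClosed h
  return (before, after)
```

If only part of the contents is ever demanded, nothing in this code closes the handle, which is the case the GC discussion is about.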


Re: [Haskell] Re: state of HaXml?

2007-01-06 Thread Stefan Karrmann
My 2 cents:

Why does seq not help? See code below.

Simon Marlow (Thu, Jan 04, 2007 at 03:08:45PM +):
 and the original code was this:
 
   load fn = do handle <- IO.openFile fn IO.ReadMode
                contents <- IO.hGetContents handle
                IO.hClose handle
                return $ XP.xmlParse fn contents
 
 Sure, you can replace the openFile/hGetContents pair by readFile, but the 
 real problem is the presence of the hClose.  Removing that will solve your 
 problem (but note that you now have no control over when the file is 
 actually closed).

load fn = do handle <- IO.openFile fn IO.ReadMode
             contents <- IO.hGetContents handle
             let res = XP.xmlParse fn contents
             seq res $ IO.hClose handle -- maybe use deepSeq
             return $ res

load fn = do handle <- IO.openFile fn IO.ReadMode
             contents <- IO.hGetContents handle
             let len = length contents
             seq len $ IO.hClose handle
             return $ XP.xmlParse fn contents

Cheers,
-- 
Stefan



Re: [Haskell] Re: state of HaXml?

2007-01-06 Thread Taral

On 1/4/07, Stefan Karrmann wrote:

My 2 cents:

Why does seq not help? See code below.


The short answer is that seq only forces the head of the value (its
outermost constructor), not the entire value. You need deepSeq for that.


load fn = do handle <- IO.openFile fn IO.ReadMode
             contents <- IO.hGetContents handle
             let len = length contents
             seq len $ IO.hClose handle
             return $ XP.xmlParse fn contents


This works, because to get the head of len (an integer) you need the
whole of contents.
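
The difference can be seen without any file I/O. In this sketch (partial, headOnly and wholeSpine are hypothetical names) the list has a first cell but its tail is an error: seq on the list stops at the first constructor, while length has to walk the whole spine and so hits the error.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- | A list whose first cell exists but whose tail blows up when demanded.
partial :: [Int]
partial = 1 : error "tail demanded"

-- | seq only evaluates the outermost (:) cell, so the tail is never touched.
headOnly :: IO (Either ErrorCall String)
headOnly = try (evaluate (partial `seq` "seq did not touch the tail"))

-- | length must traverse every cell, so it demands the erroneous tail.
wholeSpine :: IO (Either ErrorCall Int)
wholeSpine = try (evaluate (length partial))
```

With lazily read file contents in place of partial, the same reasoning explains why forcing the length reads the file to the end while forcing only the head of the parse result may not.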

--
Taral
You can't prove anything.
   -- Gödel's Incompetence Theorem


[Haskell-cafe] RE: [Haskell] Re: state of HaXml?

2007-01-05 Thread Simon Marlow
[ moving to haskell-café... ]

Norman Ramsey wrote:
   There seems to be a misunderstanding here: readFile in itself is not the
   solution.  readFile is defined thus:

   readFile name = openFile name ReadMode >>= hGetContents

   and the original code was this:

   load fn = do handle <- IO.openFile fn IO.ReadMode
                contents <- IO.hGetContents handle
                IO.hClose handle
                return $ XP.xmlParse fn contents

   Sure, you can replace the openFile/hGetContents pair by readFile, but the
   real problem is the presence of the hClose.  Removing that will solve your
   problem (but note that you now have no control over when the file is
   actually closed).

 Can I just leave it hanging and rely on the garbage collector to
 close it in the fullness of time?

Yes.  The problem I was alluding to arises when you have many lazily closed
files, and you run into the system's open file limit because the runtime
doesn't close them eagerly enough.  To be sure of closing the file at the right
time, you need to force the entire file to be read (e.g. by forcing the result
of the parse), then close the handle.
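
One way to arrange that, sketched under assumptions: loadWith is a hypothetical helper, and its parse argument stands in for something like XP.xmlParse fn. Instead of forcing the parse result, it forces the length of the contents, which reads the file to the end while the handle is still open; withFile then closes the handle on the way out.

```haskell
import Control.Exception (evaluate)
import System.IO

-- | Read the whole file while the handle is open, close it, and only then
-- hand the (fully in-memory) contents to the parser.
loadWith :: (String -> a) -> FilePath -> IO a
loadWith parse fn =
  withFile fn ReadMode $ \h -> do
    contents <- hGetContents h
    _ <- evaluate (length contents)  -- forces the read up to EOF
    return (parse contents)          -- parse itself may still run lazily
```

The cost is that the entire file is held in memory as a String before parsing starts, so the streaming benefit of the lazy read is given up in exchange for a predictable close.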

 Because of laziness, I believe there's no point in my writing the
 following:

  load fn = do handle <- IO.openFile fn IO.ReadMode
               contents <- IO.hGetContents handle
               let xml = XP.xmlParse fn contents
               IO.hClose handle
               return xml

 Is that correct?

Yes.

Cheers,
Simon


Re: [Haskell] Re: state of HaXml?

2007-01-04 Thread Norman Ramsey
  There seems to be a misunderstanding here: readFile in itself is not the 
  solution.  readFile is defined thus:
  
  readFile name = openFile name ReadMode >>= hGetContents

  and the original code was this:

  load fn = do handle <- IO.openFile fn IO.ReadMode
               contents <- IO.hGetContents handle
               IO.hClose handle
               return $ XP.xmlParse fn contents
  
  Sure, you can replace the openFile/hGetContents pair by readFile, but the
  real problem is the presence of the hClose.  Removing that will solve your
  problem (but note that you now have no control over when the file is
  actually closed).

Can I just leave it hanging and rely on the garbage collector to close
it in the fullness of time?

Because of laziness, I believe there's no point in my writing the
following:

 load fn = do handle <- IO.openFile fn IO.ReadMode
              contents <- IO.hGetContents handle
              let xml = XP.xmlParse fn contents
              IO.hClose handle
              return xml

Is that correct?


Norman


[Haskell-cafe] Re: [Haskell] Re: state of HaXml?

2007-01-04 Thread Donald Bruce Stewart
nr:
   Sure, you can replace the openFile/hGetContents pair by readFile, but the
   real problem is the presence of the hClose.  Removing that will solve your
   problem (but note that you now have no control over when the file is
   actually closed).
 
 Can I just leave it hanging and rely on the garbage collector to close
 it in the fullness of time?

Yeah, once your program has demanded the entire file, it'll close the
Handle.
  
 Because of laziness, I believe there's no point in my writing the
 following:
 
  load fn = do handle <- IO.openFile fn IO.ReadMode
               contents <- IO.hGetContents handle
               let xml = XP.xmlParse fn contents
               IO.hClose handle
               return xml
 
 Is that correct?

Yep.  It's not necessary in the usual programming cases to explicitly
close the handle.

Only if you start really hammering the filesystem do you start to care about
ensuring files are closed (so you don't hang on to too many FDs). Or if
you start mutating files on disk. For these situations there are strict
readFiles, or Data.ByteString.readFile.
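
Two sketches of such a strict read, with hypothetical names readFileStrict and readFileBS. The first stays within the Prelude and forces the lazy String before returning it. The second goes through Data.ByteString.Char8, which reads the file in one go; note that Char8 treats each byte as one character, so it is only an assumption-laden stand-in for files that are not plain 8-bit text.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Prelude-only strict read: demand the whole String so that the lazy
-- read reaches EOF (and the handle is closed) before we return.
readFileStrict :: FilePath -> IO String
readFileStrict path = do
  s <- readFile path
  length s `seq` return s

-- | Strict read via ByteString; the file is read and closed immediately.
readFileBS :: FilePath -> IO String
readFileBS path = fmap B.unpack (B.readFile path)
```

Either version makes it safe to write back to the same file straight after reading it, which is the "mutating files on disk" case.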

-- Don


Re: [Haskell] Re: state of HaXml?

2006-12-30 Thread Lennart Kolmodin

Wagner Ferenc wrote:
 Norman Ramsey writes:
 
   load :: String -> IO X.Document
   load fn = do handle <- IO.openFile fn IO.ReadMode
                contents <- IO.hGetContents handle
                IO.hClose handle
                return $ XP.xmlParse fn contents
 
 Try not closing the handle before parsing.

I think there's a little more to it than that.

The parsing is performed lazily, and therefore it's hard to close the
handle manually at the right time, even if HaXml parsed all of the file
at once (unless you force the evaluation).

The simplest thing is to use readFile (from the Prelude) instead of
using handles. readFile will take care of everything for you when the
time is right.

load fn = do
  contents <- readFile fn
  return $ XP.xmlParse fn contents

That's it!

Cheers,
  Lennart Kolmodin


Re: [Haskell] Re: state of HaXml?

2006-12-30 Thread Norman Ramsey
  The simplest thing is to use readFile (from the Prelude) instead of
  using handles. readFile will take care of everything for you when the
  time is right.

Thanks---I'll try it.  Somehow my hoogle query missed readFile...
undoubtedly because I asked for 'String -> IO String' instead
of 'FilePath -> IO String'.  Dunno if this is a bug or a feature,
since as far as the compiler is concerned, FilePath and String are the
same type...
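
A tiny illustration of that last point, using hypothetical names describe, path and ok: because FilePath is declared in the Prelude as a type synonym for String, a function written against String accepts a FilePath with no conversion.

```haskell
-- | A function that knows nothing about FilePath.
describe :: String -> String
describe s = "arg: " ++ s

-- | A value declared with the synonym.
path :: FilePath
path = "doc.xml"

-- | Type-checks because FilePath and String are the same type.
ok :: String
ok = describe path
```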


Norman


Re: [Haskell] Re: state of HaXml?

2006-12-30 Thread Neil Mitchell

Hi


of 'FilePath -> IO String'.  Dunno if this is a bug or a feature,
since as far as the compiler is concerned, FilePath and String are the
same type...


Consider it a bug. Hoogle currently doesn't support type aliases.
Version 4 will support them perfectly. As it turns out, for type
searching, aliases are not merely equal. Consider:

String -> IO ()   -- putStr
FilePath -> IO () -- createDirectory

Here you'd like to get different answers depending on your question,
but in both cases the other should appear in the results too, just a
little further down. This is what Hoogle 4 does in the development
version.

Thanks

Neil