On Tue, Jul 30, 2013 at 12:23 AM, Jonathan S. Shapiro <[email protected]> wrote:

> On Mon, Jul 29, 2013 at 7:54 AM, Ben Kloosterman <[email protected]> wrote:
>
>> The other huge issue with GCs which relates to C4 is unless you run on a
>> GC based OS (which won't happen soon) everything has to be pinned and/or
>> copied to and from GC managed space.
>
>
> That actually has to be done regardless, for security reasons. Things need
> to be pinned while they are being copied across spaces. In Singularity you
> had the shared heap (did they call it that or "exchange heap?"), but it's
> not enough to make data immobile; you also need to know that it is
> immutable or you have to copy it.
>
> That said, it's not hard to imagine a system having a "block heap" in
> which all blocks are of some normative size and compaction is not performed
> on that heap. That would deal with the I/O problem, for example.
>
> The real performance issue with I/O is that the concurrency contracts of
> JVM and CLR mean that you usually can't optimize the range checks out of
> the inner loop. It's rarely the IO *per se* that gets you.
>
>
>
It's not actually the pinning that is the problem, but how it is dealt with
in the GC. It's perfectly fine to go unsafe and stackalloc (or use a non-GC /
static buffer manager) and then access the buffer. As there is one reader
and then one writer in sequence, or the reverse, there is no issue with an
exclusive blocking lock. However, with a GC pin there can be nasty issues:
e.g. you have a pointer into a nursery that is now fixed. What happens if
there is a long timeout and the nursery is now full? This plays havoc with
most collectors, so the solution is to either:

1) Wait until the kernel has completed the I/O, then allocate a pinned
buffer, copy the data from the kernel, and release the pin, i.e. ensure the
pin is short-lived.
2) Do it natively: access the unsafe data through a trusted mechanism and
treat the pointer independently of the GC.
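
A minimal sketch of the idea behind these two options, written in Java
rather than C# and not taken from any runtime's source. Java has no explicit
pin, so this is only an analogue: the slow I/O targets a direct (off-heap)
buffer, and the GC-managed array is touched only for one short copy after
the read has completed. The class and method names (OffHeapRead,
readViaDirectBuffer) are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OffHeapRead {
    // Hypothetical helper: read up to `max` bytes of a file via an off-heap buffer.
    static byte[] readViaDirectBuffer(Path path, int max) throws IOException {
        // A direct buffer's contents may reside outside the garbage-collected
        // heap, so a slow read does not hold a young-generation object in place.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(max);
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            // The potentially long wait happens here, against off-heap memory.
            while (offHeap.hasRemaining() && ch.read(offHeap) != -1) {
                // keep reading until the buffer is full or end of file
            }
        }
        offHeap.flip();
        // Only now touch the managed heap: one short, bounded copy.
        byte[] managed = new byte[offHeap.remaining()];
        offHeap.get(managed);
        return managed;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("offheap", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            byte[] data = readViaDirectBuffer(tmp, 16);
            System.out.println("bytes read: " + data.length);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The cost of this shape is the extra copy at the end, which is the trade-off
option 1 describes: a copy in exchange for never exposing a GC-managed
object to a long-running kernel operation.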

Yes, you can use a block heap, but this creates more complexity in the GC
(and we have a lot already) in distinguishing block pointers. Also, I had
this as a "buffer manager" in a project I was working on, which was just a
service (separate process) that managed such buffers. Since the access is
controlled and it's byte[], there is no reason not to use shared memory.
Note that this buffer manager was written in unsafe code and passed UIntPtr.
It is not suitable, however, for most kernel calls: e.g. in Win32, to open a
file you need to sysalloc a string, etc. This is why I said you really need
a GC-aware OS...

I don't think range checks are anywhere near as big an issue (except for
pointless microbenchmarks, which I admit are important for PR).

E.g. look at Mono:

    using (FileStream fs = File.Open(path, FileMode.Open,
                                     FileAccess.Write, FileShare.None))
    {
    }


Even the first line, passing path, will need to do something like this in
Win32:
- Sysalloc a string.
- Convert from UTF-16 to UCS-2, hopefully directly into the sysalloc'd
  string (this is an unsafe operation). Without unsafe you have another copy.

Even worse, an actual read loop will normally go something like this:

- The driver in the OS reads data into the kernel.
- The OS puts the byte[] in a C buffer.
- Data is read by the file stream directly from the C buffer (an example of
  the unmanaged wrapper).
- Data is decoded from byte[] to something useful (normally a string in
  microbenchmarks, which requires further allocation).

And a write will be worse:

- The string is encoded to byte[].
- C creates a buffer byte[].
- The C# byte[] is written to the C buffer (without pinning, it looks like,
  as it's a C function).
- The C buffer is copied to the kernel.
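
The write steps above can be sketched as follows. This is a Java analogue,
not Mono's code: a direct ByteBuffer stands in for the "C buffer", and the
names WriteCopyChain and writeString are hypothetical. The comments mark
where each copy of the payload happens.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteCopyChain {
    // Hypothetical helper: write a string to a file, returning bytes written.
    static int writeString(Path path, String text) throws IOException {
        // Copy 1: encoding allocates a new managed byte[] and fills it.
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);

        // Copy 2: managed byte[] into a native (off-heap) staging buffer.
        ByteBuffer nativeBuf = ByteBuffer.allocateDirect(encoded.length);
        nativeBuf.put(encoded);
        nativeBuf.flip();

        int written = 0;
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.WRITE,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // Copy 3: the write call hands the staging buffer to the kernel.
            while (nativeBuf.hasRemaining()) {
                written += ch.write(nativeBuf);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("copychain", ".txt");
        try {
            int n = writeString(tmp, "hello");
            System.out.println("bytes written: " + n);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Every byte of the payload is touched three times before it reaches the
kernel, regardless of how the per-element array accesses are checked.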

This copying on write dwarfs the range checks. Note that here the read can
be a wrapper, but the write needs a copy (or you need the user to manage an
unsafe buffer). Part of the problem is that most managed systems are built
on C (e.g. Mono) and don't make the syscalls directly. In fact, the Mono
code uses unsafe without range checks to do the I/O:

https://github.com/mono/mono/blob/master/mcs/class/corlib/System.IO/FileStream.cs
https://github.com/mono/mono/blob/8752d92528a1d77d71d9bc7882b04df64e40c452/mcs/class/corlib/System.IO/MonoIO.cs

I also note that reading 1 byte at a time can be terrible if you don't use
the right stream function: in some cases it can create an array and buffer
of size 1, with a range check.
No pinning is apparently used, so this Mono code may fail with a relocating
GC. (This surprised me, so I roughly checked the C code as well.) I don't
fully understand it, but I don't have the time...
https://github.com/mono/mono/blob/c00c3f14824d50955ba1096fae550657f5f057e5/mono/io-layer/io.c


Re range checks: so if you had a [SingleThread] attribute to guide the
compiler, you could eliminate range checks? Seems like a no-brainer to me,
especially for single-threaded message-pump type services.
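
A small Java sketch of why a single-thread guarantee would matter. It only
models the checks with explicit ifs; it does not show what any particular
JIT actually removes, and [SingleThread] itself is a hypothetical attribute.
The names HoistedRangeCheck, shared, sumPerAccess and sumHoisted are made up
for illustration. When another thread may replace the array, every access
has to be validated against whatever array is current; when that cannot
happen, one check before the loop covers every iteration.

```java
public class HoistedRangeCheck {
    // A field that, without a single-thread guarantee, another thread
    // could reassign in the middle of the loop.
    static byte[] shared = new byte[0];

    // Models the conservative case: re-read the field and re-check the
    // index on every iteration.
    static long sumPerAccess(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            byte[] a = shared;
            if (i >= a.length) {
                throw new ArrayIndexOutOfBoundsException(i);
            }
            sum += a[i];
        }
        return sum;
    }

    // Models the single-threaded case: take one snapshot, check the upper
    // bound once, then run the loop with no further explicit checks.
    static long sumHoisted(int n) {
        byte[] a = shared;
        if (n > a.length) {
            throw new ArrayIndexOutOfBoundsException(n);
        }
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        shared = new byte[] {1, 2, 3};
        System.out.println(sumPerAccess(3) == sumHoisted(3));
    }
}
```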

Ben
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
