Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays

2008-05-08 Thread Luis F. Ortiz
On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote:
 Hi Luis,

 To have your patch integrated, some changes are needed. First, we  
 want to default to 32bits sized arrays on 64bits machines, so your  
 changes must be conditionally compiled. To help with that some  
 changed to your patch are due. Next are some comments about it:

 Instead of replacing guint32 for gsize, it's better to create a new  
 type, let's say array_size_t. This would reduce conditional  
 compilation to fewer places.

Rodrigo:
Permit me to reply to your message in parts, so that I clear on what  
needs to be done, from your point of view.

1) Add support for a new configuration argument to configure.in  
(AC_ARG_ENABLE(big-arrays,[ --enable-bug-arrays Enable allocation and  
indexing of arrays  31 bits in size.],...))
2) Add a compile time define (AC_DEFINE(BIG_ARRAYS,1,[Enable  
allocation and indexing of arrays  31 bits in size.]))
3) Use said compile time define to typedef array_size_t to either  
guint32 or guint64 depending on the state of BIG_ARRAYS in object.h
4) Use array_size_t where I used gsize in my patch.

Questions:

A) Do you agree then that the correct size types are either guint32 or  
guint64?
Or would gint32 and gint64 be better choices as they would agree  
with the input arguments to ves_icall_System_Array_CreateInstanceImpl  
functions and the usage in socket-io.c?

B) Should the CreateInstanceImpl64 method definition in icall-def.h  
exist if BIG_ARRAYS is not defined.

C) In the case where BIG_ARRAYS is not defined,  should  
ves_icall_System_Array_CreateInstanceImpl64() still exist and make  
sure all arguments fit into the small guint32?

D) mono_array_size_t or array_size_t?   Without the mono prefix, i  
fear that some third-party library might also use the obvious name.   
Should the BIG_ARRAYS definition also use a MONO_ prefix?

/Ortiz/Luis





___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays

2008-05-08 Thread Luis F. Ortiz

On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote:


Hi Luis,

To have your patch integrated, some changes are needed. First, we  
want to default to 32bits sized arrays on 64bits machines, so your  
changes must be conditionally compiled. To help with that some  
changed to your patch are due. Next are some comments about it:

...
 /* helper macros to check for overflow when calculating the size of  
arrays */

-#define MYGUINT32_MAX 4294967295U
+#if (GLIB_SIZEOF_SIZE_T  4 )
+#define MYGUINT32_MAX 0xUL
+#define MYGUINT_MAX MYGUINT32_MAX

This #if seens bogus, don't you mean if ((GLIB_SIZEOF_SIZE_T ==  
4 ) as mono never supported 16bits machines.
The macros can be unified by using MYGUINT_MAX and the  
'array_size_t' type I talked before. The definition of MYGUINT_MAX
should be put together in the same place we define 'array_size_t'.  
And we could go with a meaningful name, don't you think?


The whole MYGUINT32 thing was there before made my changes and appears  
in other places in interpreter/interp.c as well.
I don't understand why the GLIB define of G_MAXUINT32 was not used  
instead these private definitions.


I get nervous when I see magic constants, either in hexadecimal or  
decimal.


I would rather see a definition of MONO_ARRAY_MAX_INDEX set to either  
G_MAXINT32 or G_MAXINT64 and a definition
of MONO_ARRAY_MAX_SIZE set to either G_MAXUNIT32 or G_MAXUNIT64, in  
object.h .


Then this code could lose the #if here and the magic numbers would be  
kept in one place.


/Ortiz/Luis


___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays

2008-05-08 Thread Luis F. Ortiz
On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote:
 Hi Luis,

 To have your patch integrated, some changes are needed. First, we  
 want to default to 32bits sized arrays on 64bits machines, so your  
 changes must be conditionally compiled. To help with that some  
 changed to your patch are due. Next are some comments about it:

 ...

 -if (CHECK_MUL_OVERFLOW_UN (n, elem_size))
 -mono_gc_out_of_memory (MYGUINT32_MAX);
 +if (CHECK_MUL_OVERFLOW_UN (n, elem_size)) {
 +g_print(CHECK_MUL_OVERFLOW_UN(%zd,%zd) failed\n,n,  
 elem_size);
 +mono_gc_out_of_memory (MYGUINT_MAX);
 +}

 If you find that keeping such debug code is really important, you  
 should follow the same pattern of the rest of the project. Take a  
 look at how DEBUG_IMT is used on object.c.

This was an oversight on my part and should have never made it out.

 @@ -3548,34 +3559,30 @@
  /* A single dimensional array with a 0 lower bound is the same  
 as an szarray */
  if (array_class-rank == 1  ((array_class-byval_arg.type ==  
 MONO_TYPE_SZARRAY) || (lower_bounds  lower_bounds [0] == 0))) {
  len = lengths [0];
 -if ((int) len  0)
 -arith_overflow ();
 Why are you removing overflow checks here?

That is an error on my part.  It should have remained.

 @@ -562,6 +607,26 @@
  if (this-bounds == NULL)
  return this-max_length;

 +length = this-bounds [dimension].length;
 +if (length  G_MAXINT32)
 +mono_raise_exception (mono_get_exception_overflow ());
 +
 +return length;
 +}

 Why throwing an exception here? I'm not sure this is the way to go,  
 unfortunately this is an area underspecified on ecma. Not that  
 truncating is a good option either.

Now that you are making me look at this code, there is a bug there. 
When there are no bounds, the length temporary should be set to this- 
 max_length and the
same overflow  test applied.   The overflow test should be  
conditionally compiled when some preprocessor variable is defined, as  
per your previous suggestion.

If a big array is passed into a function that is not prepared to  
handle it, when it tries to get the length, it could end up with a  
negative number or a positive number
which is waay to small.   Though ECMA has under-specified here,  
raising an exception seems to only possible course.

/Ortiz/Luis
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays

2008-05-08 Thread Luis F. Ortiz

On May 8, 2008, at 9:29 AM, Rodrigo Kumpera wrote:
One important thing I forgot. If you break your patch into a few  
smaller ones the review process will be a lot easier to every one  
involved.
The first one can introduce new types and configuration foo; then  
other to fix codegen and Array methods and; at last, a bunch of  
fixes to classlib issues -like sockets, file i/o and so on.


Cheers,
Rodrigo


OK, here is another stab at the changes.

This modifies the following files:
configure.in
mono/metadata/object.c
mono/metadata/object.h
mono/metadata/icall-def.h
mono/metadata/icall.c
mono/metadata/socket-io.c

These files:
 1) Add a new configuration option, --enable-big-arrays, which  
defines conditionally defines MONO_BIG_ARRAYS in config.h,

 2) Add in mono/metadata/object.h :
  A) A new typedef for mono_array_size_t to be either guint32 or  
guint64
  B) A new #define for MONO_ARRAY_MAX_INDEX for the biggest valid  
array index, i.e. G_MAXINTxx
  C) A new #define for MONO_ARRAY_MAX_SIZE for the biggest valid  
array allocation, i.e. G_MAXUINTxx

  D) The above all controlled by MONO_BIG_ARRAYS
 3) Modify the definitions of mono_array_new(),  
mono_array_new_full(), and mono_array_new_specific() to match,
 4) Modify the usages of those functions in the metadata directory to  
match,
 5) Add range checking in ves_icall_System_Array_CreateInstanceImpl64  
so it works with or without MONO_BIG_ARRAYS,

 6) Attempt to address all the mistakes you pointed out.

These changes, in addition to the other changes I made, pass make  
check with the exception of Tests.test_0_regress_75832(), as  
previously confessed.




Rodrigo1.patch
Description: Binary data




/Ortiz/Luis

___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice

2007-09-06 Thread Luis F. Ortiz
On Sep 5, 2007, at 1:12 AM, Jeroen Frijters wrote:
 Luis F. Ortiz wrote:
 On Sep 4, 2007, at 6:51 AM, Marek Safar wrote:
 Hi Luis,
 1)  MCS assumed that the arguments to NEWARR were always U4 or I4,
 which does not seem
  to be the case as far as ECMA-335v4.
 ...
 A)  Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of
 OpCodes.Conv_Ovf_U4/I4
  for array size arguments,
 Any mcs patches with self contained tests are welcome. However, I
 think I fix this issue 2-3 weeks ago. Please use SVN version  
 instead.

 I see you did do serious surgery to the code in question in revision
 84357.
 I have a question: ¿How literally should ECMA-335 be taken?
 If one were to take section 4.20 (newarr - create a zero-based, one-
 dimensional array)
 literally:

  Valid array indexes are 0 = index  numElems. ...
  Verifiability:  .numElems shall be of type native int or int32.

 then I would be inclined to say some work might still be necessary.
 The implication would be that unsigned arguments are not proper and
 that a 32 bit implementation should be limited to 2^32 - 1 elements.

 No. The verifier doesn't care about signed vs. unsigned. So the  
 above quote really means native [u]int or [u]int32. For details  
 see the section Verification Types in the ECMA CLI spec.

Thanks for taking the time to straighten me out, Jeroen.

--Luis
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice

2007-09-06 Thread Luis F. Ortiz
On Sep 5, 2007, at 5:37 AM, Robert Jordan wrote:
 Luis F. Ortiz wrote:
 But at least now there is only one method to change, though I  
 suspect it
 will take
 more than a few minutes for me to get the SVN version to compile from
 scratch.

 The big array support leads to an API breakage on 64 systems:

 Unlike other API structs, MonoArray's internals are publicly exposed
 by macros. Since its max_length member has to be extended to 64 bit,
 the ABI will change and all applications using libmono have to be
 recompiled.

 IMO, this cannot be done before Mono 2.0, especially when the benefits
 are so tiny, but feel free to provide patches.

You are quire correct, API breakage is to be avoided, thanks for  
pointing out this issue. I'm not sure what release/branching  
structure is used by the mono project, but I suspect it is a live  
trunk with branches used to record releases.  Thus these changes  
could be set up to work off off a ./configure option (say --with- 
large-objects) which could result something like MONO_LARGE_OBJECTS  
being defined and conditional compiles used while the 1.2.X series is  
being worked on.  That way, the API breakage will be restricted to  
those who want to experiment.  Then, when 2.0.0 is in active  
development, the default condition for the configure switch can be  
flipped to ON and then later, once everyone is comfy with the  
changes, the conditional code can be ripped out.

--Luis



___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice

2007-09-06 Thread Luis F. Ortiz

Jonathan:
 Sorry for the delay in my response; first off, the test  
application I wrote was buggy and lied about how much memory was  
being allocated.   I had to rewrite it.  It now clearly can not  
allocate arrays larger than 2GB, though I have yet to pin down the  
exact amounts, as it seems that my test program gives me numbers that  
are different, depending on which C# compiler I use (csc or mcs) and  
the CLR in use under XP64.


 I'll get down to the bottom of this and reply later.

--Luis

On Sep 3, 2007, at 5:26 PM, Jonathan Chambers wrote:

Hello,

I don't have an answer to your problem, but I did have another  
question. I can't find a link to the documentation in msdn, but I  
thought arrays were limited to 2 GB in the .Net runtime (even on 64- 
bit). Do you not see this behavior?


http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx

Thanks,
Jonathan

On 9/3/07, Luis F. Ortiz [EMAIL PROTECTED] wrote:
Folks:

I was porting a small test application that was written in C# that
allocated
an array with a large number of elements ( 2^32).  While it
compiled and ran
in Visual Studio's C# under WindowsXP-64bit without a hitch, when I
compiled
and ran it under mcs/mono (1.2.4) on the same hardware, but booted up
under
Fedora7-x86_64, I ran into a few problems.

Digging into it a bit, I discovered that:

1)  MCS assumed that the arguments to NEWARR were always U4 or I4,
which does not seem
 to be the case as far as ECMA-335v4.
2)  MONO assumes that array lengths and bounds can always be
represented as guint32's,
 [ like mono_array_new_specific (MonoVTable *vtable, guint32 n) ]


Would folks object to a series of patches to:

A)  Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of
OpCodes.Conv_Ovf_U4/I4
 for array size arguments,
B)  Modify mono/metadata/object.h to change the base type for array
lengths/bounds to
 XXX instead of guint32,
C)  Change mono/metadata/object.c to change the functions that create/
access arrays to
 take XXX instead of guint32 length/bounds arguments.  Also
perhaps update some of
 the lower level object allocation functions to use XXX as needed.
D)  Modify the execution of NEWARR be able to deal with I/U native
types.  No doubt something
 in the JIT needs tweaking as well.
E)  Double check that array indexing not only takes native int types,
but uses them right.
F)  Add a few new test cases for large array allocation.
G)  Whatever else needs to be done that I'll bump into after I try
making the above changes
 and realize what I forgot.  Warnings of known hazards gratefully
accepted.

My big question is what the right type for the size ought to be?
I've seen int, size_t, gsize,
and guint32 used in Mono.  I suspect that int and guint32 are wrong
from a portability perspective,
but would prefer that someone more intimately involved with Mono to
say if guint or gsize or size_t
were preferred. I'm guessing Bug 81774 is related to this as well.

BTW, is this too big for a newby to tackle?

--Luis F. Ortiz
Follower of the Selfish Way
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list



___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice

2007-09-04 Thread Luis F. Ortiz

On Sep 4, 2007, at 6:51 AM, Marek Safar wrote:

Hi Luis,

1)  MCS assumed that the arguments to NEWARR were always U4 or I4,
which does not seem
 to be the case as far as ECMA-335v4.
...
A)  Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of
OpCodes.Conv_Ovf_U4/I4
 for array size arguments,

Any mcs patches with self contained tests are welcome. However,
I think I fix this issue 2-3 weeks ago. Please use SVN version  
instead.


I see you did do serious surgery to the code in question in revision  
84357.

I have a question: ¿How literally should ECMA-335 be taken?
If one were to take section 4.20 (newarr – create a zero-based, one- 
dimensional array)

literally:

Valid array indexes are 0 = index  numElems. ...
Verifiability:  .numElems shall be of type native int or int32.

then I would be inclined to say some work might still be necessary.
The implication would be that unsigned arguments are not proper and that
a 32 bit implementation should be limited to 2^32 - 1 elements.

So perhaps uint32_type's should be converted via Conv_Ovf_I4,
and uint64_type's via Conv_Ovf_I.

But at least now there is only one method to change, though I suspect  
it will take
more than a few minutes for me to get the SVN version to compile from  
scratch.


--Luis F. Ortiz



EmitArray.diff
Description: Binary data
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


[Mono-dev] Big Arrays, Many Changes --- Request for Advice

2007-09-03 Thread Luis F. Ortiz
Folks:

I was porting a small test application that was written in C# that  
allocated
an array with a large number of elements ( 2^32).  While it   
compiled and ran
in Visual Studio's C# under WindowsXP-64bit without a hitch, when I  
compiled
and ran it under mcs/mono (1.2.4) on the same hardware, but booted up  
under
Fedora7-x86_64, I ran into a few problems.

Digging into it a bit, I discovered that:

1)  MCS assumed that the arguments to NEWARR were always U4 or I4,  
which does not seem
 to be the case as far as ECMA-335v4.
2)  MONO assumes that array lengths and bounds can always be  
represented as guint32's,
 [ like mono_array_new_specific (MonoVTable *vtable, guint32 n) ]


Would folks object to a series of patches to:

A)  Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of  
OpCodes.Conv_Ovf_U4/I4
 for array size arguments,
B)  Modify mono/metadata/object.h to change the base type for array  
lengths/bounds to
 XXX instead of guint32,
C)  Change mono/metadata/object.c to change the functions that create/ 
access arrays to
 take XXX instead of guint32 length/bounds arguments.  Also  
perhaps update some of
 the lower level object allocation functions to use XXX as needed.
D)  Modify the execution of NEWARR be able to deal with I/U native  
types.  No doubt something
 in the JIT needs tweaking as well.
E)  Double check that array indexing not only takes native int types,  
but uses them right.
F)  Add a few new test cases for large array allocation.
G)  Whatever else needs to be done that I'll bump into after I try  
making the above changes
 and realize what I forgot.  Warnings of known hazards gratefully  
accepted.

My big question is what the right type for the size ought to be?   
I've seen int, size_t, gsize,
and guint32 used in Mono.  I suspect that int and guint32 are wrong  
from a portability perspective,
but would prefer that someone more intimately involved with Mono to  
say if guint or gsize or size_t
were preferred. I'm guessing Bug 81774 is related to this as well.

BTW, is this too big for a newby to tackle?

--Luis F. Ortiz
Follower of the Selfish Way
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list