Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays
On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote: Hi Luis, To have your patch integrated, some changes are needed. First, we want to default to 32bits sized arrays on 64bits machines, so your changes must be conditionally compiled. To help with that some changed to your patch are due. Next are some comments about it: Instead of replacing guint32 for gsize, it's better to create a new type, let's say array_size_t. This would reduce conditional compilation to fewer places. Rodrigo: Permit me to reply to your message in parts, so that I clear on what needs to be done, from your point of view. 1) Add support for a new configuration argument to configure.in (AC_ARG_ENABLE(big-arrays,[ --enable-bug-arrays Enable allocation and indexing of arrays 31 bits in size.],...)) 2) Add a compile time define (AC_DEFINE(BIG_ARRAYS,1,[Enable allocation and indexing of arrays 31 bits in size.])) 3) Use said compile time define to typedef array_size_t to either guint32 or guint64 depending on the state of BIG_ARRAYS in object.h 4) Use array_size_t where I used gsize in my patch. Questions: A) Do you agree then that the correct size types are either guint32 or guint64? Or would gint32 and gint64 be better choices as they would agree with the input arguments to ves_icall_System_Array_CreateInstanceImpl functions and the usage in socket-io.c? B) Should the CreateInstanceImpl64 method definition in icall-def.h exist if BIG_ARRAYS is not defined. C) In the case where BIG_ARRAYS is not defined, should ves_icall_System_Array_CreateInstanceImpl64() still exist and make sure all arguments fit into the small guint32? D) mono_array_size_t or array_size_t? Without the mono prefix, i fear that some third-party library might also use the obvious name. Should the BIG_ARRAYS definition also use a MONO_ prefix? /Ortiz/Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays
On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote: Hi Luis, To have your patch integrated, some changes are needed. First, we want to default to 32bits sized arrays on 64bits machines, so your changes must be conditionally compiled. To help with that some changed to your patch are due. Next are some comments about it: ... /* helper macros to check for overflow when calculating the size of arrays */ -#define MYGUINT32_MAX 4294967295U +#if (GLIB_SIZEOF_SIZE_T 4 ) +#define MYGUINT32_MAX 0xUL +#define MYGUINT_MAX MYGUINT32_MAX This #if seens bogus, don't you mean if ((GLIB_SIZEOF_SIZE_T == 4 ) as mono never supported 16bits machines. The macros can be unified by using MYGUINT_MAX and the 'array_size_t' type I talked before. The definition of MYGUINT_MAX should be put together in the same place we define 'array_size_t'. And we could go with a meaningful name, don't you think? The whole MYGUINT32 thing was there before made my changes and appears in other places in interpreter/interp.c as well. I don't understand why the GLIB define of G_MAXUINT32 was not used instead these private definitions. I get nervous when I see magic constants, either in hexadecimal or decimal. I would rather see a definition of MONO_ARRAY_MAX_INDEX set to either G_MAXINT32 or G_MAXINT64 and a definition of MONO_ARRAY_MAX_SIZE set to either G_MAXUNIT32 or G_MAXUNIT64, in object.h . Then this code could lose the #if here and the magic numbers would be kept in one place. /Ortiz/Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays
On May 7, 2008, at 10:50 PM, Rodrigo Kumpera wrote: Hi Luis, To have your patch integrated, some changes are needed. First, we want to default to 32bits sized arrays on 64bits machines, so your changes must be conditionally compiled. To help with that some changed to your patch are due. Next are some comments about it: ... -if (CHECK_MUL_OVERFLOW_UN (n, elem_size)) -mono_gc_out_of_memory (MYGUINT32_MAX); +if (CHECK_MUL_OVERFLOW_UN (n, elem_size)) { +g_print(CHECK_MUL_OVERFLOW_UN(%zd,%zd) failed\n,n, elem_size); +mono_gc_out_of_memory (MYGUINT_MAX); +} If you find that keeping such debug code is really important, you should follow the same pattern of the rest of the project. Take a look at how DEBUG_IMT is used on object.c. This was an oversight on my part and should have never made it out. @@ -3548,34 +3559,30 @@ /* A single dimensional array with a 0 lower bound is the same as an szarray */ if (array_class-rank == 1 ((array_class-byval_arg.type == MONO_TYPE_SZARRAY) || (lower_bounds lower_bounds [0] == 0))) { len = lengths [0]; -if ((int) len 0) -arith_overflow (); Why are you removing overflow checks here? That is an error on my part. It should have remained. @@ -562,6 +607,26 @@ if (this-bounds == NULL) return this-max_length; +length = this-bounds [dimension].length; +if (length G_MAXINT32) +mono_raise_exception (mono_get_exception_overflow ()); + +return length; +} Why throwing an exception here? I'm not sure this is the way to go, unfortunately this is an area underspecified on ecma. Not that truncating is a good option either. Now that you are making me look at this code, there is a bug there. When there are no bounds, the length temporary should be set to this- max_length and the same overflow test applied. The overflow test should be conditionally compiled when some preprocessor variable is defined, as per your previous suggestion. If a big array is passed into a function that is not prepared to handle it, when it tries to get the length, it could end up with a negative number or a positive number which is waay to small. Though ECMA has under-specified here, raising an exception seems to only possible course. /Ortiz/Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] How To Convince Mono To Allocate Big Arrays
On May 8, 2008, at 9:29 AM, Rodrigo Kumpera wrote: One important thing I forgot. If you break your patch into a few smaller ones the review process will be a lot easier to every one involved. The first one can introduce new types and configuration foo; then other to fix codegen and Array methods and; at last, a bunch of fixes to classlib issues -like sockets, file i/o and so on. Cheers, Rodrigo OK, here is another stab at the changes. This modifies the following files: configure.in mono/metadata/object.c mono/metadata/object.h mono/metadata/icall-def.h mono/metadata/icall.c mono/metadata/socket-io.c These files: 1) Add a new configuration option, --enable-big-arrays, which defines conditionally defines MONO_BIG_ARRAYS in config.h, 2) Add in mono/metadata/object.h : A) A new typedef for mono_array_size_t to be either guint32 or guint64 B) A new #define for MONO_ARRAY_MAX_INDEX for the biggest valid array index, i.e. G_MAXINTxx C) A new #define for MONO_ARRAY_MAX_SIZE for the biggest valid array allocation, i.e. G_MAXUINTxx D) The above all controlled by MONO_BIG_ARRAYS 3) Modify the definitions of mono_array_new(), mono_array_new_full(), and mono_array_new_specific() to match, 4) Modify the usages of those functions in the metadata directory to match, 5) Add range checking in ves_icall_System_Array_CreateInstanceImpl64 so it works with or without MONO_BIG_ARRAYS, 6) Attempt to address all the mistakes you pointed out. These changes, in addition to the other changes I made, pass make check with the exception of Tests.test_0_regress_75832(), as previously confessed. Rodrigo1.patch Description: Binary data /Ortiz/Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice
On Sep 5, 2007, at 1:12 AM, Jeroen Frijters wrote: Luis F. Ortiz wrote: On Sep 4, 2007, at 6:51 AM, Marek Safar wrote: Hi Luis, 1) MCS assumed that the arguments to NEWARR were always U4 or I4, which does not seem to be the case as far as ECMA-335v4. ... A) Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of OpCodes.Conv_Ovf_U4/I4 for array size arguments, Any mcs patches with self contained tests are welcome. However, I think I fix this issue 2-3 weeks ago. Please use SVN version instead. I see you did do serious surgery to the code in question in revision 84357. I have a question: ¿How literally should ECMA-335 be taken? If one were to take section 4.20 (newarr - create a zero-based, one- dimensional array) literally: Valid array indexes are 0 = index numElems. ... Verifiability: .numElems shall be of type native int or int32. then I would be inclined to say some work might still be necessary. The implication would be that unsigned arguments are not proper and that a 32 bit implementation should be limited to 2^32 - 1 elements. No. The verifier doesn't care about signed vs. unsigned. So the above quote really means native [u]int or [u]int32. For details see the section Verification Types in the ECMA CLI spec. Thanks for taking the time to straighten me out, Jeroen. --Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice
On Sep 5, 2007, at 5:37 AM, Robert Jordan wrote: Luis F. Ortiz wrote: But at least now there is only one method to change, though I suspect it will take more than a few minutes for me to get the SVN version to compile from scratch. The big array support leads to an API breakage on 64 systems: Unlike other API structs, MonoArray's internals are publicly exposed by macros. Since its max_length member has to be extended to 64 bit, the ABI will change and all applications using libmono have to be recompiled. IMO, this cannot be done before Mono 2.0, especially when the benefits are so tiny, but feel free to provide patches. You are quire correct, API breakage is to be avoided, thanks for pointing out this issue. I'm not sure what release/branching structure is used by the mono project, but I suspect it is a live trunk with branches used to record releases. Thus these changes could be set up to work off off a ./configure option (say --with- large-objects) which could result something like MONO_LARGE_OBJECTS being defined and conditional compiles used while the 1.2.X series is being worked on. That way, the API breakage will be restricted to those who want to experiment. Then, when 2.0.0 is in active development, the default condition for the configure switch can be flipped to ON and then later, once everyone is comfy with the changes, the conditional code can be ripped out. --Luis ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice
Jonathan: Sorry for the delay in my response; first off, the test application I wrote was buggy and lied about how much memory was being allocated. I had to rewrite it. It now clearly can not allocate arrays larger than 2GB, though I have yet to pin down the exact amounts, as it seems that my test program gives me numbers that are different, depending on which C# compiler I use (csc or mcs) and the CLR in use under XP64. I'll get down to the bottom of this and reply later. --Luis On Sep 3, 2007, at 5:26 PM, Jonathan Chambers wrote: Hello, I don't have an answer to your problem, but I did have another question. I can't find a link to the documentation in msdn, but I thought arrays were limited to 2 GB in the .Net runtime (even on 64- bit). Do you not see this behavior? http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx Thanks, Jonathan On 9/3/07, Luis F. Ortiz [EMAIL PROTECTED] wrote: Folks: I was porting a small test application that was written in C# that allocated an array with a large number of elements ( 2^32). While it compiled and ran in Visual Studio's C# under WindowsXP-64bit without a hitch, when I compiled and ran it under mcs/mono (1.2.4) on the same hardware, but booted up under Fedora7-x86_64, I ran into a few problems. Digging into it a bit, I discovered that: 1) MCS assumed that the arguments to NEWARR were always U4 or I4, which does not seem to be the case as far as ECMA-335v4. 2) MONO assumes that array lengths and bounds can always be represented as guint32's, [ like mono_array_new_specific (MonoVTable *vtable, guint32 n) ] Would folks object to a series of patches to: A) Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of OpCodes.Conv_Ovf_U4/I4 for array size arguments, B) Modify mono/metadata/object.h to change the base type for array lengths/bounds to XXX instead of guint32, C) Change mono/metadata/object.c to change the functions that create/ access arrays to take XXX instead of guint32 length/bounds arguments. Also perhaps update some of the lower level object allocation functions to use XXX as needed. D) Modify the execution of NEWARR be able to deal with I/U native types. No doubt something in the JIT needs tweaking as well. E) Double check that array indexing not only takes native int types, but uses them right. F) Add a few new test cases for large array allocation. G) Whatever else needs to be done that I'll bump into after I try making the above changes and realize what I forgot. Warnings of known hazards gratefully accepted. My big question is what the right type for the size ought to be? I've seen int, size_t, gsize, and guint32 used in Mono. I suspect that int and guint32 are wrong from a portability perspective, but would prefer that someone more intimately involved with Mono to say if guint or gsize or size_t were preferred. I'm guessing Bug 81774 is related to this as well. BTW, is this too big for a newby to tackle? --Luis F. Ortiz Follower of the Selfish Way ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Big Arrays, Many Changes --- Request for Advice
On Sep 4, 2007, at 6:51 AM, Marek Safar wrote: Hi Luis, 1) MCS assumed that the arguments to NEWARR were always U4 or I4, which does not seem to be the case as far as ECMA-335v4. ... A) Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of OpCodes.Conv_Ovf_U4/I4 for array size arguments, Any mcs patches with self contained tests are welcome. However, I think I fix this issue 2-3 weeks ago. Please use SVN version instead. I see you did do serious surgery to the code in question in revision 84357. I have a question: ¿How literally should ECMA-335 be taken? If one were to take section 4.20 (newarr – create a zero-based, one- dimensional array) literally: Valid array indexes are 0 = index numElems. ... Verifiability: .numElems shall be of type native int or int32. then I would be inclined to say some work might still be necessary. The implication would be that unsigned arguments are not proper and that a 32 bit implementation should be limited to 2^32 - 1 elements. So perhaps uint32_type's should be converted via Conv_Ovf_I4, and uint64_type's via Conv_Ovf_I. But at least now there is only one method to change, though I suspect it will take more than a few minutes for me to get the SVN version to compile from scratch. --Luis F. Ortiz EmitArray.diff Description: Binary data ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
[Mono-dev] Big Arrays, Many Changes --- Request for Advice
Folks: I was porting a small test application that was written in C# that allocated an array with a large number of elements ( 2^32). While it compiled and ran in Visual Studio's C# under WindowsXP-64bit without a hitch, when I compiled and ran it under mcs/mono (1.2.4) on the same hardware, but booted up under Fedora7-x86_64, I ran into a few problems. Digging into it a bit, I discovered that: 1) MCS assumed that the arguments to NEWARR were always U4 or I4, which does not seem to be the case as far as ECMA-335v4. 2) MONO assumes that array lengths and bounds can always be represented as guint32's, [ like mono_array_new_specific (MonoVTable *vtable, guint32 n) ] Would folks object to a series of patches to: A) Fix mcs/expression.cs to emit OpCodes.Conv_Ovf_U/I instead of OpCodes.Conv_Ovf_U4/I4 for array size arguments, B) Modify mono/metadata/object.h to change the base type for array lengths/bounds to XXX instead of guint32, C) Change mono/metadata/object.c to change the functions that create/ access arrays to take XXX instead of guint32 length/bounds arguments. Also perhaps update some of the lower level object allocation functions to use XXX as needed. D) Modify the execution of NEWARR be able to deal with I/U native types. No doubt something in the JIT needs tweaking as well. E) Double check that array indexing not only takes native int types, but uses them right. F) Add a few new test cases for large array allocation. G) Whatever else needs to be done that I'll bump into after I try making the above changes and realize what I forgot. Warnings of known hazards gratefully accepted. My big question is what the right type for the size ought to be? I've seen int, size_t, gsize, and guint32 used in Mono. I suspect that int and guint32 are wrong from a portability perspective, but would prefer that someone more intimately involved with Mono to say if guint or gsize or size_t were preferred. I'm guessing Bug 81774 is related to this as well. BTW, is this too big for a newby to tackle? --Luis F. Ortiz Follower of the Selfish Way ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list