To be honest, I was rather surprised that it worked. Glancing at the code in 
the core directory, whatever holds the dimension sizes appears to be of 
type int. Anyway, my point was to see whether I could create something 
larger than 4GB, your limit. My (big) data is typically >3D, e.g. image 
series. If you need somebody with enough RAM to run some tests, 
just tell me. BTW, a patch would come in very handy; I tend to 
forget how these things are done, other than by running diff somehow.
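
To illustrate the worry about int dimension sizes, here is a minimal C 
sketch, not PDL code. The function name count_elements and the example 
dimensions (a 512x512x64 volume with 200 time points) are made up for 
illustration. It multiplies int dimensions into a 64-bit element count 
and reports failure instead of wrapping around.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PDL: multiply int dimension sizes
 * into a 64-bit element count.
 * Returns 0 on success, -1 if a dimension is negative or the product
 * would not fit into 64 bits. */
int count_elements(const int *dims, size_t ndims, uint64_t *nvals)
{
    uint64_t total = 1;
    size_t i;

    for (i = 0; i < ndims; i++) {
        uint64_t d;

        if (dims[i] < 0)
            return -1;                     /* negative size makes no sense */
        d = (uint64_t)dims[i];
        if (d != 0 && total > UINT64_MAX / d)
            return -1;                     /* product would overflow 64 bits */
        total *= d;
    }
    *nvals = total;
    return 0;
}
```

For dims = {512, 512, 64, 200} this gives 3355443200 elements. Every 
single dimension fits an int easily, but the product is already larger 
than 2**31-1, so it cannot be held in a 4-byte signed int.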

It is apparently a bug or a design flaw, and shipping something that 
pushes out the point at which it bites is better than the current state, 
even though it is not a full fix.
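
For the record, the arithmetic behind the failure can be sketched in 
plain C. This is not PDL code and all function names are made up. It 
assumes a 4-byte int, 4-byte floats and a 4096-byte page size, and uses 
the 23171 x 23171 float example from P Kishor's report quoted below.

```c
#include <stdint.h>

/* Byte count computed in 64 bits: what the allocator should be asked for.
 * Both arguments are assumed to be non-negative. */
uint64_t byte_count_wide(int32_t nelem, int32_t elem_size)
{
    return (uint64_t)nelem * (uint64_t)elem_size;
}

/* The same product as a 4-byte signed int would hold it.  The wrap-around
 * is done by hand on unsigned values, because overflowing a signed int
 * directly is undefined behaviour in C. */
int64_t byte_count_int32(int32_t nelem, int32_t elem_size)
{
    uint32_t low = (uint32_t)nelem * (uint32_t)elem_size; /* modulo 2**32 */
    int64_t v = (int64_t)low;

    if (v >= INT64_C(2147483648))      /* sign bit set: value is negative */
        v -= INT64_C(4294967296);
    return v;
}

/* What a 64-bit unsigned size parameter sees when handed that int. */
uint64_t as_size64(int64_t v)
{
    return (uint64_t)v;
}

/* Round a request up to a whole number of 4096-byte pages (assumed size). */
uint64_t round_up_to_page(uint64_t nbytes)
{
    return (nbytes + UINT64_C(4095)) & ~UINT64_C(4095);
}
```

With 23171*23171 = 536895241 elements of 4 bytes, the wide product is 
2147580964, just above 2**31. Held in an int it becomes -2147386332, 
which turns into 18446744071562165284 as a 64-bit size. Rounded up to a 
page that is 18446744071562166272, which appears to match the mmap size 
in the malloc error on Snow Leopard.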

On an unrelated note, is there a plan to support using multiple CPUs, 
parallelisation of some kind, or did I miss something and it is 
available already?
Cheers
Ingo

On 08/04/2010 09:31 PM, Jarle Brinchmann wrote:
> That's good to hear Ingo, but be warned that it will break down in some cases 
> - it is not a lasting fix. However it does indicate that this is a reasonable 
> direction to go.
>
> As Chris pointed out to me, in certain cases the pdl allocation will break. 
> Creating one very long 1D pdl is a particular case. Unfortunately fixing that 
> particular nastiness is a lot more work and brings up problems for PDL::PP.  
> However this simple solution is probably ok in quite a few cases as long as 
> the piddles are 2D or higher dimensional, but do say if you come across some 
> oddities because this should be fixed properly and getting together a good 
> set of test cases would be very valuable for this!
>
>       Cheers
>               Jarle.
>
>
> On 4 Aug 2010, at 20:14, Ingo Schmid wrote:
>
>> Hi Jarle,
>>
>> I can confirm that this works, so far, even for >4GB. Excellent news!
>> Cheers
>> Ingo
>>
>> On 08/04/2010 04:43 PM, Jarle Brinchmann wrote:
>>> Hi Chris,
>>>
>>> It seems sufficient (to me) to merely change pdl_grow in pdlhash.c. This at 
>>> least seems to work for me now with up to 4Gb pdls (after which my machine 
>>> grinds to a halt). I merely changed the ints to STRLENs and put a couple 
>>> of casts in. Seems ok but I haven't had a chance to test it extensively as 
>>> my computer has 4Gb memory and I have other things to do :) However PDL 
>>> passes all tests with this small modification and since the modification is 
>>> all within one subroutine I believe it should be fine.
>>>
>>> PS: I am not checking this into the repository for now - it needs some 
>>> independent testing.
>>>
>>>     Cheers,
>>>             Jarle.
>>>
>>>
>>> void pdl_grow (pdl* a, int newsize) {
>>>
>>>     SV* foo;
>>>     HV* hash;
>>>
>>>     STRLEN nbytes;
>>>     STRLEN ncurr;
>>>     STRLEN len;
>>>
>>>     if(a->state & PDL_DONTTOUCHDATA) {
>>>             die("Trying to touch data of an untouchable (mmapped?) pdl");
>>>     }
>>>
>>>     if(a->datasv == NULL)
>>>             a->datasv = newSVpv("",0);
>>>
>>>     foo = a->datasv;
>>>
>>>     nbytes = (STRLEN) newsize * pdl_howbig(a->datatype);
>>>
>>>     ncurr  = SvCUR( foo );
>>>     if (ncurr == nbytes)
>>>        return;    /* Nothing to be done */
>>>
>>>     /* We don't want to do this: if someone is resizing it
>>>      * but wanting to preserve data.. */
>>> #ifdef FEOIJFOESIJFOJE
>>>     if (ncurr>nbytes)  /* Nuke back to zero */
>>>        sv_setpvn(foo,"",0);
>>> #endif
>>>     if(nbytes > (1024*1024*1024)) {
>>>       SV *sv = get_sv("PDL::BIGPDL",0);
>>>       if(sv == NULL || !(SvTRUE(sv)))
>>>             die("Probably false alloc of over 1Gb PDL! (set $PDL::BIGPDL = 1 to enable)");
>>>       fflush(stdout);
>>>     }
>>>
>>>     {
>>>       void *p;
>>>       p = SvGROW ( foo, nbytes );
>>>       SvCUR_set( foo, nbytes );
>>>     }
>>>     a->data = (void *) SvPV( foo, len ); a->nvals = newsize;
>>> }
>>>
>>>
>>>
>>> On 4 Aug 2010, at 15:52, Chris Marshall wrote:
>>>
>>>> On 8/4/2010 6:22 AM, Ingo Schmid wrote:
>>>>> First of all, thanks for the many replies. I was not aware that this
>>>>> issue was unknown. I can try debugging the issue; I have access to
>>>>> enough memory, but little to no knowledge of Perl's internals.
>>>>>
>>>>>
>>>>> I ran the following test:
>>>>> for $i (0..2**27) { $str.='abcdefghijklmnopqrstuvwxyz0123456789';},
>>>>> that's a bit more than 2**32 (4GB). Took a few seconds to run.
>>>>> Then
>>>>> perldl> p length ($str)
>>>>> 4831838244
>>>>>
>>>>> perldl> p length ($str)/1024/1024/1024
>>>>> 4.50000003352761
>>>>>
>>>>> So I conclude it is not an underlying perl/string limitation, correct?
>>>>> Ingo
>>>> Thanks for running the check.  It appears confirmed that
>>>> this is a limitation of the current PDL allocation routines
>>>> that call the perl api SvGROW() but with a size as an
>>>> int rather than STRLEN type.  That puts the limit at 2**31-1
>>>> for piddle sizes.
>>>>
>>>> The fix will be to change the usage of the allocation to
>>>> use the proper STRLEN type.  Unfortunately, it is intimately
>>>> related to the working of PDL at the lowest level so the
>>>> change may break things elsewhere that use the piddles.
>>>>
>>>> It might be possible to have a shorter term fix with the
>>>> int type replaced by unsigned int to push the limit to
>>>> 4GB per piddle.  Since it is the same word length, that
>>>> could improve things in the short term.
>>>>
>>>> However, we're in the final stages of the pre PDL-2.4.7
>>>> release process so this might have to wait until after
>>>> August to be looked at in more detail.  In the meantime,
>>>> I'll open a feature request on sf.net for the support
>>>> of larger piddles.
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>>> PS: My machine is unstable gentoo ~amd64, we have ubuntu boxes also.
>>>>>
>>>>> uname  -a
>>>>>
>>>>> Linux spectre 2.6.33-gentoo-r2 #4 SMP Thu Jul 29 12:26:35 CEST 2010
>>>>> x86_64 Intel(R) Xeon(R) CPU W3520 @ 2.67GHz GenuineIntel GNU/Linux
>>>>>
>>>>> 12GB RAM
>>>>>
>>>>> perl -V:
>>>>>
>>>>>
>>>>> Summary of my perl5 (revision 5 version 12 subversion 1) configuration:
>>>>>
>>>>>     Platform:
>>>>>       osname=linux, osvers=2.6.33-gentoo, archname=x86_64-linux
>>>>>       uname='linux spectre 2.6.33-gentoo #2 smp tue apr 6 10:24:11 cest
>>>>> 2010 x86_64 intel(r) xeon(r) cpu w3520 @ 2.67ghz genuineintel gnulinux '
>>>>>       config_args='-des -Duseshrplib -Darchname=x86_64-linux
>>>>> -Dcc=x86_64-pc-linux-gnu-gcc -Doptimize=-O2 -pipe -march=core2
>>>>> -fomit-frame-pointer -msse4 -msse4.1 -msse4.2 -mcx16 -msahf
>>>>> -Dprefix=/usr -Dsiteprefix=/usr -Dvendorprefix=/usr
>>>>> -Dprivlib=/usr/lib64/perl5/5.12.1
>>>>> -Darchlib=/usr/lib64/perl5/5.12.1/x86_64-linux
>>>>> -Dsitelib=/usr/lib64/perl5/site_perl/5.12.1
>>>>> -Dsitearch=/usr/lib64/perl5/site_perl/5.12.1/x86_64-linux
>>>>> -Dvendorlib=/usr/lib64/perl5/vendor_perl/5.12.1
>>>>> -Dvendorarch=/usr/lib64/perl5/vendor_perl/5.12.1/x86_64-linux
>>>>> -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3
>>>>> -Dsiteman1dir=/usr/share/man/man1 -Dsiteman3dir=/usr/share/man/man3
>>>>> -Dvendorman1dir=/usr/share/man/man1 -Dvendorman3dir=/usr/share/man/man3
>>>>> -Dman1ext=1 -Dman3ext=3pm -Dlibperl=libperl.so.5.12.1 -Dlocincpth=
>>>>> -Duselargefiles -Dd_semctl_semun -Dcf_by=Gentoo -Dmyhostname=localhost
>>>>> -dperladmin=r...@localhost -Dinstallusrbinperl=n -Ud_csh -Uusenm
>>>>> -Di_ndbm -Di_gdbm -Di_db -Dinc_version_list=5.12.0 5.12.0/x86_64-linux
>>>>> -Dusrinc=/usr/include -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64'
>>>>>       hint=recommended, useposix=true, d_sigaction=define
>>>>>       useithreads=undef, usemultiplicity=undef
>>>>>       useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
>>>>>       use64bitint=define, use64bitall=define, uselongdouble=undef
>>>>>       usemymalloc=n, bincompat5005=undef
>>>>>     Compiler:
>>>>>       cc='x86_64-pc-linux-gnu-gcc', ccflags ='-fno-strict-aliasing -pipe
>>>>> -fstack-protector -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
>>>>>       optimize='-O2 -pipe -march=core2 -fomit-frame-pointer -msse4
>>>>> -msse4.1 -msse4.2 -mcx16 -msahf',
>>>>>       cppflags='-fno-strict-aliasing -pipe -fstack-protector'
>>>>>       ccversion='', gccversion='4.4.4', gccosandvers=''
>>>>>       intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
>>>>>       d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
>>>>>       ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
>>>>> lseeksize=8
>>>>>       alignbytes=8, prototype=define
>>>>>     Linker and Libraries:
>>>>>       ld='x86_64-pc-linux-gnu-gcc', ldflags =' -fstack-protector'
>>>>>       libpth=/usr/local/lib64 /lib64 /usr/lib64
>>>>>       libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
>>>>>       perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
>>>>>       libc=/lib/libc-2.11.2.so, so=so, useshrplib=true,
>>>>> libperl=libperl.so.5.12.1
>>>>>       gnulibc_version='2.11.2'
>>>>>     Dynamic Linking:
>>>>>       dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
>>>>>       cccdlflags='-fPIC', lddlflags='-shared -O2 -pipe -march=core2
>>>>> -fomit-frame-pointer -msse4 -msse4.1 -msse4.2 -mcx16 -msahf
>>>>> -fstack-protector'
>>>>>
>>>>>
>>>>> Characteristics of this binary (from libperl):
>>>>>     Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
>>>>> USE_64_BIT_ALL
>>>>>                           USE_64_BIT_INT USE_LARGE_FILES USE_PERLIO
>>>>>                           USE_PERL_ATOF
>>>>>     Locally applied patches:
>>>>>       0001-gentoo_MakeMaker-RUNPATH.diff
>>>>>       0002-gentoo_config__over.diff
>>>>>       0003-gentoo_cpan__definstalldirs.diff
>>>>>       0004-gentoo_cpanplus__definstalldirs.diff
>>>>>       0005-gentoo_create-libperl-soname.diff
>>>>>       0006-gentoo_MakeMaker-delete__packlist.diff
>>>>>       0007-fixes_8d66b3f9__h2hp__fix.diff
>>>>>       0008-fixes_ef9df645__glob__crashes__when__File__Glob__is__empty.diff
>>>>>
>>>>> 0009-fixes_e3d01d03__Naif__calls__segfault__T__PRTOBJ__of__the__stock__typemap.diff
>>>>>     Built under linux
>>>>>     Compiled at Jul 26 2010 11:18:49
>>>>>     @INC:
>>>>>       /usr/lib64/perl5/site_perl/5.12.1/x86_64-linux
>>>>>       /usr/lib64/perl5/site_perl/5.12.1
>>>>>       /usr/lib64/perl5/vendor_perl/5.12.1/x86_64-linux
>>>>>       /usr/lib64/perl5/vendor_perl/5.12.1
>>>>>       /usr/lib64/perl5/5.12.1/x86_64-linux
>>>>>       /usr/lib64/perl5/5.12.1
>>>>>       /usr/lib64/perl5/site_perl
>>>>>       /usr/lib64/perl5/vendor_perl
>>>>>
>>>>> On 08/04/2010 03:59 AM, Chris Marshall wrote:
>>>>>> On 8/3/2010 9:43 PM, Christian Soeller wrote:
>>>>>>> Is it possible to change things so that 64 bit sizes
>>>>>>>   can be passed in the two places you identified and see
>>>>>>>   if that works?
>>>>>>> I appreciate that things could still fall over in various
>>>>>>>   PP autogenerated code pieces if ints are used for offset
>>>>>>>   calculations in slice and other vaffine operations.
>>>>>> It could work.  The problem is it needs someone with
>>>>>> a 64bit OS, lots of memory, and a willingness to
>>>>>> debug the issue.  I don't have any *large* memory
>>>>>> systems at the moment.  Not that I wouldn't like to
>>>>>> have one.  :-)
>>>>>>
>>>>>> --Chris
>>>>>>
>>>>>>
>>>>>>> On 4/08/2010, at 12:46 PM, Chris Marshall wrote:
>>>>>>>
>>>>>>>> I took a further look at the SvGROW calls in
>>>>>>>> PDL/Basic/Core routines and the two that I found
>>>>>>>> both use int type for their sizes.  That would
>>>>>>>> limit a piddle size to <2**31 or about 2GB.
>>>>>>>>
>>>>>>>> It looks like 64bit support for PDL may need
>>>>>>>> to be added to the list for the future.  I don't
>>>>>>>> know the scope of the changes that would be
>>>>>>>> required to support larger PDL data objects.
>>>>>>>>
>>>>>>>> --Chris
>>>>>>>>
>>>>>>>> On 8/3/2010 8:39 PM, Chris Marshall wrote:
>>>>>>>>> On 8/3/2010 8:30 PM, P Kishor wrote:
>>>>>>>>>> On Tue, Aug 3, 2010 at 7:19 PM, Chris Marshall <[email protected]> wrote:
>>>>>>>>>>> On 8/3/2010 8:01 PM, P Kishor wrote:
>>>>>>>>>>>> also on 64-bit Snow Leopard (Mac OS X 10.6.4)
>>>>>>>>>>>>
>>>>>>>>>>>> punk...@lucknow ~$ perl -MPDL -e '$PDL::BIGPDL=1; $x = sequence(float, 23171, 23171); print $x->info("%M")."\n"'
>>>>>>>>>>>> perl(85899) malloc: *** mmap(size=18446744071562166272) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>> Out of memory!
>>>>>>>>>>>> punk...@lucknow ~$
>>>>>>>>>>> What is perl -V?
>>>>>>>>> I looked at the PDL/Basic/Core stuff and it looks like
>>>>>>>>> if SvGROW can handle a >2GB string then, in principle,
>>>>>>>>> PDL should be able to handle piddles of that size.
>>>>>>>>>
>>>>>>>>> Could you see if you can create a string more than 2GB
>>>>>>>>> long?  It might take a while but it will tell us if the
>>>>>>>>> limit is perl or PDL.  Since the PDL routines for growing
>>>>>>>>> a new piddle use 4byte ints for their sizes (rather than
>>>>>>>>> size_t objects), it is pretty clear that there is a bug
>>>>>>>>> in the PDL allocation stuff if perl can handle the longer
>>>>>>>>> strings.
>>>>>>>>>
>>>>>>>>> --Chris
>>>>>>>> _______________________________________________
>>>>>>>> Perldl mailing list
>>>>>>>> [email protected]
>>>>>>>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>


