To be honest, I was rather surprised it did. Glancing at the code in
the core directory, the variables that apparently hold dimension sizes
are of type int. Anyway, my point was to see whether I could create
something larger than 4GB, your limit. My (big) data is typically >3D,
e.g. image series. If you need somebody with enough RAM to run some
tests, go ahead and tell me. BTW, a patch would come in very handy; I
tend to forget how to do these things, beyond running diff somehow.
This is apparently a bug or a design flaw, and a change that pushes
back the point at which it hits is better than the current state, even
though it is not a fix.
On an unrelated note, is there a plan to support using multiple CPUs,
parallelisation of some kind, or did I miss something and it is
available already?
Cheers
Ingo
On 08/04/2010 09:31 PM, Jarle Brinchmann wrote:
> That's good to hear Ingo, but be warned that it will break down in some cases
> - it is not a lasting fix. However it does indicate that this is a reasonable
> direction to go.
>
> As Chris pointed out to me, in certain cases the pdl allocation will break.
> Creating one very long 1D pdl is a particular case. Unfortunately fixing that
> particular nastiness is a lot more work and brings up problems for PDL::PP.
> However this simple solution is probably ok in quite a few cases as long as
> the piddles are 2D or higher dimensional, but do say if you come across some
> oddities because this should be fixed properly and getting together a good
> set of test cases would be very valuable for this!
>
> Cheers
> Jarle.
>
>
> On 4 Aug 2010, at 20:14, Ingo Schmid wrote:
>
>> Hi Jarle,
>>
>> I can confirm that this works, so far, even for >4GB. Excellent news!
>> Cheers
>> Ingo
>>
>> On 08/04/2010 04:43 PM, Jarle Brinchmann wrote:
>>> Hi Chris,
>>>
>>> It seems sufficient (to me) to merely change pdl_grow in pdlhash.c. This at
>>> least seems to work for me now with up to 4GB pdls (after which my machine
>>> grinds to a halt). I merely changed the ints to STRLENs and put a couple
>>> of casts in. Seems ok but I haven't had a chance to test it extensively as
>>> my computer has 4GB memory and I have other things to do :) However PDL
>>> passes all tests with this small modification and since the modification is
>>> all within one subroutine I believe it should be fine.
>>>
>>> PS: I am not checking this into the repository for now - it needs some
>>> independent testing.
>>>
>>> Cheers,
>>> Jarle.
>>>
>>>
>>> void pdl_grow (pdl* a, int newsize) {
>>>
>>>     SV* foo;
>>>     HV* hash;
>>>
>>>     STRLEN nbytes;
>>>     STRLEN ncurr;
>>>     STRLEN len;
>>>
>>>     if (a->state & PDL_DONTTOUCHDATA) {
>>>         die("Trying to touch data of an untouchable (mmapped?) pdl");
>>>     }
>>>
>>>     if (a->datasv == NULL)
>>>         a->datasv = newSVpv("", 0);
>>>
>>>     foo = a->datasv;
>>>
>>>     nbytes = (STRLEN) newsize * pdl_howbig(a->datatype);
>>>
>>>     ncurr = SvCUR( foo );
>>>     if (ncurr == nbytes)
>>>         return; /* Nothing to be done */
>>>
>>>     /* We don't want to do this: if someone is resizing it
>>>      * but wanting to preserve data.. */
>>> #ifdef FEOIJFOESIJFOJE
>>>     if (ncurr > nbytes) /* Nuke back to zero */
>>>         sv_setpvn(foo, "", 0);
>>> #endif
>>>     if (nbytes > (1024*1024*1024)) {
>>>         SV *sv = get_sv("PDL::BIGPDL", 0);
>>>         if (sv == NULL || !(SvTRUE(sv)))
>>>             die("Probably false alloc of over 1Gb PDL! (set $PDL::BIGPDL = 1 to enable)");
>>>         fflush(stdout);
>>>     }
>>>
>>>     {
>>>         void *p;
>>>         p = SvGROW ( foo, nbytes );
>>>         SvCUR_set( foo, nbytes );
>>>     }
>>>     a->data = (void *) SvPV( foo, len );
>>>     a->nvals = newsize;
>>> }
>>>
>>>
>>>
>>> On 4 Aug 2010, at 15:52, Chris Marshall wrote:
>>>
>>>> On 8/4/2010 6:22 AM, Ingo Schmid wrote:
>>>>> First of all, thanks for the many replies. I was not aware that this
>>>>> issue was unknown. I can try debugging the issue; I have access to
>>>>> enough memory, but little to no knowledge of Perl's internals.
>>>>>
>>>>>
>>>>> I ran the following test:
>>>>> for $i (0..2**27) { $str.='abcdefghijklmnopqrstuvwxyz0123456789';},
>>>>> that's a bit more than 2**32 (4GB). Took a few seconds to run.
>>>>> Then
>>>>> perldl> p length ($str)
>>>>> 4831838244
>>>>>
>>>>> perldl> p length ($str)/1024/1024/1024
>>>>> 4.50000003352761
>>>>>
>>>>> So I conclude it is not an underlying perl/string limitation, correct?
>>>>> Ingo
>>>> Thanks for running the check. It appears confirmed that
>>>> this is a limitation of the current PDL allocation routines
>>>> that call the perl api SvGROW() but with a size as an
>>>> int rather than STRLEN type. That puts the limit at 2**31-1
>>>> for piddle sizes.
>>>>
>>>> The fix will be to change the usage of the allocation to
>>>> use the proper STRLEN type. Unfortunately, it is intimately
>>>> related to the working of PDL at the lowest level so the
>>>> change may break things elsewhere that use the piddles.
>>>>
>>>> It might be possible to have a shorter term fix with the
>>>> int type replaced by unsigned int to push the limit to
>>>> 4GB per piddle. Since it is the same word length, that
>>>> could improve things in the short term.
>>>>
>>>> However, we're in the final stages of the pre PDL-2.4.7
>>>> release process so this might have to wait until after
>>>> August to be looked at in more detail. In the meantime,
>>>> I'll open a feature request on sf.net for the support
>>>> of larger piddles.
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>>> PS: My machine is unstable gentoo ~amd64, we have ubuntu boxes also.
>>>>>
>>>>> uname -a
>>>>>
>>>>> Linux spectre 2.6.33-gentoo-r2 #4 SMP Thu Jul 29 12:26:35 CEST 2010
>>>>> x86_64 Intel(R) Xeon(R) CPU W3520 @ 2.67GHz GenuineIntel GNU/Linux
>>>>>
>>>>> 12GB RAM
>>>>>
>>>>> perl -V:
>>>>>
>>>>>
>>>>> Summary of my perl5 (revision 5 version 12 subversion 1) configuration:
>>>>>
>>>>> Platform:
>>>>> osname=linux, osvers=2.6.33-gentoo, archname=x86_64-linux
>>>>> uname='linux spectre 2.6.33-gentoo #2 smp tue apr 6 10:24:11 cest
>>>>> 2010 x86_64 intel(r) xeon(r) cpu w3520 @ 2.67ghz genuineintel gnulinux '
>>>>> config_args='-des -Duseshrplib -Darchname=x86_64-linux
>>>>> -Dcc=x86_64-pc-linux-gnu-gcc -Doptimize=-O2 -pipe -march=core2
>>>>> -fomit-frame-pointer -msse4 -msse4.1 -msse4.2 -mcx16 -msahf
>>>>> -Dprefix=/usr -Dsiteprefix=/usr -Dvendorprefix=/usr
>>>>> -Dprivlib=/usr/lib64/perl5/5.12.1
>>>>> -Darchlib=/usr/lib64/perl5/5.12.1/x86_64-linux
>>>>> -Dsitelib=/usr/lib64/perl5/site_perl/5.12.1
>>>>> -Dsitearch=/usr/lib64/perl5/site_perl/5.12.1/x86_64-linux
>>>>> -Dvendorlib=/usr/lib64/perl5/vendor_perl/5.12.1
>>>>> -Dvendorarch=/usr/lib64/perl5/vendor_perl/5.12.1/x86_64-linux
>>>>> -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3
>>>>> -Dsiteman1dir=/usr/share/man/man1 -Dsiteman3dir=/usr/share/man/man3
>>>>> -Dvendorman1dir=/usr/share/man/man1 -Dvendorman3dir=/usr/share/man/man3
>>>>> -Dman1ext=1 -Dman3ext=3pm -Dlibperl=libperl.so.5.12.1 -Dlocincpth=
>>>>> -Duselargefiles -Dd_semctl_semun -Dcf_by=Gentoo -Dmyhostname=localhost
>>>>> -dperladmin=r...@localhost -Dinstallusrbinperl=n -Ud_csh -Uusenm
>>>>> -Di_ndbm -Di_gdbm -Di_db -Dinc_version_list=5.12.0 5.12.0/x86_64-linux
>>>>> -Dusrinc=/usr/include -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64'
>>>>> hint=recommended, useposix=true, d_sigaction=define
>>>>> useithreads=undef, usemultiplicity=undef
>>>>> useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
>>>>> use64bitint=define, use64bitall=define, uselongdouble=undef
>>>>> usemymalloc=n, bincompat5005=undef
>>>>> Compiler:
>>>>> cc='x86_64-pc-linux-gnu-gcc', ccflags ='-fno-strict-aliasing -pipe
>>>>> -fstack-protector -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
>>>>> optimize='-O2 -pipe -march=core2 -fomit-frame-pointer -msse4
>>>>> -msse4.1 -msse4.2 -mcx16 -msahf',
>>>>> cppflags='-fno-strict-aliasing -pipe -fstack-protector'
>>>>> ccversion='', gccversion='4.4.4', gccosandvers=''
>>>>> intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
>>>>> d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
>>>>> ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
>>>>> lseeksize=8
>>>>> alignbytes=8, prototype=define
>>>>> Linker and Libraries:
>>>>> ld='x86_64-pc-linux-gnu-gcc', ldflags =' -fstack-protector'
>>>>> libpth=/usr/local/lib64 /lib64 /usr/lib64
>>>>> libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
>>>>> perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
>>>>> libc=/lib/libc-2.11.2.so, so=so, useshrplib=true,
>>>>> libperl=libperl.so.5.12.1
>>>>> gnulibc_version='2.11.2'
>>>>> Dynamic Linking:
>>>>> dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
>>>>> cccdlflags='-fPIC', lddlflags='-shared -O2 -pipe -march=core2
>>>>> -fomit-frame-pointer -msse4 -msse4.1 -msse4.2 -mcx16 -msahf
>>>>> -fstack-protector'
>>>>>
>>>>>
>>>>> Characteristics of this binary (from libperl):
>>>>> Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
>>>>> USE_64_BIT_ALL
>>>>> USE_64_BIT_INT USE_LARGE_FILES USE_PERLIO
>>>>> USE_PERL_ATOF
>>>>> Locally applied patches:
>>>>> 0001-gentoo_MakeMaker-RUNPATH.diff
>>>>> 0002-gentoo_config__over.diff
>>>>> 0003-gentoo_cpan__definstalldirs.diff
>>>>> 0004-gentoo_cpanplus__definstalldirs.diff
>>>>> 0005-gentoo_create-libperl-soname.diff
>>>>> 0006-gentoo_MakeMaker-delete__packlist.diff
>>>>> 0007-fixes_8d66b3f9__h2hp__fix.diff
>>>>> 0008-fixes_ef9df645__glob__crashes__when__File__Glob__is__empty.diff
>>>>>
>>>>> 0009-fixes_e3d01d03__Naif__calls__segfault__T__PRTOBJ__of__the__stock__typemap.diff
>>>>> Built under linux
>>>>> Compiled at Jul 26 2010 11:18:49
>>>>> @INC:
>>>>> /usr/lib64/perl5/site_perl/5.12.1/x86_64-linux
>>>>> /usr/lib64/perl5/site_perl/5.12.1
>>>>> /usr/lib64/perl5/vendor_perl/5.12.1/x86_64-linux
>>>>> /usr/lib64/perl5/vendor_perl/5.12.1
>>>>> /usr/lib64/perl5/5.12.1/x86_64-linux
>>>>> /usr/lib64/perl5/5.12.1
>>>>> /usr/lib64/perl5/site_perl
>>>>> /usr/lib64/perl5/vendor_perl
>>>>>
>>>>> On 08/04/2010 03:59 AM, Chris Marshall wrote:
>>>>>> On 8/3/2010 9:43 PM, Christian Soeller wrote:
>>>>>>> Is it possible to change things so that 64 bit sizes
>>>>>>> can be passed in the two places you identified and see
>>>>>>> if that works?
>>>>>>> I appreciate that things could still fall over in various
>>>>>>> PP autogenerated code pieces if ints are used for offset
>>>>>>> calculations in slice and other vaffine operations.
>>>>>> It could work. The problem is it needs someone with
>>>>>> a 64bit OS, lots of memory, and a willingness to
>>>>>> debug the issue. I don't have any *large* memory
>>>>>> systems at the moment. Not that I wouldn't like to
>>>>>> have one. :-)
>>>>>>
>>>>>> --Chris
>>>>>>
>>>>>>
>>>>>>> On 4/08/2010, at 12:46 PM, Chris Marshall wrote:
>>>>>>>
>>>>>>>> I took a further look at the SvGROW calls in
>>>>>>>> PDL/Basic/Core routines and the two that I found
>>>>>>>> both use int type for their sizes. That would
>>>>>>>> limit a piddle size to <2**31 or about 2GB.
>>>>>>>>
>>>>>>>> It looks like 64bit support for PDL may need
>>>>>>>> to be added to the list for the future. I don't
>>>>>>>> know the scope of the changes that would be
>>>>>>>> required to support larger PDL data objects.
>>>>>>>>
>>>>>>>> --Chris
>>>>>>>>
>>>>>>>> On 8/3/2010 8:39 PM, Chris Marshall wrote:
>>>>>>>>> On 8/3/2010 8:30 PM, P Kishor wrote:
>>>>>>>>>> On Tue, Aug 3, 2010 at 7:19 PM, Chris Marshall wrote:
>>>>>>>>>>> On 8/3/2010 8:01 PM, P Kishor wrote:
>>>>>>>>>>>> also on 64-bit Snow Leopard (Mac OS X 10.6.4)
>>>>>>>>>>>>
>>>>>>>>>>>> punk...@lucknow ~$ perl -MPDL -e '$PDL::BIGPDL=1; $x = sequence(float, 23171, 23171); print $x->info("%M")."\n"'
>>>>>>>>>>>> perl(85899) malloc: *** mmap(size=18446744071562166272) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>> Out of memory!
>>>>>>>>>>>> punk...@lucknow ~$
>>>>>>>>>>> What is perl -V?
>>>>>>>>> I looked at the PDL/Basic/Core stuff and it looks like
>>>>>>>>> if SvGROW can handle a >2GB string then, in principle,
>>>>>>>>> PDL should be able to handle piddles of that size.
>>>>>>>>>
>>>>>>>>> Could you see if you can create a string more than 2GB
>>>>>>>>> long? It might take a while but it will tell us if the
>>>>>>>>> limit is perl or PDL. Since the PDL routines for growing
>>>>>>>>> a new piddle use 4byte ints for their sizes (rather than
>>>>>>>>> size_t objects), it is pretty clear that there is a bug
>>>>>>>>> in the PDL allocation stuff if perl can handle the longer
>>>>>>>>> strings.
>>>>>>>>>
>>>>>>>>> --Chris
>>>>>>>> _______________________________________________
>>>>>>>> Perldl mailing list
>>>>>>>> [email protected]
>>>>>>>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>