Re: [Perldl] matching vectors inside a PDL
The specific code was as follows:

    # sigs is byte, with dimensions about 40 x 15 x 26
    # present is byte, with dimension of 15
    $sigs->xchg(0,1)->inplace *= $present;

I had tried numerous ways of using inplace in that line, and none of them avoided the complaint that it had run out of memory (although the memory usage prior to that command was about 10%). So if it's not generating a double intermediate, I don't see why it would run out of memory (it shouldn't have exceeded about 20% or so). I finally got it to work by splitting the structures up into slices of about 20K rows each, and doing the calculation that way. Other approaches?

Ken

-----Original Message-----
From: Chris Marshall [mailto:devel.chm...@gmail.com]
Sent: Wednesday, November 19, 2014 4:23 PM
To: LYONS, KENNETH B (KENNETH)
Subject: Re: [Perldl] matching vectors inside a PDL

re-cc-ing the perldl list

Thanks for the background. If you hit a snag, feel free to post to the perldl list. We're usually able to help for specific problems especially if accompanied by code demonstrating the problem:

    If I have a big byte piddle $a and I multiply it in-place my PDL session crashes because of a huge intermediate temp:

    pdl> $a = (10*random(100))->floor->byte;
    pdl> $a->inplace->mult(5,0);
    Error message here or crash

Without a specific example, I would guess that the problem is the piddle you are multiplying by (or perl scalar) is of type double which would result in an intermediate temp of double type which would then collapse down to a byte piddle again at the end. If both arguments to multiply are of byte type, you can avoid the big double intermediate temp. E.g.

    pdl> p pdl(5)->type
    double
    pdl> p byte(5)->type
    byte

Improved type support is planned for the PDL3 work. My initial ideas for bitfield support can be seen here:
http://mailman.jach.hawaii.edu/pipermail/pdl-porters/2013-December/006132.html

Hope this helps,
Chris

On Wed, Nov 19, 2014 at 1:42 PM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

Chris

In answer to your question: my path in was as follows: I wanted to find a way to implement an LP on a medium-size problem (~10K variables), and the rest of my code was in perl--so I went looking for an LP implementation in perl. I was expecting to find a C-compiled module that would do an LP specifically. I found some instances of that sort of thing, but I also ran across one using PDL. It didn't do quite what I wanted, but when I saw the PDL site, it was obvious this was something I needed to know about. I wound up writing my own simplex implementation in PDL to do specifically what I needed, and that worked great--and I was pretty blown away at the speed. So then I started looking into how I could back up and get the datasets I was dealing with implemented as PDLs to start with. So I've got a good bit of code now using PDL, not just the little simplex program (which was only a few dozen lines--that was pretty easy to implement in PDL.)

I continue to have issues with the documentation, though. Just as one example from today: the mult function seems to claim that you can get it to operate in-place. And for me that was important, because I'm dealing with a large dataset (of byte variables). But, not only does mult by itself cause an error because it isn't exported, but when I try to use it as PDL::Ops::Mult(inplace ...), or PDL::Mult(inplace ...), or as $piddle->inplace->mult(...), it completely fails to avoid generating a large intermediate. That was clobbering my program, repetitively, so I finally punted and decided to break that step up into segments with only a few hundred thousand elements to multiply in each (using slices), and that got me around the problem. But there was nothing in the documentation that seemed to suggest that would be necessary. It also seemed, although I didn't document this carefully, that changing the default PDL type didn't have any impact on the size of that temporary intermediate (I think it was using double no matter what I did--whereas using byte would have been fine.)

I'd love it, in this context, if there were a PDL type of bit by the way, since that's actually what this problem is using--it's a 3D binary matrix, of ones and zeroes, with up to ~3*10^9 elements. When the number of elements goes above ~200M, when I'm using bytes, I have to do things to break it up and process one segment at a time, and it would be nice if that weren't necessary--but there is evidently no implementation of a bit type in PDL.

Ken

-----Original Message-----
From: Chris Marshall [mailto:devel.chm...@gmail.com]
Sent: Saturday, November 15, 2014 11:42 AM
To: Derek Lamb
Cc: LYONS, KENNETH B (KENNETH); perldl
Subject: Re: [Perldl] matching vectors inside a PDL

Hi Ken and welcome to the PDL community!

On Nov 14, 2014, at 1:33 PM, LYONS, KENNETH B
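Chris's point about operand types can be checked with a small script. What follows is only a minimal sketch, assuming a working PDL install; the variable names and the four-element piddle are hypothetical stand-ins for the large byte data discussed in the thread, and it does not reproduce the out-of-memory error itself.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical small stand-in for the large byte piddle in the thread.
my $data = byte([1, 2, 3, 4]);

# pdl() builds a double by default; byte() builds a byte piddle.
my $dbl_five  = pdl(5);
my $byte_five = byte(5);
print "pdl(5)  has type ", $dbl_five->type,  "\n";
print "byte(5) has type ", $byte_five->type, "\n";

# In-place multiply where both operands are byte, as suggested in the thread.
$data->inplace->mult($byte_five, 0);
print "result has type ", $data->type, "\n";
print "result: $data\n";
```

Printing the types before a large operation is a cheap way to confirm what is actually being passed in, rather than assuming it.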
Re: [Perldl] matching vectors inside a PDL
Chris

I didn't know perldl was on my system! It got squirreled away in the perl directory, outside my path. I found it with locate. Here's the output:

Summary of my PDL configuration

VERSION: PDL v2.007 (supports bad values)

$%PDL::Config = {
    'BADVAL_PER_PDL' => '0',
    'WITH_PROJ' => '0',
    'PDL_CONFIG_VERSION' => '0.005',
    'POSIX_THREADS_INC' => undef,
    'FFTW_TYPE' => 'double',
    'PDL_BUILD_DIR' => '/home/vip/.cpan/build/PDL-2.007',
    'FFTW_LIBS' => undef,
    'WITH_FFTW' => '0',
    'GSL_LIBS' => undef,
    'WITH_IO_BROWSER' => '0',
    'PROJ_INC' => undef,
    'WHERE_PLPLOT_INCLUDE' => undef,
    'HTML_DOCS' => '1',
    'SKIP_KNOWN_PROBLEMS' => '0',
    'WHERE_PLPLOT_LIBS' => undef,
    'WITH_3D' => '0',
    'WITH_POSIX_THREADS' => '1',
    'POGL_VERSION' => '0.6702',
    'FFTW_INC' => undef,
    'HIDE_TRYLINK' => '1',
    'HDF_INC' => undef,
    'WITH_HDF' => '0',
    'POGL_WINDOW_TYPE' => 'glut',
    'WITH_GD' => '1',
    'WITH_BADVAL' => '1',
    'FITS_LEGACY' => '1',
    'WITH_SLATEC' => '0',
    'BADVAL_USENAN' => '0',
    'WITH_DEVEL_REPL' => '1',
    'TEMPDIR' => '/tmp',
    'PROJ_LIBS' => undef,
    'USE_POGL' => '0',
    'PDL_BUILD_VERSION' => '2.007',
    'GD_LIBS' => undef,
    'GSL_INC' => undef,
    'GD_INC' => undef,
    'WITH_GSL' => '0',
    'OPTIMIZE' => undef,
    'PDLDOC_IGNORE_AUTOLOADER' => '0',
    'HDF_LIBS' => undef,
    'POSIX_THREADS_LIBS' => undef,
    'MALLOCDBG' => {},
    'WITH_MINUIT' => '0',
    'WITH_PLPLOT' => '0',
    'MINUIT_LIB' => undef
};

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.9-55.0.2.elsmp, archname=x86_64-linux
    uname='linux x.removed.by.me 2.6.9-55.0.2.elsmp #1 smp tue jun 26 14:14:47 edt 2007 x86_64 x86_64 x86_64 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-8)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

-----Original Message-----
From: Chris Marshall [mailto:devel.chm...@gmail.com]
Sent: Friday, November 21, 2014 9:35 AM
To: LYONS, KENNETH B (KENNETH)
Cc: Derek Lamb; perldl
Subject: Re: [Perldl] matching vectors inside a PDL

Hi Ken-

I am unable to generate the error with PDL-2.007 either. My system has 8GiB of memory and the PDL build is using the 64bit index support. What are the specs of your linux box, and could you please send the output of the 'perldl -V' command? If you built PDL from sources, then the build log should indicate whether 64bit index support was enabled. If your $Config{ivsize} is 8 then you should have 64bit support as well.

--Chris

On Fri, Nov 21, 2014 at 8:02 AM, Chris Marshall devel.chm...@gmail.com wrote:

Ken-

If you could make a short script that generates the problem along with the output/error messages, that would help. Do you have $PDL::BIGPDL set? Might try with that set to 1. I'll try the problem code on PDL-2.007 to see if that is the reason for the differences.

--Chris

On Thu, Nov 20, 2014 at 6:18 PM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

Chris

I'm running perl
Re: [Perldl] matching vectors inside a PDL
Ken-

You should also have pdl2 on your system. If you have the needed prerequisite module Devel::REPL installed, then pdl2 will give you the new PDL shell; otherwise, it falls back to the perldl shell transparently.

I don't see anything funny, so the next step is to get a short code snippet that reproduces the problem and error for us to try. If this is a bug, I would like to fix it, but we'll need to be able to reproduce the problem. Please send information on your system: uname -a output, amount of memory,...

--Chris
Re: [Perldl] matching vectors inside a PDL
Uname doesn't give the amount of memory:

    2.6.9-103.ELsmp #1 SMP Fri Dec 9 04:43:08 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

But /usr/bin/top shows it's 4 GB (which I think is buried down in the thread below somewhere).

I set up some code to reproduce it, and discovered what the problem was. The simple operation using operands that were explicitly generated as byte worked fine. The operand $present, in what I had shown you before, though, was generated by a line like this (where $sigs is a byte PDL):

    $present = ((sumover($sigs>0)->xchg(0,1)->sumover) > ($ndays*0.9));

And it seems that it winds up as type *double* because of the *comparison* to a float value! If I replace ($ndays*0.9) with int($ndays*0.9), then it generates the operand as Long D. Which also fails, of course. This behavior is something I would regard as a bug, since I'd expect the result to be of type byte if it started that way and the only operation was a comparison, but the folks who have specified the language have to decide how they want it to work. But operationally, is there a way to force that intermediate to stay as byte?

I also tried this, by the way, to see if using a byte PDL as the comparison operand would change things:

    $test = zeros byte, 3;
    $present = ((sumover($sigs)>0)->xchg(0,1)->sumover) > $test    # now the operand is explicitly of type byte!

And the result is again Long D. Now that really looks like a bug to me. It turns out that even just sumover($sigs)>0 produces a result that is Long D. Is there any way to control that behavior?

Ken
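One way to answer the "force it to stay byte" question is to convert the mask explicitly once it has been computed. The following is a minimal sketch under that assumption, not a statement about how any particular PDL version promotes types; the data and the threshold are hypothetical, and the intermediate type is printed rather than assumed.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical toy data: a small byte piddle of ones and zeroes, dims (4,3).
my $sigs = byte([[1, 0, 1, 1], [0, 0, 0, 1], [1, 1, 1, 1]]);

# Summing a comparison may produce a wider type; print it instead of guessing.
my $counts = sumover($sigs > 0);
print "counts has type ", $counts->type, "\n";

# Convert the final mask explicitly so later arithmetic sees a byte operand.
my $present = byte($counts > 2);
print "present has type ", $present->type, "\n";
print "present: $present\n";
```

The intermediate $counts may still be wider than byte, but it is small compared with $sigs; the explicit byte() only guarantees the type of the mask that is later multiplied into the large piddle.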
Re: [Perldl] matching vectors inside a PDL
Chris

I'm running perl 5.8.8 on a rather old linux system. I installed the perl modules rather recently from the PDL site, so I'd expect they are up to date with whatever is there. From the names of the files, I'd say it's 2.007.

I've tried a variety of ways of using the inplace method, and none of them produced a perl error akin to what you got below. The errors were coming out of the PDL module itself, complaining about the size of the piddle being over 1GB. Given the dimensions of the piddle that is being calculated (around 200 MB), that shouldn't have happened--unless it's using doubles, which would make it ~1.6 GB. Like I said, I got around the problem in kind of a hack, by just slicing things up 20K rows at a time--but I'd really like to find a way to do it right!

Among the things I tried were these:

    $sigs->xchg(0,1) *= $present;
    $sigs->xchg(0,1)->inplace->mult($present,0);
    PDL::Ops::mult(inplace $sigs->xchg(0,1), $present, 0);
    $sigs->xchg(0,1)->inplace *= $present;

None of which got around the error. Below is what finally worked (but only by occupying more memory than it should):

    ($psize) = $present->dims;
    $STEPSIZE = 2;
    for ($p = 0; $p < $psize; $p += $STEPSIZE) {
        # note: it's known that $present and $sigs have the same size!
        my $start = $p;
        my $end = $start+$STEPSIZE-1;
        $end = $psize-1 if $end >= $psize;
        $sigs->xchg(0,1)->slice("$start:$end,:,:") *= $present->slice("$start:$end");
    }

Like I said, it's a bit of a hack. But it does wind up doing the appropriate filtering on the $sigs matrix.

Ken

p.s. I don't know if it makes a difference, per se, but you are evidently operating in an interactive environment, not an actual perl script. I'm using this to automate through a very large body of data, eventually to be run automatically on a daily basis, so it's written as a script that calls the PDL modules. The error I refer to above was appearing in the error output of the perl command.
KL

Below is the remainder of the thread that was mostly sidebar:

-----

Hi Ken-

You could sync up with the message I forwarded to perldl by replying with this message to that thread. The main reason for keeping the discussion on the list is so that others can benefit from the discussion and/or offer other points of view/facts/...

I tried the following in pdl2 and was not able to generate an error. You are right that all byte args shouldn't be expanded to double intermediates. I'm using PDL-2.007_03 on cygwin64/win7 and the *= works fine but I get an error with the inplace construct (not the same as yours)

    pdl> $sigs = (10*random(40,15,26))->floor->byte
    pdl> $present = (20*random(15))->floor->byte
    pdl> $ns = $sigs->copy
    pdl> ?vars
    PDL variables in package main::
    Name       Type   Dimension      Flow  State   Mem
    $ns        Byte   D [40,15,26]         P       148.77MB
    $present   Byte   D [15]               P       0.14MB
    $sigs      Byte   D [40,15,26]         P       148.77MB
    pdl> $sigs->xchg(0,1) *= $present   # works
    pdl> $sigs = $ns->copy
    pdl> $sigs->xchg(0,1)->inplace *= $present
    Runtime error: Can't modify non-lvalue subroutine call at (eval 484) line 5.

What is your os/platform specs and what version of PDL are you using?

--Chris

On Thu, Nov 20, 2014 at 2:47 PM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

(Didn't understand your first line, as there was no cc on this message? I pretty much automatically avoid ever using reply-all, but I guess in this case that's how it's supposed to work, right? How do I cc it to get the thread to match up?)

Actually, all the pdls involved are byte type. I was assuming when I saw the errors occurring that it was somehow generating a double intermediate, because it should have had plenty of room if it stayed as byte.
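The chunked workaround described in this message can be sketched as a self-contained script. This is only an illustration under stated assumptions: the sizes, the mask values, and the names $view, $step and $chunk are hypothetical, and the slice is held in a named variable before the assignment-multiply so that the example does not depend on lvalue method calls.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical toy sizes; the piddles in the thread are far larger.
my $sigs    = ones(byte, 2, 5, 3);        # byte piddle, dims (2,5,3)
my $present = byte([1, 0, 1, 1, 0]);      # byte mask along dim 1 of $sigs

my $view  = $sigs->xchg(0, 1);            # dims (5,2,3), shares data with $sigs
my $psize = $present->dim(0);
my $step  = 2;                            # chunk size, chosen only for the demo

for (my $p = 0; $p < $psize; $p += $step) {
    my $end = $p + $step - 1;
    $end = $psize - 1 if $end >= $psize;  # clamp the final chunk
    my $chunk = $view->slice("$p:$end,:,:");
    $chunk *= $present->slice("$p:$end"); # writes through to $sigs
}

print "type after masking: ", $sigs->type, "\n";
print "nonzero elements left: ", $sigs->sum, "\n";
```

Because $view and $chunk are views onto $sigs rather than copies, each pass only touches the selected rows, which is what keeps the working set small.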
Re: [Perldl] matching vectors inside a PDL
moin Kenneth,

you might be interested in the PDL::VectorValued module (https://metacpan.org/pod/PDL::VectorValued), which actually includes a vsearchvec() which ought to do what you want in O(log n), provided your vector-list is sorted.

marmosets,
Bryan

On Fri, Nov 14, 2014 at 2:20 AM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

I need to be able to match a vector inside a PDL, and can’t find a way to do it. The existence of qsortvec and uniqvec functions implies that such a comparison function exists (since you’d need to do that to sort) but the documentation doesn’t give any info on it.

More specifically, if I have an nxm PDL $P, containing vectors of length n in the first dimension, and an nx1 PDL representing a test vector, $test, I want to be able to get the indices along the 2nd dimension where the vector in the PDL matches the test one. I would expect that such a function, which I’ll provisionally name findveci, would operate as

    $findresult = $P->findveci($test)

where $findresult would be a 1-dimensional PDL giving the set of indices along the second dimension of $P that match the vector $test.

I should note that a similar purpose would be served by a function uniqveci (which, although an obvious extension of the set that *are* available, also seems not to exist), since you could combine that with qsortvec to do what I’m talking about.

At present, I’ve resorted to pulling the vectors into perl lists and doing the matching there. But that’s far slower, and it seems wrong to have to do it that way. Any suggestions?

___
Perldl mailing list
Perldl@jach.hawaii.edu
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl

--
Bryan Jurish                 There is *always* one more bug.
moocow.bov...@gmail.com      -Lubarsky's Law of Cybernetic Entomology
Re: [Perldl] matching vectors inside a PDL
Hey Kenneth, FYI, when you type ?? list at the pdl prompt (or pdldoc -a list at your system's command prompt), it'll give a list of every occurrence of the word list. As you say, this is useless for the word list, but is useful for other words that are less common. BUT, if you know the name of your function, the just get the documentation itself with a single ?, i.e. ? list (or simply pdldoc list at your system's command prompt). Hope that helps! David On Fri, Nov 14, 2014 at 5:31 PM, Derek Lamb de...@boulder.swri.edu wrote: Hi Ken, Please cc the list on your replies—others may have more insight than I. == does compare element-by-element. If you pass in a simple Perl scalar ( p $P==4 ), you will get a piddle of 1s where $P is 4. If you pass in a 1-element piddle ( p $P==pdl(4) ), you will get the same thing. If you pass in a vector $test whose dimension #0 matches the dimension #0 of $P (and any subsequent dimensions are of size 1), then it will do an element-by-element comparison of each element of $P with the corresponding element in $test. Notice in the intermediate results I gave, there are some rows that have all 1s (where all elements of that row of $P are equal to the corresponding element in $test), there are some rows that have all 0s (where no elements of that row of $P are equal to the corresponding element in $test), but that there are ALSO some rows that have one or two 1s, where that element of $P matched with the corresponding element of $test, but others did not. The test for VECTOR EQUALITY was done with the sumover(stuff) == 3 or andover() functions. So in this sense ==, , =, , and = all function exactly the same way. Re: documentation. There are two commands, help (aliased to ? in the pdl shell), and apropos (aliased to ?? in the pdl shell). help is akin to UNIX man, apropos is akin to UNIX apropos. What are you doing to get several hundred entries for a query of 'list'? 
In the pdl shell, using apropos I get 30 entries, and doing ?list brings me right to the documentation for the PDL function list. cheers, Derek On Nov 14, 2014, at 1:33 PM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote: Yes, most of this I knew, but thanks. It’s because of that behavior of and , that you mentioned, that I thought that ‘==’ would compare element by element instead of on the whole vector. Have you ever tried, for example, to search the documentation for, say, the function “list”? it gives you every occurrence of the word “list” in the documents (which, needless to say, is rather voluminous, and the first few hundred entries have nothing to do with the function!) there should be some analog of the “man” command in unix that gives you information about the **function** without all the other garbage. I think it’s just doing something akin to a grep thru the documents. It’s horribly designed in that regard. The software itself is great, and I’m very happy with the results, but finding the simplest little thing in the docs can be a total pain! Ken *From:* Derek Lamb [mailto:de...@boulder.swri.edu de...@boulder.swri.edu ] *Sent:* Friday, November 14, 2014 2:16 PM *To:* LYONS, KENNETH B (KENNETH) *Cc:* perldl *Subject:* Re: [Perldl] matching vectors inside a PDL No problem. Glad to help. The documentation for the basic PDL operators is in PDL::Ops, which you can get by doing pdl ?ops That will show you all the basic math operators, most of which have both named functions (e.g., 'divide'), overloaded operators (e.g., '/'), and can modify the original piddle in place if you use the 'inplace' syntax. So as long as your piddles have the same dimensions, or are at least thread-compatible (i.e., have the same 0th dimension, which is what the example I gave you did), it will work. If you play with '', and '' you'll see that the test is done element-by-element, just like it was for '==' (so it's not a lexical comparison). 
If you want to know if all elements of $P are less than the corresponding element of $test, then you'll need to collapse along the 0th dimension. Same for if you want to know if only some of the elements of $P are less than $test.

    pdl> ??sort

or

    pdl> apropos sort

searches the documentation for functions whose name or description matches 'sort'. If you're new to PDL, check out the PDL Book, which you can download from pdl.perl.org. That and the First Steps document should be enough to give you the lay of the land. As you've noticed, the mailing list is also pretty responsive.

cheers,
Derek

On Nov 14, 2014, at 11:37 AM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

Hey, thanks. This looks exactly right. I hadn’t realized that the == operator would work with **2** PDL operands! (boy do I wish this stuff had better documentation!). I’ll try it out, but from the looks of what you presented here, this looks like just the ticket. Do you know if you can
Re: [Perldl] matching vectors inside a PDL
No problem. Glad to help.

The documentation for the basic PDL operators is in PDL::Ops, which you can get by doing

    pdl> ?ops

That will show you all the basic math operators, most of which have both named functions (e.g., 'divide') and overloaded operators (e.g., '/'), and can modify the original piddle in place if you use the 'inplace' syntax. So as long as your piddles have the same dimensions, or are at least thread-compatible (i.e., have the same 0th dimension, which is what the example I gave you did), it will work. If you play with '<' and '>' you'll see that the test is done element-by-element, just like it was for '==' (so it's not a lexical comparison). If you want to know if all elements of $P are less than the corresponding element of $test, then you'll need to collapse along the 0th dimension. Same for if you want to know if only some of the elements of $P are less than $test.

    pdl> ??sort

or

    pdl> apropos sort

searches the documentation for functions whose name or description matches 'sort'. If you're new to PDL, check out the PDL Book, which you can download from pdl.perl.org. That and the First Steps document should be enough to give you the lay of the land. As you've noticed, the mailing list is also pretty responsive.

cheers,
Derek

On Nov 14, 2014, at 11:37 AM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

Hey, thanks. This looks exactly right. I hadn’t realized that the == operator would work with *2* PDL operands! (boy do I wish this stuff had better documentation!). I’ll try it out, but from the looks of what you presented here, this looks like just the ticket. Do you know if you can do the same with the operator? The qsortvec function sorts vectors in lexical order, so I’d guess that if == works then the < and > operators probably work with 2 PDL operands as well (again assuming lexical ordering)?

Ken

p.s. Interesting that you’re the one who replied. I got my physics PhD at Boulder, eons ago.
Ken

From: Derek Lamb [mailto:de...@boulder.swri.edu]
Sent: Friday, November 14, 2014 1:26 PM
To: LYONS, KENNETH B (KENNETH)
Cc: perldl@jach.hawaii.edu
Subject: Re: [Perldl] matching vectors inside a PDL

Hi Kenneth,

I did this. The last line has what you're looking for in one line, but the stuff leading up to it shows my thought process:

    pdl> $P = rint(random(3,10)*5)
    pdl> p $P
    [
     [4 1 4]
     [5 4 2]
     [1 2 2]
     [0 3 0]
     [1 1 2]
     [2 1 2]
     [4 0 1]
     [4 1 4]
     [0 1 4]
     [4 2 3]
    ]
    pdl> $test = pdl(4,1,4) #turns out that [4 1 4] turns up twice, so I'll just pick that for now
    pdl> p $P==$test
    [
     [1 1 1]
     [0 0 0]
     [0 0 0]
     [0 0 0]
     [0 1 0]
     [0 1 0]
     [1 0 0]
     [1 1 1]
     [0 1 1]
     [1 0 0]
    ]
    pdl> p sumover($P==$test)
    [3 0 0 0 1 1 1 3 2 1]
    pdl> p sumover($P==$test)==$P->dim(0)
    [1 0 0 0 0 0 0 1 0 0]
    pdl> p $findresult = which(sumover($P==$test)==$P->dim(0))
    [0 7]

Is that what you're looking for? Actually, a little cleaner way is to do

    pdl> p $findresult = which(andover($P==$test))
    [0 7]

cheers,
Derek

On Nov 13, 2014, at 6:20 PM, LYONS, KENNETH B (KENNETH) k...@research.att.com wrote:

I need to be able to match a vector inside a PDL, and can’t find a way to do it. The existence of qsortvec and uniqvec functions implies that such a comparison function exists (since you’d need to do that to sort) but the documentation doesn’t give any info on it.

More specifically, if I have an nxm PDL $P, containing vectors of length n in the first dimension, and an nx1 PDL representing a test vector, $test, I want to be able to get the indices along the 2nd dimension where the vector in the PDL matches the test one. I would expect that such a function, which I’ll provisionally name findveci, would operate as

    $findresult = $P->findveci($test)

where $findresult would be a 1-dimensional PDL giving the set of indices along the second dimension of $P that match the vector $test.
I should note that a similar purpose would be served by a function uniqveci (which, although an obvious extension of the set that are available, also seems not to exist), since you could combine that with qsortvec to do what I’m talking about. At present, I’ve resorted to pulling the vectors into perl lists and doing the matching there. But that’s far slower, and it seems wrong to have to do it that way. Any suggestions?

___
Perldl mailing list
Perldl@jach.hawaii.edu
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
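
For reference, the approach worked out in this thread can be collected into a short script. This is only a minimal sketch: findveci is the provisional name from the original question and is defined here as a hypothetical helper, not an existing PDL function, and the data are the rows printed in Derek's session, typed in by hand rather than generated with random. The last two lines illustrate the point about collapsing along the 0th dimension for '<', using andover for "all elements" and orover for "some elements".

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper (name borrowed from the question; not part of PDL).
# Returns the indices along dim 1 of $P whose vectors equal $test.
sub findveci {
    my ($P, $test) = @_;
    # $P == $test threads $test over every row of $P and gives a 0/1
    # piddle shaped like $P. andover collapses dim 0, so a row yields 1
    # only if all of its elements matched. which returns those indices.
    return which(andover($P == $test));
}

# The same rows that appear in the session above.
my $P = pdl([4,1,4], [5,4,2], [1,2,2], [0,3,0], [1,1,2],
            [2,1,2], [4,0,1], [4,1,4], [0,1,4], [4,2,3]);
my $test = pdl(4, 1, 4);

my $match = findveci($P, $test);
print "rows equal to $test: $match\n";    # [0 7], as in the session above

# '<' is also element-by-element, so collapse along dim 0 to get a
# per-row answer.
my $all_less = which(andover($P < $test));   # every element is less
my $any_less = which(orover($P < $test));    # at least one element is less
print "rows with all elements less: $all_less\n";
print "rows with some element less: $any_less\n";
```

Neither comparison is lexical; both are tests on each element, reduced per row afterwards.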