I am no expert in PDL, but your use case looks specialized enough to warrant
custom preprocessing of your own. I was not even aware that PDL::Char existed
until today.
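
For what it's worth, here is a minimal sketch of the kind of preprocessing I
mean. It is plain Perl and untested against your real data. pad_rows is a
made-up helper name, not something PDL provides, and using three spaces as
the "missing" marker is purely my assumption, chosen so that every string
keeps the same length as the ones in your example.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of PDL): returns a padded copy of a ragged
# array of rows so that every row is as long as the longest one.
# The input rows are left unmodified.
sub pad_rows {
    my ($rows, $filler) = @_;
    my $width = max(map { scalar @$_ } @$rows);
    return [ map { [ @$_, ($filler) x ($width - @$_) ] } @$rows ];
}

my $ragged = [ ["abc", "def", "ghi"], ["jkl", "mno", "pqr", "stu"] ];

# Assumption: three spaces stand in for a missing value.
my $padded = pad_rows($ragged, "   ");

# Show the now-rectangular rows, one per line.
print join("|", @$_), "\n" for @$padded;
```

With every row the same length, PDL::Char->new($padded) should get past the
"Array is not rectangular" check, though I have not measured whether this is
any faster than the loop you already have.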
On Sat, Oct 28, 2017 at 5:15 PM, Adam Russell <[email protected]>
wrote:
> Here is some code that reproduces the behavior I would like to work around:
>
>
> use PDL;
> use PDL::Char;
>
> my $pchar = PDL::Char->new([["abc", "def", "ghi"],
>                             ["jkl", "mno", "pqr", "stu"]]);
>
>
> This will give the error:
>
> Array is not rectangular at /opt/local/lib/perl5/vendor_perl/5.24/darwin-thread-multi-2level/PDL/Char.pm line 117.
>
> PDL::Char::_rcharpack(ARRAY(0x7f892382bf88), SCALAR(0x7f8923d812a8), SCALAR(0x7f8923d812c0)) called at /opt/local/lib/perl5/vendor_perl/5.24/darwin-thread-multi-2level/PDL/Char.pm line 125
>
> PDL::Char::_rcharpack(ARRAY(0x7f892382c090), SCALAR(0x7f8923d812a8), SCALAR(0x7f8923d812c0)) called at /opt/local/lib/perl5/vendor_perl/5.24/darwin-thread-multi-2level/PDL/Char.pm line 87
>
> PDL::Char::new("PDL::Char", ARRAY(0x7f892382c090)) called at
>
>
> Sergey, this has something to do with something we corresponded about
> elsewhere, related to your most excellent AI::MXNet work. 😃
>
> I have a massive data file which has a mix of numeric and string
> data. Basically everything is processed in a PDL::Char, and then eventually
> everything is converted to a floating point representation (typically via a
> one-hot encoding of the string/categorical data). I subclass
> AI::MXNet::DataIter and use that to feed data into my model. I'd like to
> shave off some of the pre-processing time spent making the rows in
> the data file the same length to keep the PDL::Char constructor
> happy.
> ------------------------------
> *From:* Adam Russell <[email protected]>
> *Sent:* Saturday, October 28, 2017 4:51:36 PM
> *To:* A B
> *Cc:* [email protected]
> *Subject:* Re: [Pdl-general] automatically filling in missing values on
> old creation?
>
>
> Actually, I am calling new PDL::Char(). I was on mobile earlier
> and some details got lost...
> ------------------------------
> *From:* A B <[email protected]>
> *Sent:* Saturday, October 28, 2017 4:01:29 PM
> *To:* Adam Russell
> *Cc:* [email protected]
> *Subject:* Re: [Pdl-general] automatically filling in missing values on
> old creation?
>
> I always thought that this was the default behavior of the pdl
> constructor:
> developer@devbox:~$ perl -e 'use PDL; print pdl([[1],[1,2],[1,2,3],[1]])'
>
> [
> [1 0 0]
> [1 2 0]
> [1 2 3]
> [1 0 0]
> ]
>
>
> On Sat, Oct 28, 2017 at 12:33 PM, Adam Russell <[email protected]>
> wrote:
>
>> I have a data file in which each row has a variable number of columns. I
>> am reading this file into a regular Perl array first in order to do some
>> basic pre-processing of the data. I am presently adding 0s to pad each
>> row to be as long as the longest row, so that I have a rectangular array. I
>> then pass the array to pdl() and things seem OK.
>>
>> Is there an easier way to do this, though? A way to skip writing my own
>> loop to pad the array first? That is, a way to take the irregularly shaped
>> array and have pdl make it rectangular for me?
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> pdl-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/pdl-general
>>
>>
>