Sun Aug 02 20:04:00 2015: Request 106142 was acted upon.
Transaction: Correspondence added by SLAFFAN
       Queue: Module-ScanDeps
     Subject: [Patch] Preload dependencies for PDL and PDL::NiceSlice
   Broken in: (no value)
    Severity: (no value)
       Owner: Nobody
  Requestors: slaf...@cpan.org
      Status: open
 Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=106142 >


On Sun Aug 02 18:02:33 2015, RSCHUPP wrote:
> On 2015-08-02 00:48:42, SLAFFAN wrote:
> > Instrumenting _glob_in_inc to print to stdout whenever unico[rd]e is
> > passed as the subdir argument has no effect, so I assume the utf8.pm
> > preload sub is not being run for the above preload rules.
> 
> Thanks for investigating. I tried to figure out at what point
> utf8_heavy.pl comes into play. For that I prepended this to your
> sample script:
> 
> BEGIN
> {
>     # insert spy CODE into require's module lookup
>     unshift @INC, sub
>     {
>         my ($self, $pm) = @_;
>         print STDERR "# require $pm\n";
>         my ($package, $filename, $line) = caller;
>         print STDERR "#   from $package ($filename:$line)\n";
>         return;         # i.e. take a pass
>     };
> }
> 
> This intercepts any (explicit or implicit) "require", prints out what
> is required and from where, and then resumes "normal" processing.
> Here's the output:
> 
> # require PDL.pm
> #   from main (/home/roderich/todo/PAR/Module-ScanDeps/shawn.pl:15)
> # require PDL/Core.pm
> #   from main ((eval 1):6)
> # require PDL/Types.pm
> #   from PDL::Core (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Core.pm:223)
> # require Carp.pm
> #   from PDL::Types (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:6)
> # require strict.pm
> #   from Carp (/usr/share/perl/5.22/Carp.pm:4)
> # require warnings.pm
> #   from Carp (/usr/share/perl/5.22/Carp.pm:5)
> # require Exporter.pm
> #   from Carp (/usr/share/perl/5.22/Carp.pm:99)
> # require overload.pm
> #   from PDL::Type (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:428)
> # require overloading.pm
> #   from overload (/usr/share/perl/5.22/overload.pm:83)
> # require warnings/register.pm
> #   from overload (/usr/share/perl/5.22/overload.pm:144)
> # require Exporter/Heavy.pm
> #   from Exporter (/usr/share/perl/5.22/Exporter.pm:16)
> # require PDL/Exporter.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):314)
> # require DynaLoader.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):315)
> # require Config.pm
> #   from DynaLoader (/usr/lib/x86_64-linux-gnu/perl/5.22/DynaLoader.pm:21)
> # require vars.pm
> #   from Config (/usr/lib/x86_64-linux-gnu/perl/5.22/Config.pm:11)
> # require Scalar/Util.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1000)
> # require List/Util.pm
> #   from Scalar::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/Scalar/Util.pm:11)
> # require XSLoader.pm
> #   from List::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/List/Util.pm:21)
> # require utf8.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1028)
> # require utf8_heavy.pl
> #   from utf8 (/usr/share/perl/5.22/utf8.pm:16)
> # require re.pm
> #   from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:4)
> # require unicore/Heavy.pl
> #   from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:185)
> # require unicore/lib/Alpha/Y.pl
> # require PDL/Options.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):3288)
> # require Fcntl.pm
> #   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):4167)
> ...
> 
> utf8.pm and utf8_heavy.pl are actually loaded from PDL::Core.pm.
> The funny "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)" is caused by the
> fact that PDL/Core.pm is a generated file with some
> 
> # line 123 "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)"
> 
> lines in it. And the offending line is
> 
> if $value =~ /e\p{IsAlpha}/ or $value =~ /\p{IsAlpha}e/;
> 
> There's no explicit mention of utf8.pm here - the code uses a Unicode
> property in a regular expression. utf8.pm (at least in Perl 5.22)
> doesn't do anything except set up an AUTOLOAD sub that will require
> utf8_heavy.pl when it is run.
> (If you check $utf8::AUTOLOAD when our @INC spy is called, its value
> is "utf8::SWASHNEW".)
> 
> So the whole utf8_heavy.pl + unico[dr]e shebang is triggered on demand
> whenever some Unicode feature of Perl is requested, e.g. a Unicode
> property in a regex, and probably by lots of others.
> 
> I don't think it's feasible to try to detect this by static analysis.
> Should we just add this stuff (at least 4 MB spread over more than
> 400 files) to _every_ packed executable?
> 
> Cheers, Roderich

Thanks Roderich,

The size issue rears its head once more...

It would also be a Herculean task to get static scanning to detect all such 
cases (although maybe PPI could be leveraged if someone ever has the tuits - 
https://metacpan.org/pod/PPI::Token::Regexp ).  
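
For comparison, a much cruder alternative to PPI would be a line-based check 
that flags any line containing \p{...}, \P{...} or \N{...}.  The sketch below 
is only an illustration: line_hints_at_unicode is a hypothetical helper, not 
something Module::ScanDeps provides, and it will give false positives for 
matches inside comments or plain strings and miss anything built at runtime.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Module::ScanDeps.
# Returns 1 if a source line appears to use a Unicode property
# (\p{...}, \P{...}, or the short form such as \pL) or a named
# character (\N{...}); returns 0 otherwise.
sub line_hints_at_unicode {
    my ($line) = @_;
    return $line =~ / \\ (?: [pP] (?: \{ [^}]+ \} | [A-Z] ) | N \{ [^}]+ \} ) /x
        ? 1
        : 0;
}

# Sample lines taken from this ticket, plus one that should not match.
for my $line (
    'if $value =~ /e\p{IsAlpha}/ or $value =~ /\p{IsAlpha}e/;',
    'my $string = "sp_self_only() and \N{WHITE SMILING FACE}";',
    'print "hello\n";',
) {
    printf "%-3s %s\n", (line_hints_at_unicode($line) ? 'yes' : 'no'), $line;
}
```

The first two lines are flagged and the third is not.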

Perhaps another flag could be added to pp for cases where the code does not 
explicitly ask for Unicode support, but needs it for the packed executable to 
work, e.g. pp --unicode?


I also now think that this is the root cause of an issue I've been working 
around for a while using the code below.  I use the pp -x flag when building, 
and set an environment variable in my script before calling pp.

if ($ENV{BDV_PP_BUILDING}) {
    use 5.016;
    use feature 'unicode_strings';
    my $string = "sp_self_only() and \N{WHITE SMILING FACE}";
    $string =~ /\bsp_self_only\b/;
}

Given that, it should be possible to statically scan for the various 
permutations of /use feature 'unicode_/ to detect unicode_strings and 
unicode_eval.  If someone is using those features in their code then they need 
the extra libraries.
https://metacpan.org/pod/feature#The-unicode_strings-feature

Such scanning would not detect statements that span multiple lines, as per the 
documentation caveats.  A "pp --unicode" style flag would still be needed in 
such cases.
https://metacpan.org/pod/Module::ScanDeps#CAVEATS
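
A minimal sketch of what such a line-based check might look like, assuming 
each "use feature" statement sits on a single line.  unicode_features_in_line 
is a hypothetical name, not an existing Module::ScanDeps function, and the 
sketch ignores feature bundles such as ":5.12" and "use v5.12".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Module::ScanDeps.
# Returns the unicode_* features named on a single-line
# "use feature ..." statement, without duplicates.
sub unicode_features_in_line {
    my ($line) = @_;

    # Only consider lines that start with "use feature".
    return () unless $line =~ /^\s*use\s+feature\b([^;]*)/;
    my $args = $1;

    my %seen;
    return grep { !$seen{$_}++ } $args =~ /\b(unicode_(?:strings|eval))\b/g;
}

for my $line (
    q{use feature 'unicode_strings';},
    q{use feature qw(say unicode_eval unicode_strings);},
    q{use feature 'say';},
) {
    my @found = unicode_features_in_line($line);
    print "$line => ", (@found ? join(', ', @found) : 'none'), "\n";
}
```

A scanner could treat any non-empty result as a reason to add the utf8_heavy.pl 
and unicore files.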


WRT the pp flag, maybe a more general approach would be something that 
parallels the feature pragma, e.g.
pp --feature=unicode_strings,unicode_eval
pp --feature=":5.12"


Regards,
Shawn.
