Sun Aug 02 18:02:33 2015: Request 106142 was acted upon.
Transaction: Correspondence added by RSCHUPP
       Queue: Module-ScanDeps
     Subject: [Patch] Preload dependencies for PDL and PDL::NiceSlice
   Broken in: (no value)
    Severity: (no value)
       Owner: Nobody
  Requestors: slaf...@cpan.org
      Status: open
 Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=106142 >


On 2015-08-02 00:48:42, SLAFFAN wrote:
> Instrumenting _glob_in_inc to print to stdout whenever unico[rd]e is
> passed as the subdir argument has no effect, so I assume the utf8.pm
> preload sub is not being run for the above preload rules.

Thanks for investigating. I tried to figure out at what point utf8_heavy.pl
comes into play. To do that, I prepended this to your sample script:

BEGIN
{
    # insert spy CODE into require's module lookup
    unshift @INC, sub 
    {
        my ($self, $pm) = @_;
        print STDERR "# require $pm\n";
        my ($package, $filename, $line) = caller;
        print STDERR "#   from $package ($filename:$line)\n";
        return;         # i.e. take a pass
    };
}

This intercepts any (explicit or implicit) "require", prints out what is
required and from where, and then resumes "normal" processing. Here's the output:

# require PDL.pm
#   from main (/home/roderich/todo/PAR/Module-ScanDeps/shawn.pl:15)
# require PDL/Core.pm
#   from main ((eval 1):6)
# require PDL/Types.pm
#   from PDL::Core (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Core.pm:223)
# require Carp.pm
#   from PDL::Types (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:6)
# require strict.pm
#   from Carp (/usr/share/perl/5.22/Carp.pm:4)
# require warnings.pm
#   from Carp (/usr/share/perl/5.22/Carp.pm:5)
# require Exporter.pm
#   from Carp (/usr/share/perl/5.22/Carp.pm:99)
# require overload.pm
#   from PDL::Type (/usr/lib/x86_64-linux-gnu/perl5/5.22/PDL/Types.pm:428)
# require overloading.pm
#   from overload (/usr/share/perl/5.22/overload.pm:83)
# require warnings/register.pm
#   from overload (/usr/share/perl/5.22/overload.pm:144)
# require Exporter/Heavy.pm
#   from Exporter (/usr/share/perl/5.22/Exporter.pm:16)
# require PDL/Exporter.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):314)
# require DynaLoader.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):315)
# require Config.pm
#   from DynaLoader (/usr/lib/x86_64-linux-gnu/perl/5.22/DynaLoader.pm:21)
# require vars.pm
#   from Config (/usr/lib/x86_64-linux-gnu/perl/5.22/Config.pm:11)
# require Scalar/Util.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1000)
# require List/Util.pm
#   from Scalar::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/Scalar/Util.pm:11)
# require XSLoader.pm
#   from List::Util (/usr/lib/x86_64-linux-gnu/perl/5.22/List/Util.pm:21)
# require utf8.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):1028)
# require utf8_heavy.pl
#   from utf8 (/usr/share/perl/5.22/utf8.pm:16)
# require re.pm
#   from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:4)
# require unicore/Heavy.pl
#   from utf8 (/usr/share/perl/5.22/utf8_heavy.pl:185)
# require unicore/lib/Alpha/Y.pl
# require PDL/Options.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):3288)
# require Fcntl.pm
#   from PDL::Core (Basic/Core/Core.pm.PL (i.e. PDL::Core.pm):4167)
...

utf8.pm, and through it utf8_heavy.pl, are actually loaded from PDL::Core.pm.
The funny "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)" is caused by the fact
that PDL/Core.pm is a generated file with some 

# line 123 "Basic/Core/Core.pm.PL (i.e. PDL::Core.pm)"

lines in it. And the offending line is

      if $value =~ /e\p{IsAlpha}/ or $value =~ /\p{IsAlpha}e/;

There's no explicit mention of utf8.pm here - the code uses a Unicode property
in a regular expression. utf8.pm (at least in Perl 5.22) doesn't do anything
except set up an AUTOLOAD sub that will require utf8_heavy.pl when run.
(If you check $utf8::AUTOLOAD when our @INC spy is called, its value is
"utf8::SWASHNEW".)

So the whole utf8_heavy.pl + unico[dr]e shebang is triggered on demand whenever
some Unicode feature of Perl is requested, e.g. a Unicode property in a regex,
and probably lots of others.
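
To see this on-demand loading in isolation, without PDL, something like the
following sketch could be used. It reuses the @INC spy from above but records
each require together with $utf8::AUTOLOAD instead of printing right away.
The names @seen and $hit are made up for illustration, and which files show
up (if any) depends on the Perl version: the trace above is from 5.22, and
other perls may load different files or none at all.

```perl
use strict;
use warnings;

our @seen;    # one entry per intercepted require (hypothetical name)

BEGIN {
    # same spy as above, but recording instead of printing
    unshift @INC, sub {
        my ($self, $pm) = @_;
        my ($package, $filename, $line) = caller;
        no warnings 'once';
        push @seen, {
            file     => $pm,
            from     => "$package ($filename:$line)",
            autoload => $utf8::AUTOLOAD // '(unset)',
        };
        return;    # take a pass, let the normal @INC lookup continue
    };
}

# No "use utf8" anywhere: only a Unicode property in a regex.
our $hit = "1ex" =~ /e\p{IsAlpha}/ ? 1 : 0;

for my $r (@seen) {
    print STDERR "# require $r->{file}\n";
    print STDERR "#   from $r->{from}, \$utf8::AUTOLOAD = $r->{autoload}\n";
}
```

If utf8_heavy.pl appears in that list, it got there purely because of the
\p{IsAlpha} in the pattern, which is what makes it so hard to spot by
scanning for "use" or "require" statements.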

I don't think it's feasible to try to detect this by static analysis.
Should we just add this stuff (at least 4 MB spread over more than 400 files)
to _every_ packed executable?
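
For anyone who wants to check the cost on their own installation, here is a
rough sketch that counts the files and bytes below every "unicore" directory
found in @INC. The helper name tree_size is made up for illustration, and
the numbers will differ between Perl versions and distributions.

```perl
use strict;
use warnings;
use File::Find ();
use File::Spec ();

# Return (file count, total bytes) for all plain files below $dir.
sub tree_size {
    my ($dir) = @_;
    my ($count, $bytes) = (0, 0);
    File::Find::find(sub {
        return unless -f $_;
        $count++;
        $bytes += -s _;    # reuse the stat buffer from the -f test
    }, $dir);
    return ($count, $bytes);
}

# Skip CODE refs and other hooks in @INC, look only at real directories.
for my $inc (grep { !ref } @INC) {
    my $dir = File::Spec->catdir($inc, 'unicore');
    next unless -d $dir;
    my ($count, $bytes) = tree_size($dir);
    printf "%s: %d files, %.1f MB\n", $dir, $count, $bytes / (1024 * 1024);
}
```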

Cheers, Roderich

