My goal: I want to create something similar to a phone directory. The
page hosts a large number of documents in PDF format, so I want to combine
them and be able to filter by name, last name, and location. If I do this
manually, I have to download and open each document and search for the
person's name by hand.

The documents uploaded to this page differ by year and date, and they
contain different information.
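
To make the goal concrete, here is a minimal sketch of the lookup step only.
It assumes the text of each PDF has already been extracted into a string
(for example with an external tool such as pdftotext; that step is not
shown). The helper name find_names, the file names, and the sample people
are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes a hash ref of { file name => extracted text }
# and an array ref of names; returns { name => [files that mention it] }.
sub find_names {
    my ($text_by_file, $names) = @_;
    my %found;
    for my $name (@$names) {
        # Case-insensitive literal match; \Q...\E escapes regex metacharacters.
        my $pattern = qr/\Q$name\E/i;
        $found{$name} = [
            grep { $text_by_file->{$_} =~ $pattern } sort keys %$text_by_file
        ];
    }
    return \%found;
}

# Made-up sample data standing in for text extracted from the PDFs.
my %text_by_file = (
    'GACETA_A.pdf' => "Juan Perez, Managua\nMaria Lopez, Leon\n",
    'GACETA_B.pdf' => "Maria Lopez, Granada\n",
);

my $found = find_names(\%text_by_file,
                       ['Maria Lopez', 'Juan Perez', 'Ana Ruiz']);

# Report which file contains which name, or that the name was not found.
for my $name (sort keys %$found) {
    my @files = @{ $found->{$name} };
    print "$name: ", (@files ? join(', ', @files) : 'not found'), "\n";
}
```

A plain substring match like this will miss names that are split across
lines or written with different accents in the PDF, so the extracted text
would probably need some normalizing first.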


On Wed, Jul 20, 2022 at 7:04 PM Mike <te...@mflan.com> wrote:

>
> I'm going to be traveling, so will not be able to help much
> in the next 2 days.
>
> That is a PDF file you supplied.  Is it fair to say you want to
> be able to search for all the names listed in a text file and
> print out which file contains which name, and that in some
> cases the name will not be in any of the files?  Is that the goal?
>
> Define your goal and we will help you.
>
>
> The file below is a bit old, but maybe it works for your
> PDF files.  I have not tested it on your url.  I gather
> you don't have HTML tables, so maybe it is not for your case.
>
>
> Mike
>
>
> #!/usr/bin/perl -w
> #
> #
> # This program writes the results of the webpage listed in line 17
> # to $outfile.  So basically it converts HTML to text.
> # It works reasonably well with HTML tables.
> #
> #
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use LWP::UserAgent;
> use HTML::FormatText::WithLinks::AndTables;
>
>
> my $page = 'http://www.mflan.com/crime.htm';
>
> my $outfile = 'output.txt';
>
> chdir '/home/mike/Documents/copy' or die "Can't chdir: $!";
>
> open OUT, ">>$outfile" or die "Can't open '$outfile': $!";
>
> my ($sl, $request, $response, $html);
>
> $sl = LWP::UserAgent->new;
>
>
> $sl->proxy('http', ''); # enter proxy if need be, and set it for SOAP too ...
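> $request = HTTP::Request->new('GET', $page);
> $response = $sl->request($request);
> die "Request failed: " . $response->status_line . "\n" unless $response->is_success;
> $html = $response->decoded_content; # body only; as_string would include the headers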
>
> print "Got it into \$html.\n";
>
>
>
> my $text = HTML::FormatText::WithLinks::AndTables->convert($html);
>
>
> print OUT "$text";
>
> print "\nAll done.\n";
>
> close OUT;
>
>
> __END__
>
>
>
>
> On 7/20/22 10:13, William Torrez Corea wrote:
> > The url of the page:
> >
> > https://www.pgr.gob.ni/PDF/2021/GACETA/GACETA_17_08_2021.pdf
> >
> > On 7/20/22, William Torrez Corea <willitc9...@gmail.com> wrote:
> >> There is a page that publishes information about people, but if you
> >> want to find a name you must search for it manually. So, I want to
> >> automate this process with Perl.
> >> --
> >>
> >> With kindest regards, William.
> >>
> >> ⢀⣴⠾⠻⢶⣦⠀
> >> ⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
> >> ⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
> >> ⠈⠳⣄⠀⠀⠀⠀
> >>
> >
>
>

-- 

With kindest regards, William.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀
