On Fri, Jul 29, 2022 at 11:38 AM hw <h...@adminart.net> wrote:

> On Sat, 2022-07-23 at 13:03 -0600, William Torrez Corea wrote:
> > My goal: I want to create something similar to a phone directory. The
> > page hosts a large number of documents in PDF format, so I want to
> > combine them and be able to filter by first name, last name, and
> > location. If I do this manually, I have to download and open each
> > document and search for the person's name by hand.
> >
> > The documents uploaded to this page differ in year and date, and they
> > contain different information.
> >
>
> Perhaps you can automate the downloading and then use tools that merge
> PDF files, like pdfunite, to turn them into a single PDF.  There's also
> pdf2txt that can extract text from a PDF --- of course, that would only
> work if there were a way to detect which information is what.
>
> Since we do not have all the PDF files which apparently are all
> different, we can't tell how it might be possible to detect which
> information is what.
>
> I wouldn't even bother with this because PDF is awful to get
> information from automatically.  Whoever makes these PDF files needs to
> provide the information in such a way that it is usable.  Since you
> need to download all the files anyway to search for a name, you're
> better off merging them into a single file and searching that in your
> favourite PDF viewer.
>
>
> >
> > On Wed, Jul 20, 2022 at 7:04 PM Mike <te...@mflan.com> wrote:
> >
> > >
> > > I'm going to be traveling, so will not be able to help much
> > > in the next 2 days.
> > >
> > > That is a PDF file you supplied.  Is it fair to say you want to
> > > search for all the names listed in a text file and print out which
> > > file contains each name, and that in some cases a name will not be
> > > in any of the files?  Is that the goal?
> > >
> > > Define your goal and we will help you.
> > >
> > >
> > > The file below is a bit old, but maybe it works for your
> > > PDF files.  I have not tested it on your url.  I gather
> > > you don't have HTML tables, so maybe it is not for your case.
> > >
> > >
> > > Mike
> > >
> > >
> > > #!/usr/bin/perl
> > > #
> > > # This program fetches the web page named in $page and appends a
> > > # plain-text rendering of it to $outfile.  So basically it converts
> > > # HTML to text.  It works reasonably well with HTML tables.
> > > #
> > >
> > > use strict;
> > > use warnings;
> > > use LWP::UserAgent;
> > > use HTML::FormatText::WithLinks::AndTables;
> > >
> > >
> > > my $page = 'http://www.mflan.com/crime.htm';
> > >
> > > my $outfile = 'output.txt';
> > >
> > > chdir '/home/mike/Documents/copy'
> > >     or die "Can't chdir: $!";
> > >
> > > # Three-argument open, append mode.
> > > open OUT, '>>', $outfile or die "Can't open '$outfile': $!";
> > >
> > > my ($sl, $request, $response, $html);
> > >
> > > $sl = LWP::UserAgent->new;
> > >
> > >
> > > # Enter a proxy here if needed (and set it for SOAP too).
> > > $sl->proxy('http', '');
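> > > $request = HTTP::Request->new('GET', $page);
> > > $response = $sl->request($request);
> > > die "Can't fetch $page: ", $response->status_line, "\n"
> > >     unless $response->is_success;
> > > # decoded_content is the body only; as_string would include the headers.
> > > $html = $response->decoded_content;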
> > >
> > > print "Got it into \$html.\n";
> > >
> > >
> > >
> > > my $text = HTML::FormatText::WithLinks::AndTables->convert($html);
> > >
> > >
> > > print OUT "$text";
> > >
> > > print "\nAll done.\n";
> > >
> > > close OUT;
> > >
> > >
> > > __END__
> > >
> > >
> > >
> > >
> > > On 7/20/22 10:13, William Torrez Corea wrote:
> > > > The url of the page:
> > > >
> > > > https://www.pgr.gob.ni/PDF/2021/GACETA/GACETA_17_08_2021.pdf
> > > >
> > > > On 7/20/22, William Torrez Corea <willitc9...@gmail.com> wrote:
> > > > > > There is a page where information about people is posted, but
> > > > > > to find a name you have to search for it manually. So I want
> > > > > > to automate this process with Perl.
> > > > > --
> > > > >
> > > > > With kindest regards, William.
> > > > >
> > > > > ⢀⣴⠾⠻⢶⣦⠀
> > > > > ⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
> > > > > ⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
> > > > > ⠈⠳⣄⠀⠀⠀⠀
> > > > >
> > > >
> > >
> > >
> >
>
>
> --
> To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> For additional commands, e-mail: beginners-h...@perl.org
> http://learn.perl.org/
>
>
>
I want to do this with code; I don't want to use any tools.
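
A minimal sketch of the searching step in plain Perl, using only core
features. It assumes the text of each PDF has already been extracted by
some other means (that step is not shown here). The helper name
find_names, the file names, and the sample text are all made up for
illustration; they do not come from the real documents.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given a hash of (file name => extracted text) and a
# list of names, return a hash of (name => sorted list of matching files).
# Matching is case-insensitive and tolerates any whitespace between words.
sub find_names {
    my ($text_by_file, $names) = @_;
    my %found;
    for my $name (@$names) {
        # Build a pattern such as Juan\s+Perez from "Juan Perez".
        my $pattern = join '\s+', map { quotemeta($_) } split ' ', $name;
        $found{$name} = [
            grep { $text_by_file->{$_} =~ /$pattern/i }
            sort keys %$text_by_file
        ];
    }
    return \%found;
}

# Made-up sample data standing in for text already extracted from the PDFs.
my %text_by_file = (
    'GACETA_A.pdf' => "Se notifica a Juan  Perez, Managua.",
    'GACETA_B.pdf' => "Maria Lopez\ny JUAN PEREZ, Leon.",
);

my $found = find_names(\%text_by_file,
                       ['Juan Perez', 'Maria Lopez', 'Ana Ruiz']);

# Report, for each name, the files that mention it (or "not found").
for my $name (sort keys %$found) {
    my @files = @{ $found->{$name} };
    print "$name: ", (@files ? join(', ', @files) : 'not found'), "\n";
}
```

Keeping the search separate from the extraction means the same function
could be reused whichever way the text is obtained from the PDF files.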
-- 

With kindest regards, William.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀
