pdf to spreadsheet advice
Hello,

I periodically receive PDFs containing a table of member names, addresses, etc., in a badly formatted, hard-to-read layout. I would like to open the PDF, extract the data, do a little reorganizing, and write it to an Excel spreadsheet. Perl seems like the best way to do this. I have searched CPAN and seen that there are a bunch of PDF- and spreadsheet-related modules. I am looking for advice about which modules would be best for extracting the data from the PDF and writing it to Excel. I will probably do this on OS X, though I can use Windows if I need to.

Thanks
--
Matt

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
Re: Help with WWW::Mechanize
Mathew Snyder wrote:
> Tom Phoenix wrote:
>> On 11/28/06, Mathew Snyder [EMAIL PROTECTED] wrote:
>>> I have a form I'm trying to fill out. One of the fields, despite being named in the HTML source, keeps erroring out on me.
>>>
>>>     <select name="ValueOfActor">
>>>
>>>     fields => {
>>>         ValueOfStatus => $status,
>>>         Value0fActor  => $username,
>>>     },
>>
>> Maybe you should use a typeface that shows the difference between O and 0 more clearly. As another good practice, you can use copy-and-paste when you need to be sure to get the exact spelling.
>>
>> Hope this helps!
>>
>> --Tom Phoenix
>> Stonehenge Perl Training
>
> Upon further review, this wasn't it. I'm not sure why the 'O' in either of the fields above is different, but I made sure they are the same in the code.
>
> It seems like it isn't getting past the login screen for the page I'm trying to access. There's only one form on it, with two fields, one called 'user' and the other 'pass', and a submit button. If I just run this
>
>     #!/usr/bin/perl
>     use warnings;
>     use strict;
>     use WWW::Mechanize;
>     use HTML::TokeParser;
>
>     my $username = "msnyder";
>     print "Enter your password: ";
>     chomp(my $password = <STDIN>);
>     my $status = "open";
>     my $url = 'https://rt.xxx.xxx.com/Search/Build.html';
>     my $textRegex = 'Tickets';
>
>     my $agent = WWW::Mechanize->new();
>     $agent->get('https://rt.xxx.xxx.com/');
>     $agent->form(1);
>     $agent->field('user', $username);
>     $agent->field('pass', $password);
>     $agent->click_button(value => 'Login');
>     sleep(10);
>
> it returns to the prompt with no errors or output. However, as soon as I add a line to follow a link based on either a URL or a regex, I get an error saying the sought-after item wasn't found on the page.
>
> It should be noted that the login page has the same URL as the page loaded after logging in. I don't know if that matters, though.
>
> I looked at the test subroutines but need to sort them out to figure out how to use them to tell me what the issue might be.
>
> Mathew

Mathew,

My reading of the WWW::Mechanize documentation is that $agent->click_button returns an HTTP::Response object. I don't think that would print anything on error. In this situation I would check the HTTP::Response and possibly print out $agent->content to verify that I got the page I expected.

Hope that helps
--
MattJ
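A minimal sketch of the check suggested above, for anyone who wants something to paste in. It assumes $agent is a WWW::Mechanize object and $response is whatever click_button() returned; check_login is a made-up helper name, and looking for the 'pass' field is only a guess at how to recognise that the login form was served again.

```perl
use strict;
use warnings;

# Summarise what came back from a form submission.
# check_login is a hypothetical helper, not part of WWW::Mechanize.
sub check_login {
    my ($agent, $response) = @_;

    # HTTP::Response: is_success() is true for 2xx codes,
    # status_line() returns text such as "200 OK".
    return 'Request failed: ' . $response->status_line
        unless $response->is_success;

    # A successful status does not prove the login worked: a rejected
    # login often just serves the login form again. Assumption: the
    # form can be recognised by its 'pass' field.
    return 'Still on the login form'
        if $agent->content =~ /name=["']?pass\b/i;

    return 'Logged in: ' . $response->status_line;
}

# Usage with the agent from the script in the post:
#   my $response = $agent->click_button(value => 'Login');
#   print check_login($agent, $response), "\n";
#   print $agent->content;   # dump the page to see what was really served
```

Since the login page and the page after login share a URL, printing $agent->content is probably the quickest way to tell the two apart.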
Re: How to manipulate environment variables in parent process?
siegfried wrote:
>>> I think the best I could hope for would be to write a perl script that generated a bat file and then I manually execute the bat file. I don't think there is any way to automate the execution of the bat file.
>>
>> I'm sure that there is; if you can't put it into its own bat file, you could have Perl itself execute it via the system() command. Since new processes (such as those run with system()) inherit the environment,
>
> Yeah: that is precisely the problem. The child inherits from the parent. Can I make the child manipulate the environment table in the parent? I don't think so. Please tell me I'm wrong.
>
>> it's easy to set up %ENV however you'd like.
>>
>> Hope this helps!
>>
>> --Tom Phoenix
>> Stonehenge Perl Training
>
> Tom,
>
> Please elaborate. I need to have an interactive command shell with the symbols set up. Any time I run a perl program that uses backquotes or system, the symbols defined by perl will only be good for the duration of the perl program, and as soon as perl exits I'll have a command prompt with none of the new symbols defined, correct? I believe this is true for all *nix and Windows shells.
>
> Thanks,
> Siegfried

Siegfried, as far as I know a child cannot modify its parent's environment. Tom is smarter than I am though, so he may know something I don't.

You can get the time and date to use in a file or directory name in Windows batch. http://www.robvanderwoude.com/index.html has some examples. Of course, Windows batch is an abysmal scripting language, so if you want to do anything non-trivial it is worth doing, or redoing, in Perl.

--
MattJ
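A small demonstration of the inheritance being discussed, as a sketch rather than a fix. DEMO_SYMBOL is a made-up variable name, and the child is simply a second perl started through the list form of a piped open (which should work on *nix; on Windows it needs a reasonably recent Perl).

```perl
use strict;
use warnings;

# DEMO_SYMBOL is a hypothetical name used only for this demonstration.
$ENV{DEMO_SYMBOL} = 'set-by-parent';

# The child prints the value it inherited, then changes its own copy.
my $child_code =
    'print $ENV{DEMO_SYMBOL}; $ENV{DEMO_SYMBOL} = q{changed-by-child};';

# $^X is the path of the running perl; the list form avoids any shell quoting.
open(my $child, '-|', $^X, '-e', $child_code)
    or die "Cannot start child: $!";
my $seen_by_child = do { local $/; <$child> };
close($child);

print "Child saw:        $seen_by_child\n";
print "Parent still has: $ENV{DEMO_SYMBOL}\n";
```

Both lines report set-by-parent: the child receives a copy of the parent's %ENV, and its change disappears when it exits. The same holds one level up, between the perl program and the shell that started it, which is why a generated bat file has to be run by the shell itself.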
Re: How to use Perl for API testing
You might want to consider Test::WWW::Mechanize
http://search.cpan.org/author/PETDANCE/Test-WWW-Mechanize-1.12/Mechanize.pm
or WWW-Mechanize
http://search.cpan.org/~petdance/WWW-Mechanize-1.18/lib/WWW/Mechanize.pm

--
MattJ

On Jul 9, 2006, at 10:13 PM, Suja Emmanuel wrote:
> Hi,
> I want to use PERL for API testing, i.e., I want to call different URLs through the browser. How much is possin
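As a rough sketch of what URL testing with Test::WWW::Mechanize might look like: get_ok() and content_contains() are methods of that module, each reporting one test result and returning true on success. The run_checks helper, the URL, and the expected text are all made up for illustration.

```perl
use strict;
use warnings;

# Fetch each URL and check that the page contains the expected text.
# run_checks and the %expected layout (URL => text) are hypothetical.
# Returns the number of URLs that failed either step.
sub run_checks {
    my ($mech, %expected) = @_;
    my $failures = 0;

    for my $url (sort keys %expected) {
        # Only look at the content when the request itself succeeded.
        if ($mech->get_ok($url, "GET $url")) {
            $mech->content_contains($expected{$url},
                "$url mentions '$expected{$url}'")
                or $failures++;
        }
        else {
            $failures++;
        }
    }
    return $failures;
}

# Usage (placeholder URL and text):
#   use Test::More;
#   use Test::WWW::Mechanize;
#   my $mech = Test::WWW::Mechanize->new;
#   run_checks($mech, 'http://localhost:8080/status' => 'OK');
#   done_testing();
```

Note that this drives the URLs from Perl directly rather than through a real browser, which may or may not be what the original question needs.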
Re: best way of getting a web document
Mumia W. wrote:
> Dan wrote:
>> LWP or HTTP::Client? i've used both and run across..some problems. [...] i need the most reliable to fetch the feed, and pass me the body of the page so i can pass it to an xml parser of sort. unless there's something else which can already do that? [...]
>
> Hi Dan. I've played with LWP before, and it worked okay.
>
> Another option is to use the lynx web browser to fetch the page source. As far as I know, lynx cannot parse XML, so you'd have to use a separate XML parser after fetching the page with lynx.
>
> More options for fetching pages are curl (the module) and curl (the program).
>
> Foremost among the XML parsers is XML::Parser; however, CPAN has many XML parsing modules.

Hi,

I use WWW::Mechanize
http://search.cpan.org/~petdance/WWW-Mechanize-1.18/lib/WWW/Mechanize.pm
to get pages. I do some simple XML validation and manipulation in some cases.

--
Matt Johnson
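For the "fetch the feed and hand the body to a parser" part, something along these lines might do. It is only a sketch: fetch_body is a made-up helper, the feed URL is a placeholder, and it assumes the agent was created with autocheck turned off so that a failed get() returns a response instead of dying on its own.

```perl
use strict;
use warnings;

# Fetch a URL and return the page body, or die with the HTTP status.
# fetch_body is a hypothetical helper; $agent is expected to be a
# WWW::Mechanize object created with autocheck => 0.
sub fetch_body {
    my ($agent, $url) = @_;

    my $response = $agent->get($url);
    die "Could not fetch $url: " . $response->status_line . "\n"
        unless $response->is_success;

    return $agent->content;
}

# Usage (placeholder URL):
#   use WWW::Mechanize;
#   use XML::Parser;
#   my $agent = WWW::Mechanize->new(autocheck => 0);
#   my $xml   = fetch_body($agent, 'http://example.com/feed.xml');
#   my $tree  = XML::Parser->new(Style => 'Tree')->parse($xml);
```

XML::Parser's parse() dies on malformed XML, so wrapping that last call in an eval block gives a cheap well-formedness check.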