On Thursday, November 21, 2019 8:34:09 AM CET Olivier wrote:
> hw <h...@gc-24.de> wrote:
> > On Wednesday, November 20, 2019 3:29:00 AM CET Olivier wrote:
> > > hw <h...@gc-24.de> writes:
> > > > Hi,
> > > >
> > > > how can I read data from a web site which is using multiple frames
> > > > and some javascript?
> > >
> > > Provided that the web site does not change too often and that they don't
> > > implement stupid "security" features, this should not be too complicated.
> > >
> > > Each frame is a web page, with its own URL. So you can examine the source
> > > code of the web page to find the URL of the first frame and second frame.
> > >
> > > Then you can use any Perl library you like to load that URL and parse it
> > > for what you are looking for.
> > >
> > > Then use that data to load the second frame with a URL modified to
> > > include the type of data you have selected.
> > >
> > > Being frames makes it much easier, you should not have to care about the
> > > javascript too much.
> >
> > The web site seems to be created by a program running on the server, i.e.
> > there is not really a web site. When I access it with lynx or with
> > WWW::Mechanize, the answer from the server says that neither frames nor
> > javascript is supported, and it is not possible to log in.
>
> Of course lynx cannot process frames. But that is not what I meant to
> tell you.
>
> Open the web page with your browser, Firefox, Chromium, whatever, then
> press CTRL-U to display the source. In that source, you should see some
> tags, <frame or maybe <iframe, which contain a URL.

When I do that, the login page is being displayed, and nothing happens when I
press Ctrl-U. Maybe it's because the page is already made with frames?

When I look at the source of the frame that contains the fields to enter a
username and a password, I can see that there are inputs for those, like this:

<INPUT TYPE="TEXT" NAME="usrlogn" VALUE="" MAXLENGTH="15" SIZE="8">

The only URL is probably the one displayed in the address bar of the web
browser when looking at the source of the frame. That URL seems to point at
the program running on the web server, with parameters in the URL which have
been created by the program. One of the parameters seems to be a session ID.

Instead of viewing the source of the frame, I can open the frame in another
tab. How does that help me? There is no way to automatically get the URL for
the frame because the parameters are being created by the program on the web
server, and they are only valid for a short time.

> Copy that URL and try to paste it in a separate window of your web browser.
> You should see the list of topics you can select from. In fact it
> should display the contents of the 1st frame.

Well, yes, I can see the source of the frame that has the select list. That
doesn't help me either because to get the data I want, I need to select
entries from the select list. Selecting such an entry results in another
frame being updated; that frame shows a table.

I can get the URL of that frame from the frame info of the web browser,
download the frame, convert its table into a CSV and put the data into a
database --- but I cannot get the URL of the frame other than by copying it
manually from the frame info of the web browser.

> If it does not, you are in a not too good shape.
>
> If it works, go back to the source code and locate the second <frame
> tag, find the URL, copy, new window, paste.
>
> The concept is to access the contents of the frames directly, without
> accessing the main page.
>
> Best regards,
>
> Olivier
>
> > Can WWW::Mechanize somehow trick the server into assuming that frames and
> > javascript are supported by the client?

Like I said, there are no frames to do anything with when the web site is
being accessed with WWW::Mechanize.

I can only see that when I select an entry from the select list, the web
browser sends a POST request for a subdocument and then right away makes a GET
request for a style sheet. Unfortunately, the browser doesn't tell me what the
POST request looks like. It should have something to do with what is selected
from the list ...
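
A minimal sketch of how the frame URLs and form fields might be read
automatically, assuming the server bases its "no frames, no javascript" answer
on the User-Agent header and that the frame tags are present in the HTML it
returns. Neither assumption is confirmed by this thread. The helper names
new_browser and frame_urls and the script name frames.pl are made up for
illustration, and nothing here has been tried against the site in question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Build a client that identifies itself like a desktop browser.
# Assumption: the server decides "no frames/javascript" from the User-Agent.
sub new_browser {
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->agent_alias('Windows Mozilla');
    return $mech;
}

# WWW::Mechanize treats <frame> and <iframe> tags as links, so the frame
# URLs (including a server-generated session ID) can be read from the
# page that was just fetched instead of being copied by hand.
sub frame_urls {
    my ($mech) = @_;
    return map { $_->url_abs->as_string }
        $mech->find_all_links( tag_regex => qr/^i?frame$/ );
}

# Usage: perl frames.pl <URL of the frameset page>
if ( my $start_url = shift @ARGV ) {
    my $mech = new_browser();
    $mech->get($start_url);

    for my $url ( frame_urls($mech) ) {
        print "frame: $url\n";
        my $frame = $mech->clone;    # the clone shares the cookie jar
        $frame->get($url);
        $frame->dump_forms;          # prints each form's method, action and fields
    }
}
```

The dump_forms output lists the method, action and field names of every form
in a frame, which should hint at what the POST for the select list contains.
The network panel of the browser's developer tools normally shows the actual
request as well. If the frames are only created by javascript after the page
loads, this approach will likely not see them.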