wget problems, asp, vars

2002-11-08 Thread myrn
I'm trying to get the following site:
http://overlord.hig.se/schema/schemav8.asp?KI=3DV20BKT=245+KN=Algoritmer+och+datastrukturer+B;

I've tried really hard but failed. Can someone more experienced help me
with this, please?

The site is an ASP site that fetches information from a database
when you feed it those variables. Actually, the last variable
isn't needed.




Re: wget and asp

2002-09-13 Thread Max Bowsher

 Try the source I sent you.

Dominique wrote:
 thank you Max,

np

 Is it different than the one I CVS-ed yesterday? I mean, does it have
 changes in creating filenames? Please note, that I finally compiled it
 and could run it.

No changes... I did run autoconf, so you could go straight to configure (as you
have too new an autoconf version).

 CVS = a lot of breaks during checking out
 What breaks?
 It was freezing during check out - something I never had yet using
 CVS. I had to ctrl-c and restart a few times.

Weird... I had no such problems - but I was only running a 'cvs upd' having
checked it out before.

Something has just occurred to me: by default, wget restricts recursion
to 5 levels. Perhaps that is the problem?
If so, -l0 (no depth limit) will fix it.

If not, could you have another go at describing it?

Max.




Re: wget and asp

2002-09-13 Thread Dominique



No changes... I did run autoconf, so you could go straight to configure (as you
have too new an autoconf version).

it compiled just fine now

Something has just occurred to me: by default, wget restricts recursion
to 5 levels. Perhaps that is the problem?
If so, -l0 (no depth limit) will fix it.
  

Well, I don't think so. I have all the files on my disk, but there is
confusion with names (filenames vs. HTML content vs. Mozilla's
interpretation). The directory I am downloading is 2 levels deep at most anyway.

If not, could you have another go at describing it?
  

1) Some filenames are not consistent (although there are no slashes
anymore): the names on disk, the names in the HTML code and the names
Mozilla shows or wants all differ.

2) The parent script that dynamically generates the example files was not
found (neither by wget nor by hand). I mean the asp file used to generate
the ?filename stuff. This has to do with the site's structure I guess, but I
may be wrong.

I think rather than explaining again the whole big story, it would be 
easier if someone more experienced with wget than me tried to get the 
/html structure of the site I practice hacking on, or at least those 3 
files I described lately. If you want to test wget against it, of 
course, otherwise I will be giving up here.

thank you
Dominique

Max.





Re: wget and asp

2002-09-12 Thread Dominique



The problem is that with a ?x=y, where y contains slashes, wget passes them
unchanged to the OS, causing directories to be created, but fails to adjust
relative links to account for the fact that the page is in a deeper directory
than it should be. The solution is to map / to _ or something.

Max.
  

That was my naive thinking as well... yes, the filename breaks at /

dominique





Re: wget and asp

2002-09-12 Thread Max Bowsher

Max Bowsher wrote:
 The problem is that with a ?x=y, where y contains slashes, wget passes them
 unchanged to the OS, causing directories to be created, but fails to adjust
 relative links to account for the fact that the page is in a deeper directory
 than it should be. The solution is to map / to _ or something.

Thomas Lussnig wrote:
 a) The OS does not automatically create directories; wget has to do this.

Looking back at my last email, I think "Did I really say that?!" That part is,
of course, nonsense. However, the directories are created regardless.

 b) The idea of creating directories even for parameters is not wrong!
 This is only an example (not working), but you can see why directories
 were chosen instead of _
 http://download.com/get.php?filename=/windows/putty.exe
 http://download.com/get.php?filename=/linux/putty.tgz

I don't understand your example, but regardless, wget 1.9-beta from cvs
URL-encodes slashes in a query string.
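
For illustration only, here is a minimal Python sketch of what URL-encoding a slash in a query value means. It is not wget's code; the helper name encode_query_value is made up for this example, and the value is taken from the get.php example quoted above.

```python
from urllib.parse import quote


def encode_query_value(value: str) -> str:
    """Percent-encode a query value, including any slashes.

    quote() leaves "/" untouched by default; safe="" makes it
    escape "/" as %2F as well.
    """
    return quote(value, safe="")


# Hypothetical helper applied to the value from the get.php example.
print(encode_query_value("/windows/putty.exe"))  # %2Fwindows%2Fputty.exe
```

With the slashes escaped this way, the value can no longer be mistaken for a directory path when it ends up in a local filename.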

Dominique: If I understand your problem correctly, then wget 1.9 has solved it.

Max.




Re: wget and asp

2002-09-11 Thread Thomas Lussnig

 To invoke html examples they use calls like (just the first example):

 http://www.w3schools.com/html/tryit.asp?filename=tryhtml_basic

What filename did you expect for this ?
- tryit.asp
- tryit.asp?filename=tryhtml_basic
- tryhtml_basic



 Wget saves a file and a directory with this very name, but it gets 
 stuck at this one:

 
http://www.w3schools.com/html/tryit_edit.asp?filename=tryhtml_basicreferer=http://www.w3schools.com/html/html_examples.asp
 


- tryit_edit.asp
- tryit_edit.asp?filename=tryhtml_basic
- 
tryit_edit.asp?filename=tryhtml_basicreferer=http:__www.w3schools.com_html_html_examples.asp
 


The problem is a general one: how should wget know which parameters are
relevant enough to put in the filename?
What you can sometimes see at first glance is not so easy for a
computer.
And the hardest problem arises when you land on a page whose owner
does not want you to download anything.

Hard to get the pictures:
http://www.flirtface.de/

Extremely hard to get anything:
http://suche.org/chater-treff/

On the last page I think your chance of getting the content with wget
is near 0%.

Cu Thomas




Re: wget and asp

2002-09-11 Thread Dominique

 What filename did you expect for this ?
 - tryit.asp
 - tryit.asp?filename=tryhtml_basic
 - tryhtml_basic


Once again: the location is:

http://www.w3schools.com/html/tryit.asp?filename=tryhtml_basic

It is a frameset which requires frames. One of them is a problem,
because it has special characters.

I was expecting tryit.asp?filename=tryhtml_basic for the frameset. And
the file is downloaded, and is correct. But this is only a frameset. One
of the two frames is tryit_view.asp?filename=tryhtml_basic (which is also
downloaded, and also correct) and the other should be

tryit_edit.asp?filename=tryhtml_basicreferer=http://www.w3schools.com/html/html_examples.asp

and just this one is truncated. I think some regexp or pattern or
explicit list of where_not_to_break_a_string characters would solve the
problem. Or maybe it is already possible, but I don't know how?

thanks a lot!
dominique





 Wget saves a file and a directory with this very name, but it gets 
 stuck at this one:

 
http://www.w3schools.com/html/tryit_edit.asp?filename=tryhtml_basicreferer=http://www.w3schools.com/html/html_examples.asp
 




 - tryit_edit.asp
 - tryit_edit.asp?filename=tryhtml_basic
 - 
 
tryit_edit.asp?filename=tryhtml_basicreferer=http:__www.w3schools.com_html_html_examples.asp
 


 The problem is a general one: how should wget know which parameters are
 relevant enough to put in the filename?
 What you can sometimes see at first glance is not so easy for a
 computer.
 And the hardest problem arises when you land on a page whose owner
 does not want you to download anything.

 Hard to get the pictures:
 http://www.flirtface.de/

 Extremely hard to get anything:
 http://suche.org/chater-treff/

 On the last page I think your chance of getting the content with wget
 is near 0%.

 Cu Thomas






Re: wget and asp

2002-09-11 Thread Max Bowsher

Dominique wrote:

tryit_edit.asp?filename=tryhtml_basicreferer=http://www.w3schools.com/html/html_examples.asp

 and just this one is truncated. I think some regexp or pattern or
 explicit list of where_not_to_break_a_string characters would solve
 the problem. Or maybe it is already possible, but I don't know how?

I think that some URL encoding has not happened somewhere. Whether wget or the
web server is at fault, I don't know, but the solution would be to URL encode
the slashes.

Max.




Re: wget and asp

2002-09-11 Thread Dominique

Is it something I can do myself, or does the code have to be changed?

Domi

I think that some URL encoding has not happened somewhere. Whether wget or the
web server is at fault, I don't know, but the solution would be to URL encode
the slashes.

Max.
  





Re: wget and asp

2002-09-11 Thread Thomas Lussnig



tryit_edit.asp?filename=tryhtml_basicreferer=http://www.w3schools.com/html/html_examples.asp
  

and just this one is truncated. I think some regexp or pattern or
explicit list of where_not_to_break_a_string characters would solve
the problem. Or maybe it is already possible, but I don't know how?



I think that some URL encoding has not happened somewhere. Whether wget or the
web server is at fault, I don't know, but the solution would be to URL encode
the slashes.
  

Why should there be any URL encoding?
/ is a legal character in a URL and in the GET string.
It is used, for example, for Path2Query translation.

The main problem is that wget needs to translate a URL to a filesystem
name.

Filesystem names are PATH and FILE names, and I think wget does it right.

example:

http://my.domain/dyn_page.sql/content_id/1891/session/0815

Server:

File: /dyn_page.sql
Query String /content_id/1891/session/0815

Client:
0. dyn_page.sql/content_id/1891/session/0815 (current, I think)
1. dyn_page.sql_content_id_1891_session_0815
2. 0815

Only the author of the web page could tell you what a good
translation from a URL to a filesystem name is
if there is a query string on the page; otherwise ALL solutions have their
downsides!
The only solution, if you want to make more people happy, is to add several
translation styles and an option to select one.
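
As a rough sketch of such selectable translation styles (not how wget does it), the three client-side names listed above can be derived in Python like this. The function name local_names and the style labels are made up for this example.

```python
from urllib.parse import urlsplit


def local_names(url: str) -> dict:
    """Return three candidate local names for a URL, keyed by style."""
    # Path component without the leading slash, e.g.
    # "dyn_page.sql/content_id/1891/session/0815".
    path = urlsplit(url).path.lstrip("/")
    return {
        "directories": path,                       # style 0: keep the slashes
        "flattened": path.replace("/", "_"),       # style 1: map / to _
        "last_segment": path.rsplit("/", 1)[-1],   # style 2: last component only
    }


names = local_names("http://my.domain/dyn_page.sql/content_id/1891/session/0815")
for style, name in names.items():
    print(style, name)
```

None of the three is right for every site, which is why the style would have to be a user-selectable option.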

This is the same problem as with JavaScript URLs. But here a solution
could, in theory, be defined for the URL naming:

- set up a web server
- wildcard domain *.mirror
- error_handler page in the document_root
- table that contains the original_url to filename mapping, filled by wget
- the error handler can strip off the .mirror and then return the
proper content

POSITIVE: works for all URL styles
NEGATIVE: new concept, more work, does not meet the original intention.
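
The lookup step of that idea could look roughly like the Python sketch below. Everything in it is hypothetical: the table contents, the stored filename and the function names are invented for illustration, and the web server and error handler themselves are left out.

```python
from typing import Optional

MIRROR_SUFFIX = ".mirror"


def original_url(host: str, path_and_query: str) -> str:
    """Rebuild the original URL from a request to the *.mirror domain."""
    if host.endswith(MIRROR_SUFFIX):
        host = host[: -len(MIRROR_SUFFIX)]  # strip off ".mirror"
    return f"http://{host}{path_and_query}"


def lookup(table: dict, host: str, path_and_query: str) -> Optional[str]:
    """Return the stored filename for a mirrored request, or None if unknown."""
    return table.get(original_url(host, path_and_query))


# Hypothetical mapping, as it might be filled in during the download.
table = {
    "http://my.domain/dyn_page.sql/content_id/1891/session/0815": "files/000001.html",
}

print(lookup(table, "my.domain.mirror", "/dyn_page.sql/content_id/1891/session/0815"))
```

Because the local filename is an arbitrary key in the table, the original URL never has to be squeezed into a legal filesystem name.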





Re: wget and asp

2002-09-11 Thread Max Bowsher

Thomas Lussnig wrote:
 Why should there be any URL encoding?
 / is a legal character in a URL and in the GET string.
 It is used, for example, for Path2Query translation.

 The main problem is that wget needs to translate a URL to a
 filesystem name.

Yes, you are right, I wasn't thinking clearly.

 Filesystem names are PATH and FILE names, and I think wget does it
 right.

 example:

 http://my.domain/dyn_page.sql/content_id/1891/session/0815

 Server:

 File: /dyn_page.sql
 Query String /content_id/1891/session/0815

 Client:
 0. dyn_page.sql/content_id/1891/session/0815 (current, I think)
 1. dyn_page.sql_content_id_1891_session_0815
 2. 0815

 Only the author of the web page could tell you what a good
 translation from a URL to a filesystem name is
 if there is a query string on the page; otherwise ALL solutions have their
 downsides!

??

The problem is that with a ?x=y, where y contains slashes, wget passes them
unchanged to the OS, causing directories to be created, but fails to adjust
relative links to account for the fact that the page is in a deeper directory
than it should be. The solution is to map / to _ or something.
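
A small Python sketch of why the relative links break, using made-up names (get.asp?x=a/b, style.css, and the helper functions are all hypothetical, not taken from wget):

```python
import posixpath


def flatten(filename: str) -> str:
    """Map / to _ so the name cannot create subdirectories."""
    return filename.replace("/", "_")


def resolve(page_dir: str, filename: str, link: str) -> str:
    """Resolve a relative link against the directory the file is really saved in."""
    saved_path = posixpath.join(page_dir, filename)
    saved_dir = posixpath.dirname(saved_path)
    return posixpath.normpath(posixpath.join(saved_dir, link))


# The slash in the query value pushes the page one directory deeper,
# so a link that should point into html/ resolves to the wrong place.
print(resolve("html", "get.asp?x=a/b", "style.css"))           # html/get.asp?x=a/style.css
print(resolve("html", flatten("get.asp?x=a/b"), "style.css"))  # html/style.css
```

With the slash mapped to an underscore, the page stays in html/ and the unmodified relative link resolves as the site intended.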

Max.




Re: wget and asp

2002-09-10 Thread Max Bowsher

You don't give a whole lot of information. It's kind of impossible to help when
you don't know what the problem is.

Posting the URL of the problem site would be a good idea.

Max.

Dominique wrote:
 Is it possible at all?

 dominique

 Dominique wrote:

 Hi,

 I have a problem trying to wget a site for off-line usage. The site
 is made in ASP and uses lots of stuff like:

 xxx.asp?filename=yyy

 When I download a sample ASP page it seems to be almost empty and
 has no links!!! I don't understand...

 I tried mirroring, html extensions and all I could find relevant in
 the man page. Would someone please help me solve this?

 thank you
 dominique




Re: wget and asp

2002-09-10 Thread Dominique



Posting the URL of the problem site would be a good idea.
  


Well, I have quite a few. Let's start with this:

http://www.w3schools.com/html/default.asp

or just anything from such a page. I hacked around for a while with
no apparent success.

thanks
dominique

Max.

Dominique wrote:
  

Is it possible at all?

dominique

Dominique wrote:



Hi,

I have a problem trying to wget a site for off-line usage. The site
is made in ASP and uses lots of stuff like:

xxx.asp?filename=yyy

When I download a sample ASP page it seems to be almost empty and
has no links!!! I don't understand...

I tried mirroring, html extensions and all I could find relevant in
the man page. Would someone please help me solve this?

thank you
dominique
  


-- 
--
Dominik Szczerba, Dr.
CO-ME, D-ITET, ETZ F85
ETH Zentrum, Gloriastr. 35
CH-8092 Zurich
http://www.vision.ee.ethz.ch/~domi
--
ii swear i never use vi^[:wqZZ
--





Re: wget and asp

2002-09-10 Thread Thomas Lussnig

Dominique wrote:



 Posting the URL of the problem site would be a good idea.
  


 well, I have quite a few. let's start with this:

 http://www.w3schools.com/html/default.asp

 or just anything from such a page. I hacked around for a while
 with no apparent success.

Try this and it works!
wget -U "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)" \
  http://www.w3schools.com/html/default.asp

The problem is that this site blocks wget's default User-Agent.
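
The same trick outside wget, as a minimal Python sketch using the standard urllib.request module. The helper name build_request is made up for this example, and the actual download call is left out so the sketch runs without network access.

```python
from urllib.request import Request

# Same browser-like identification string as in the wget command.
USER_AGENT = "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url: str) -> Request:
    """Build a request that announces itself with a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": USER_AGENT})


req = build_request("http://www.w3schools.com/html/default.asp")
# urllib.request.urlopen(req) would perform the download; it is omitted here.
print(req.get_header("User-agent"))
```

Whether a given server accepts this particular string is an assumption; the point is only that the header, not the tool, is what the server sees.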

Cu Thomas Lußnig





Re: wget and asp

2002-09-10 Thread Dominique

Yes! It works!! I just missed the -U option.

thanks a lot!

dominique

Thomas Lussnig wrote:


 Try this and it works!
 wget -U "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)" \
   http://www.w3schools.com/html/default.asp

 The problem is that this site blocks wget's default User-Agent.

 Cu Thomas Lußnig





wget and asp

2002-09-07 Thread Dominique

Hi,

I have a problem trying to wget a site for off-line usage. The site is
made in ASP and uses lots of stuff like:

xxx.asp?filename=yyy

When I download a sample ASP page it seems to be almost empty and has
no links!!! I don't understand...

I tried mirroring, html extensions and all I could find relevant in the
man page. Would someone please help me solve this?

thank you
dominique