Re: E-Mail distribution
Hi Philipp!

http://wget.sunsite.dk/#mailinglists says you (= you personally) have to write an email to [EMAIL PROTECTED] to become unsubscribed. For this, you must
a) know which list you are subscribed to
b) know which email address you are subscribed with

CU
Jens

PS: I am getting NO spam or viruses via the wget mailing list; I don't know what is happening on your machine.

> Hello WGET Team,
>
> I'm on your e-mail distribution list.
> Please delete me from this list, because I've got so many mails
> with viruses that my company is going crazy.
> Please let me know about the deletion.
> --
> Let your thoughts run free... e.g. via FreeSMS
> GMX offers up to 100 FreeSMS/month: http://www.gmx.net/de/go/mail
Re: how to follow incorrect links?
Hi Tomasz!

> There are some websites with backslashes instead of slashes in links.
> For instance :
> instead of :
> Internet Explorer can "repair" such addresses.

My own assumption is: it repairs them because Microsoft introduced that #censored# way of writing HTML. Anyway, this will not help you, I know. I think you should email the webmaster and tell him/her about the errors.

> How to make wget to follow such addresses?

I think it is impossible directly. I can think of one workaround:
Start
wget -nc -r -l0 -p URL
After it finishes, replace all "\" with "/" in the downloaded htm(l) files. This will make the HTML files correct. After that, start
wget -nc -r -l0 -p URL
again. wget will now parse the downloaded and corrected HTML files instead of the broken files on the net. Continue this procedure until wget does not download any more files. I do not know how handy you are in your OS, but this should be doable with one or two small batch files. Maybe one of the pros has a better idea. :)

CU
Jens (just another user)
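The download/fix/re-download loop described above can be sketched as a small shell script (the URL is a made-up placeholder; the sed-based fix assumes GNU sed and that all saved pages match *.htm*):

```shell
#!/bin/sh
# Sketch of the loop described above. URL is a hypothetical placeholder.
URL="http://example.com/"

# Rewrite every backslash to a forward slash in all saved HTML files,
# so wget can parse the corrected links on the next pass.
fix_links() {
    find . -name '*.htm*' -exec sed -i 's|\\|/|g' {} +
}

# Repeat until wget stops fetching new files (not run here):
# wget -nc -r -l0 -p "$URL"
# fix_links
# wget -nc -r -l0 -p "$URL"
```

With -nc, each pass only fetches files that are not on disk yet, so re-running the pair of commands converges once no new links appear.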
Re: Back after a while
Dear Hrvoje,

I was surprised to find Gisle's reply - your original post did not get through to me! For all those wgeteers with similar mail problems:
http://www.mail-archive.com/wget%40sunsite.dk/
http://www.mail-archive.com/wget%40sunsite.dk/msg07093.html

> > For the last several months I've been completely absent from Wget
> > development, and from the net in general. Here is why, and the story
> > is not for the faint of heart.

Well, your story really shocked and touched me. It's one of those things we know (somewhere in the back of our brains) can happen, but we tend to think they will only happen to a nameless and faceless group of "the others". Well, your story certainly has opened my eyes... I'm very glad you are recovering and wish you all the best for the future and a fast and complete recovery! I don't know about the others, but I'd like to read about your progress whenever you feel like sharing. I'm squeezing my thumbs for you! (a German saying for wishing luck)

Jens
Re: -i problem: Invalid URL ÂÂh: Unsupported scheme
Hi Mike!

> on the command line, but if I copy this to, e.g., test.txt and try
> wget -i test1.txt

Well, make sure that
a) you know whether you have test.txt or test1.txt
b) you only have URLs in your txt file, like http://... - no options
c) you save the txt file with a pure ASCII editor like Notepad - not Wordpad (it works with Wordpad, but Notepad is safer)
d) you use -i and not -I (as you wrote in your first line - wget is case-sensitive)

> Does anyone else have this problem?

At least not me.

CU
Jens (just another user)
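Point b) and c) together amount to: one plain-ASCII URL per line, nothing else. A minimal sketch that builds such a list without any editor encoding issues (the file name and URLs are made-up examples):

```shell
#!/bin/sh
# Build a plain-ASCII URL list for wget -i.
# One URL per line, no options, no BOM, no word-processor formatting.
printf '%s\n' \
  'http://example.com/a.html' \
  'http://example.com/b.html' > urls.txt

# Then run (not executed here):  wget -i urls.txt
# Note the lowercase -i; uppercase -I is include-directories.
```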
Re: Images in a CSS file
Hi Deryck!

As far as I know, wget cannot parse CSS code (nor JavaScript code). It has been requested often, but so far no one has tackled this (probably rather huge) task.

CU
Jens (just another user)

> Hello,
>
> I can make wget copy the necessary CSS files referenced from a webpage,
> but is it possible to make it extract any images referenced from
> within the CSS file?
>
> Thanks
Re: Bug (wget 1.8.2): Wget downloads files rejected with -R.
Hi Jason!

If I understood you correctly, this quote from the manual should help you:
***
Note that these two options [accept and reject based on filenames] do not affect the downloading of HTML files; Wget must load all the HTMLs to know where to go at all--recursive retrieval would make no sense otherwise.
***
If you are seeing wget behaviour different from this, please
a) update your wget and
b) provide more details on where/how it happens.

CU & good luck!
Jens (just another user)

> When the -R option is specified to reject files by name in recursive mode,
> wget downloads them anyway, then deletes them after downloading. This is a
> problem when you are trying to be picky about the files you are downloading
> to save bandwidth. Since wget appears to know the name of the file it is
> downloading before it is downloaded (even if the specified URL is redirected
> to a different filename), it should not bother downloading the file
> at all if it is going to delete it immediately after downloading it.
>
> - Jason Cipriani
Re: get link Internal Server Error
For me this link does NOT work in
IE 6.0
latest Mozilla
latest Opera

So I tested a bit further. If you go to the site and reach
http://www.interwetten.com/webclient/start.html
and then use the URL you provided, it works. A quick check for stored cookies revealed that two cookies are stored. So you have to use wget with cookies. For info on how to do that, see the manual.

CU
Jens

> hi all:
> Some links that open fine in IE go wrong when downloaded with wget.
> I can't find a way to solve it; I think it may be a bug. Example link:
>
> http://www.interwetten.com/webclient/betting/offer.aspx?type=1&kindofsportid=10&L=EN
>
> This link opens fine in IE, but with wget I get this error:
> Connecting to www.interwetten.com[213.185.178.21]:80... connected.
> HTTP request sent, awaiting response... 500 Internal Server Error
> 01:02:27 ERROR 500: Internal Server Error.
>
> henryluo
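The two-step cookie dance might look like the sketch below: first fetch the start page and save the cookies it sets, then reuse them for the real URL. (The URLs are the ones from this thread; --keep-session-cookies needs a reasonably recent wget - if your version lacks the flag, plain --save-cookies may miss session-only cookies.)

```shell
#!/bin/sh
# Sketch: collect cookies from the start page, then reuse them.
START_URL='http://www.interwetten.com/webclient/start.html'
OFFER_URL='http://www.interwetten.com/webclient/betting/offer.aspx?type=1&kindofsportid=10&L=EN'

step1="wget --save-cookies cookies.txt --keep-session-cookies $START_URL"
step2="wget --load-cookies cookies.txt $OFFER_URL"

# Not executed here -- run the two commands in order:
# eval "$step1" && eval "$step2"
```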
wget 1.9 for windows: no debug support?
Hi!

While I was testing the new wget 1.9 binary for Windows from
http://space.tin.it/computer/hherold/
I noticed that it was slow if a URL specified within -i list.txt did not exist: it would take a long time until wget tried the next URL listed. Well, to find out what was happening, I specified -d for the debug output. The message was:
debug support not compiled in
and wget would continue with normal downloading. Is this an oversight or does it serve a purpose?

CU
Jens
Re: -A acceptlist (not working right)
Hi Samuel!

> You The Man!

Pssst! Don't tell everyone! :D

> > Then I did
> > wget -nc -x -r -l0 -p -np -t10 -k -nv -A.gif,.htm,.html http://URL
> This also worked, then I began trying to figure out what the
> hell was wrong. I added the .cfm to the list, got an
> empty folder. Tried .shtml & .phtml, got the same thing.
> Apparently it does not like those in the accept list.
> Try adding those, see what you get.

Hm, I don't have .cfm/.shtml files handy at the moment. So I just added the extensions to the list and it worked the same as before.

> Then, I tried putting html at the end of my list, putting the
> other stuff in front of it, voila. Strange, very strange.
> Also, your -R example worked great, but if I add a .mov to
> the end of the list, it nullifies all the other reject commands.
> If I move the .mov to the front, it works?

Seems to make no difference for me. Maybe this is a bug that only shows up on your OS? But I cannot imagine how this is possible.

> Questions:
> How can I direct the files into another directory other than
> /user/

Try -Pdir1/dir2 or (works at least in Windows) -P../dir/dir2
../ goes up one level. Currently -P does not allow a change of drive, but I think the coders are working on that (right?)

> How can I just blow it all into one directory without the folder
> structure. Filenames and structure are not a problem, I just need
> the data in the files.

Try -nd (for no directories). BTW, -nd is the default (so to say) for single files (wget http://host.com/page.html).

> and nights... Have a virtual drink on me.

Cheers! :)

CU
Jens
http://www.jensroesner.de/wgetgui/
Re: -A acceptlist (not working right)
Hi Sam!

I am currently not at home, so I cannot check the other messages about this topic. I just used
wget -nc -x -r -l0 -p -np -t10 -k -nv -R.jpg,.zip,.exe,.htm,.html http://URL
with 1.8.1 and it returned only gifs, as expected. Then I did
wget -nc -x -r -l0 -p -np -t10 -k -nv -A.gif,.htm,.html http://URL
and received only htm(l) and gif files, also as expected. This was done on Windows. It also worked with wget 1.7.x for me. Then I tried
wget -nc -x -r -l0 -p -np -t10 -k -nv -A*n*.htm*,.gif http://URL
which returned only htm(l) files with an "n" in the name, plus gif files. AFAIK, these options work perfectly on Windows. Sorry if that is not much help for you.

> I have been able to get a .wgetrc file into the usr/local/etc
> directory, saved as .wgetrc, it is a text file with the command.
> With Mac OS X it is the user directory that it respects, not the
> usr/local/etc directory which is invisible on the BSD based Mac
> OS X, HD/users//.wgetrc. This location worked fine
> and other commands were executed without issue.
>
> accept = htm,html,shtml,phtml,cfm
>
> returns
>
> everything, gif, jpeg, swf, pdf, etc.
>
> I was able to employ the .wgetrc file but the commands of
> comma list accept and reject are still exhibiting the same
> behavior. No matter how it is used by syntax or whether
> it is in the command line or .wgetrc, the features are not
> working right in accept or reject lists. Now what do I do?
> I haven't tested the reject domains list yet to see if it behaves
> similarly.
>
> The commands all work one at a time but not as a list.
> Is this a version problem with all platforms or is this a Mac
> OS X only problem? Please advise, I don't know how to get
> around this problem. I really need this thing to work, it can
> do exactly what I need and is a great program for what I
> need to do. Would an older version work better for me?
>
> Thanks, Sam
Re: A strange bit of HTML
Hi Hrvoje!

First, I did/do not mean to offend/attack you, just in case my suspicion about you being pi55ed because of my post is not totally unjustified.

> > If the HTML code says
> > supermarket
> > Why can't wget just ignore everything after ...URL"?
> Because, as he said, Wget can parse text, not read minds.

Ah *slapsforehead* /me stupid.

> For
> example, you must know where a tag ends to be able to look for the
> next one, or to find comments. It is not enough to look for '>' to
> determine the tag's ending -- something like src="foo"> is a perfectly legal tag.

OK, granted, to dissolve <-fool.htm-> for example, you'd really have a hard time, I suppose. I honestly did not think of people messing with < and >.

> As for us being strict, I can only respond with a mini-rant...
> Wget doesn't create web standards, but it tries to support them.
> Spanning the chasm between the standards as written and the actual
> crap generated by HTML generators feels a lot like shoveling shit.

[rant name="my rant"]
Ah, tell me about it. Although I come from the other side (trying to write my sites - with a text editor - so that they look OK in different browsers and remain HTML compliant), I surely know how much 'fun' it can be to work with standards. Especially if they were set by a committee as intelligent and just (as in justice) as the W3C... BTW, as an engineering student I am fully aware of how much help good standards can be.
[/rant]

> Some amount of shoveling is necessary and is performed by all small
> programs to protect their users, but there has to be a point where you
> draw the line. There is only so much shit Wget can shovel.

Unfortunately, the amount of shit on the web will not decrease. I fear that the opposite may be true. No, wait, I am pretty sure...

> I'm not saying Ian's example is where the line has to be drawn. (Your
> example is equivalent to Ian's -- Wget would only choke on the last
> "going" part.) But I'm sure that the line exists and that it is not
> far from those two examples.

OK, but do I understand you correctly that these two examples (mine was intended to be equivalent, but without JS) should be on the "parse and retrieve" side of this line, not the "ignore and blame Frontpage" side?

CU
Jens
Re: A strange bit of HTML
Hi there!

> > href="66B27885.htm" "msover1('Pic1','thumbnails/MO66B27885.jpg');"
> > onMouseOut="msout1('Pic1','thumbnails/66B27885.jpg');"
> > SRC="thumbnails/66B27885.jpg" NAME="Pic1" BORDER=0

BTW: it is valign="middle" :P (I detest AllCaps and property=value instead of property="value".)

> That sounds like they wanted onMouseOver="msover1(...)"
> It's also likely that msover1 is a Javascript function :-(

Definitely, I would say.

> > I can't call this a bug, but is Wget doing the right thing by
> > ignoring the href altogether?
> Until there's an ESP package that can guess what the author intended,
> I doubt wget has any choice but to ignore the defective tag.

*g* Seriously, I think you guys are too strict. Similar discussions have spawned numerous times. If the HTML code says
supermarket
why can't wget just ignore everything after ...URL"? Is there any instance where this would create unwanted behaviour for the user? It does not matter if there is a javascript called, a CSS broken, or the webmaster has bad breath. Now, if a "mouseover picture" is loaded, wget cannot retrieve it anyway, no matter if the javascript is correct or malformed, right?

> In addition,
> wget should send an email to webmaster@,
> complaining about the invalid HTML :-)

/me signs this petition! In addition, mails should be written for bad (= unreadable) combos of font colour and background colour, animated gifs and blink tags.

Kind regards
Jens
Re: -H suggestion
Hi!

Once again I think this has no place on the bug list, but there you go:

> I've toyed with the idea of making a flag to allow `-p' to span hosts
> even when normal download doesn't.

Funny you mention this. When I first heard about -p (1.7?), I thought it would default to exactly that behaviour. I think it would be really useful if the page requisites could be wherever they want. I mean, -p already ignores -np (since 1.8?), which I think is also very useful.

> > The -i switch provides for a file listing the URLs to be downloaded.
> > Please provide for a list file for URLs to be avoided when -H is
> > enabled.
> URLs to be avoided? Given that a URL can be named in more than one
> way, this might be hard to do.

Sorry, but does --reject-host (or similar, I don't have the docs here ATM) not do exactly this? I may well be missing the point here. But with disallowing hosts and dirs you should be able to do this. Or is the problem loading the lists from an external file? Then please ignore my comment, I have no experience with this.

CU
Jens
Re: Need help: How to specify the output path for getting multiple files using Wget?
Hi!

> > I need to download multiple files from a web site and would like to put
> > them all in a specific directory on my local PC, let's say c:\temp.
> > How do I accomplish this?
> You should be able to do this with the -P option, but there was a
> report by Jens Rösner a few days ago that -Pc:\temp does not work,
> but that -P/temp does (or something like that). However he did not
> mention which version of Wget he was using.

Here is my report. Sorry, I forgot to mention that I am using wget 1.8.1:
***
Can I use -P (directory prefix) to save files in a user-determined folder on another drive under Windows? I tried -PC:\temp\ which does not work (I am starting from D:\). Also -P..\test\ would not save into the dir above the current one. So I changed the \ into / and it worked. However, I still could not save to another drive with -Pc:/temp. Any way around this? Bug/feature? Windows/Unix problem?
***
Once again: saving is fine with /, but you cannot switch drives, at least not on my WinME.

> In short, if -P c:\temp does not work, try using -P /temp instead,
> but obviously this means you are restricted to saving to a
> directory on the same drive letter as the directory from whence
> wget was launched.

That is exactly what I experienced. Is/will there be a way around this?

CU
Jens
Re: Uncoupling translations from source
Hi Hrvoje and everyone on the list,

I think that sounds like a good idea. In fact, I thought so when this topic came up for the first time a few days ago. However, I might not be typical, as I really do not care about a translation as long as English remains wget's first language. (And no, I am not "American" ;) ) What I care about are added features and killed bugs, so my preference is clear: more development time by separating. So, that's my opinion; looking forward to reading others'.

CU
Jens
http://www.jensroesner.de/wgetgui/
Re: What was that? Proxy!
Hi guys!

Yes, you all are right. Proxy is the answer. I feel stupid now. /me goes to bed, maybe that helps! :| Thanks anyway! :) Until the next intelligent question :D

CU
Jens

Man, I really hate ads like the following:
--
GMX - Die Kommunikationsplattform im Internet. http://www.gmx.net
What was that?
Hi guys!

Today I found something strange. First, I am using MS WinME (okok) and Netscape (okok). I was downloading a 3000 kB zip file from a page on audi.com via right-click. After (!) I had finished downloading, I thought "Hey, why not use wGet(GUI) for it." Smart, huh? One file, already downloaded... But there was another file I wanted to try in the first place. That did not work, because it was a streaming mov :( (I guess that would be very difficult to implement?)

Anyway, when wGet was downloading the zip I had already downloaded to another directory on another drive, the average speed was 850 kb/sec! No, I am !not! sitting in the LAN of Audi.de or something! I have fast ethernet LAN access through uni, but I am in New Zealand, so I highly doubt that such a speed is possible. Anyway, the file is there, it works, it is strange. Can anyone explain to a stupid Windows user what happened there? Does wget access the internal Netscape cache? (Nahh, can't be...) I cannot provide you with a debug or -v log file. :(

CU
Jens *still confused
http://www.jensroesner.de/wgetgui/
Re: bug?
Hi Tomas!

> I see, but then, how to exclude from being downloaded per file-basis?

First, let me be a smartass: go to http://www.acronymfinder.com and look up RTFM. Then, proceed to the wget docs. wget offers download restrictions by host, directory and file name. Search the docs for:
-H -D --exclude-domains
`-A ACCLIST' `--accept ACCLIST' `accept = ACCLIST'
`-R REJLIST' `--reject REJLIST' `reject = REJLIST'
`-I LIST' `--include LIST' `include_directories = LIST'
`-X LIST' `--exclude LIST' `exclude_directories = LIST'

CU
Jens
http://www.jensroesner.de/wgetgui
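For reference, those same restrictions can live in a wgetrc instead of the command line. A sketch that writes such a file (every value here is a made-up example, not a recommendation):

```shell
#!/bin/sh
# Sketch: write a wgetrc carrying the accept/reject/include/exclude
# restrictions listed above. All values are made-up examples.
cat > wgetrc.example <<'EOF'
accept = gif,htm,html
reject = zip,exe
include_directories = /docs
exclude_directories = /ads
exclude_domains = ads.example.com
EOF

# Point wget at it (not executed here):
# env WGETRC=wgetrc.example wget -r http://example.com/
```

The WGETRC environment variable tells wget to read this file instead of the default ~/.wgetrc.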
Re:
Hi Tomas!

I just have a rather strange idea: should it not be
instead of
I think wGet ignores incorrect HTML syntax, right?

CU
Jens

Tomas Hjelmberg wrote:
> Can't get this to work, not on Windows, nor Linux,
> not using 1.7 nor the latest CVS.
>
> /Tomas
Re: meta noindex
Hi Tomas!

> Thanks a lot, but unfortunately that didn't work...
> I just do a simple:
> wget -r http://localhost
> And my unwanted file is included all the time...

Hm :(

> Have you had it work with 1.7 or are you using the CVS version?

CVS? No, I am just a stupid Windows user with some binaries from Heiko ;) Are you sure your wgetrc is recognized, and that it is in the right directory? From my experience I have also noted that some servers block wGet because it is not a browser. That is why I made wGetGUI let the user choose to have wGet ignore robots.txt and identify as a Mozilla browser. That normally should work. Maybe you also have to try both at the same time for your problem? Right now I am a bit puzzled by what you meant with "I can't get wget 1.7 to react to the following:" - I thought you wanted wGet to ignore robots?! Correct?

Good luck
Jens
http://www.jensroesner.de/wgetgui

> -----Original Message-----
> From: Jens Roesner [mailto:[EMAIL PROTECTED]]
> Sent: 29 October 2001 23:21
> To: Tomas Hjelmberg
> Subject: Re: meta noindex
>
> Hi Tomas!
>
> Put
> robots = off
> in your wgetrc.
> You cannot use it on the command line, if I am not mistaken.
> I think it was introduced in 1.7, so you should have no problems.
>
> Good luck
> Jens
> http://www.jensroesner.de/wgetgui
>
> Tomas Hjelmberg wrote:
> >
> > Hi,
> > I can't get wget 1.7 to react to the following:
> >
> > ...
> >
> > Cheers /Tomas
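The two tricks mentioned above (ignore robots.txt, identify as a browser) can be combined in one wgetrc. A sketch that writes such a file - the user-agent string is just an example, not what wGetGUI actually sends:

```shell
#!/bin/sh
# Sketch: a wgetrc that ignores robots.txt and sets a browser-like
# user agent. The UA string below is an arbitrary example.
cat > wgetrc.example <<'EOF'
robots = off
user_agent = Mozilla/4.0 (compatible)
EOF

# Use it with (not executed here):
# env WGETRC=wgetrc.example wget -r http://localhost/
```

On Windows, placing the file next to wget.exe (as discussed elsewhere in this thread) should also work.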
Re: Recursive retrieval of page-requisites
Hi Mikko!

> > To me that sounds like a logical combination of -r -np -p?
> > Any correction appreciated.
> Doesn't work, apparently because -np overrides -p.

I know, sorry. I meant: what you want sounds like something that one would normally expect -r -np -p to do, and therefore I do not know why it is considered cheating?!

> What I'd basically like is a setting that loads page-requisites REGARDLESS
> OF ALL OTHER SETTINGS. I.e. you use the myriad of settings to fine-tune
> the exact set of pages requested, and then request "all requisites for the
> selected set of pages".

Yes, that's the way I would consider useful, too.

CU
Jens
Re: Recursive retrieval of page-requisites
Hi wGetters!

> > > I want to download a subtree of HTML documents from a foreign site
> > > complete with page-requisites for offline viewing.
> > > I.e. "all HTML pages from this point downwards, PLUS all the images (etc.)
> > > they refer to -- no matter where they are in the directory tree"
> > This is cheating,

What does "cheating" mean here? I know the meaning of the word "cheating", but I do not understand it in this context. Could someone please elaborate a bit? To me that sounds like a logical combination of -r -np -p? Any correction appreciated.

Thanks
Jens
Re: Cookie support
Hi Andreas!

AFAIK wGet has cookie support, at least the 1.7 I use. If this does not help you, I did not understand your question. But I am sure there are smarter guys than me on the list! ;)

CU
Jens
http://www.JensRoesner.de/wGetGUI/

[snip]
> Would it make sense to add basic cookie support to wget?
[/snip]
Re: referer question
Hi wgetters!

@André
> Guys, you don't understand what the OP wants. He needs a
> dynamically generated referer, something like
> wget --referer 'http://%h/'
> where, for each URL downloaded, wget would replace %h by the
> hostname.

Well, I understood it this way. My problem was that I mainly use wGet and wGetGUI for downloads from !one! server. Therefore I did not think of the problem that arises when wget leaves the server for which wGetGUI puts in the --referer=starthost.

@Jan:
Sorry, the option in wGetGUI is called "Identify as browser"; what happens then is that wGetGUI does !both! --referer and --user-agent! If I find time to write a user's manual, I will make this clear. Sorry for the confusion.

@Vladi
OK, I know Windows sucks ;) But I am too lazy! BTW: I would like that --auto-referer, too ;) So go ahead! ;D

CU
Jens
Re: wget and dos
Hi Jan!

> Do you have ANY idea where to place the .wgetrc on a windows system?

AFAIK, it is enough to just have it in the wget directory. That is the way I did it with wGetGUI and it works. I use the latest win32 1.7.

CU
Jens
http://www.jensroesner.de/wgetgui

PS: I also study at the TUD! :)
Ever heard of that version?
Hi wGetters!

I just stumbled over http://bay4.de/FWget/
Are his changes incorporated into Wget 1.7? Any opinions on that software? I think with WinME *yuck* as OS, it is out of the question for me, but...

CU
Jens
Re: More Domain/Directory Acceptance
Hi Ian and wgetters!

> Well, if you're running it from a DOS-style shell, get rid of the
> single quotes I put in there, i.e. try -Ibmaj*

Oh, I guess that was rather stupid of me. However, the Windows version will only work with -I/bmaj or -Ibmaj.roesner, not with anything like -I/bmaj* or -I/bmaj?roesner :( Reasons? (I also tried -Iaj.r, which also did not work...)

Once again, here is the command line for everyone wanting to give it a try:
wget -nc -r -l0 -nh -d -o test.log -H -I/bmaj http://members.tripod.de/jroes/test.html
works like a blast. :)

BTW:
wget -nc -r -l0 -nh -d -o test.log -H -Dhome.nexgo.de -I/bmaj http://members.tripod.de/jroes/test.html
works, too. So you can restrict hosts and dirs on those hosts. (Imagine a bmaj dir on the tripod server, for example.) And this setup suits me just fine. But having more options is always a good thing, so: are there wildcards like * and ? in the Win32 version of wget?

CU
Jens
http://www.jensroesner.de/wgetgui
Re: More Domain/Directory Acceptance
Hi Ian, hi wgetters!

Thanks for your help!

> It didn't work for me either, but the following variation did:
> wget -r -l0 -nh -d -o test.log -H -I'bmaj*' http://members.tripod.de/jroes/test.html

Hm, did not for me :( Neither in 1.4.5 nor in the newest Windows binaries version I downloaded from Heiko. :(

> However, wget-1.7 dumps core with this so you'll have to use the
> latest version from CVS.

Hm, what exactly do you mean by that? Is the version from Heiko young enough? Here is what the debug output reads with 1.7.1:

***
DEBUG output created by Wget 1.7.1-pre1 on Windows.

parseurl ("http://members.tripod.de/jroes/test.html")
-> host members.tripod.de -> opath jroes/test.html -> dir jroes -> file test.html -> ndir jroes
newpath: /jroes/test.html
--22:16:33-- http://members.tripod.de/jroes/test.html
=> `members.tripod.de/jroes/test.html'
Connecting to members.tripod.de:80... Caching members.tripod.de <-> 62.52.56.162
Created fd 72.
connected!
---request begin---
GET /jroes/test.html HTTP/1.0
User-Agent: Wget/1.7.1-pre1
Host: members.tripod.de
Accept: */*
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Server: Apache/1.2.7-dev
Set-Cookie: CookieStatus=COOKIE_OK; path=/; domain=.tripod.com; expires=Sat, 06-Jul-2002 10:17:26 MET
cdm: 1 2 3 4 5 6
Attempt to fake the domain: .tripod.com, members.tripod.de
Set-Cookie: MEMBER_PAGE=jroes/test.html; path=/; domain=members.tripod.de
cdm: 1 2 3
***

No file was written. I'll try it on another PC with another OS this weekend... But if you can give any advice already, that would be great!

CU
Jens
More Domain/Directory Acceptance
Hi again!

I am trying to start from http://members.tripod.de/jroes/test.html (have a look). The first link goes to a site I do not want. The second link goes to a site that should be retrieved.

wget -r -l0 -nh -d -o test.log -H -I/bmaj*/ http://members.tripod.de/jroes/test.html
does not work.
wget -r -l0 -nh -d -o test.log -H -Dhome.nexgo.de -I/bmaj*/ http://members.tripod.de/jroes/test.html
does not either :( I also tried -Dhome.nexgo.de -I../bmaj.roesner/ with no success.

The debug output (I know it is not a bug report, but it gives a lot of information) reads:
***
parseurl("http://home.nexgo.de/bmaj.roesner/")
-> host home.nexgo.de -> opath bmaj.roesner/ -> dir bmaj.roesner -> file -> ndir bmaj.roesner
http://home.nexgo.de:80/bmaj.roesner/ (bmaj.roesner) is excluded/not-included.
http://home.nexgo.de:80/bmaj.roesner/ already in list, so we don't load.
***
How can it be done?

CU
Jens
http://www.jensroesner.de/wgetgui/
Re: Domain Acceptance question
Hi Mengmeng!

Thanks very much, I (obviously) was not aware of that! I'll see how I can incorporate that (-I/-X/-D/-H) into wGetGUI. Can I do something like
-H -Dhome.nexgo.de -Ibmaj.roesner http://www.AudiStory.com ?
I'll just give it a try.

Thanks again!
Jens
Domain Acceptance question
Hi there!

Hmm, today I was trying out my new version of wGetGUI (0.71) on my own site http://www.AudiStory.com. However, the main content is on http://home.nexgo.de/bmaj.roesner/. So basically I tried
wget -r -l0 -nh -H -Dhome.nexgo.de http://www.audistory.com
which works. But as home.nexgo.de has very many dirs besides mine, I also tried
wget -r -l0 -nh -H -Dhome.nexgo.de/bmaj.roesner http://www.audistory.com
and
wget -r -l0 -nh -H -Dhome.nexgo.de/bmaj* http://www.audistory.com
which both did not work. Can this be? Is this done on purpose? Also, why does
wget -r -l0 -nh -H -Dhttp://home.nexgo.de http://www.audistory.com
not work? I am using 1.7.1-pre1 (anything newer available?) on Win ME (don't hit me!)

CU
Jens
http://www.jensroesner.de/wgetgui