Re: How to debug wget ?
Jinhui Li wrote:
> I am browsing the source code. And want to debug it to figure out how it
> works.
>
> So, somebody please tell me how to debug ( with GDB ) or where can I
> find information that I need.

IMO, GDB is a great tool for diagnosing a particular problem one encounters with a program; it's not all that useful for actually understanding the code itself, though. I find it much quicker to read through the code using a powerful viewer or editor, making use of tools such as cscope and ctags. The best editors, such as Vim and Emacs, integrate with these tools, so a simple control-click or key combination can bring up the definition of the function being called or the variable being referenced, or (in the case of cscope) the list of places where a particular function is called.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
Re: Support for file://
Petri Koistinen wrote:
> Hi,
>
> I would be nice if wget would also support file://.

Feel free to file an issue for this (I'll mark it "Needs Discussion" and set it at low priority). I'd thought there was already an issue for this, but can't find it (either open or closed). I know this has come up before, at least.

I think I'd need some convincing on this, as well as a clear definition of what the scope for such a feature ought to be. Unlike curl, which "groks urls", Wget "W(eb)-gets", and file:// can't really be argued to be part of the web.

That in and of itself isn't really a reason not to support it, but my real misgivings have to do with the existence of various excellent tools that already do local-file transfers, and likely do it _much_ better than Wget could hope to. Rsync springs readily to mind.

Even the system "cp" command is likely to handle things much better than Wget. In particular, special OS-specific, extended file attributes, extended permissions and the like, are among the things that existing system tools probably handle quite well, and that Wget is unlikely to. I don't really want Wget to be in the business of duplicating the system "cp" command, but I might conceivably not mind "file://" support if it means simple _content_ transfer, and not actual file duplication.

Also in need of addressing is what "recursion" should mean for file://. Between ftp:// and http://, "recursion" currently means different things. In FTP, it means "traverse the file hierarchy recursively", whereas in HTTP it means "traverse links recursively". I'm guessing file:// should work like FTP (i.e., recurse when the path is a directory, ignore HTML-ness), but anyway this is something that'd need answering.

--
Micah J. Cowan
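To make the FTP-like reading of file:// recursion concrete, here is a minimal Python sketch (Wget itself is written in C, and nothing below is Wget code). The helper names `list_file_url` and `read_contents` are hypothetical. It assumes recursion means walking the directory hierarchy, that HTML files are never parsed for links, and that only file content is transferred, not permissions or extended attributes.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def list_file_url(url):
    """Return the regular files reachable from a file:// URL, FTP-style.

    Hypothetical helper: a directory is walked recursively, a plain file
    yields just itself, and HTML content is never inspected for links.
    """
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError("not a file:// URL: %r" % url)
    root = url2pathname(parsed.path)
    if os.path.isdir(root):
        found = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                found.append(os.path.join(dirpath, name))
        return sorted(found)
    return [root]


def read_contents(paths):
    """Content-only transfer: read the bytes, ignore permissions and attributes."""
    result = {}
    for path in paths:
        with open(path, "rb") as handle:
            result[path] = handle.read()
    return result
```

Whether a plain-file URL should yield just that file, and whether symbolic links should be followed, are exactly the kind of scope questions that would need settling first; this sketch simply picks the simplest answers.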
Re: [bug #20329] Make HTTP timestamping use If-Modified-Since
Yes, that's what it means.

I'm not yet committed to doing this. I'd like to see first how many mainstream servers will respect If-Modified-Since when given as part of an HTTP/1.0 request (in comparison to how they respond when it's part of an HTTP/1.1 request). If common servers ignore it in HTTP/1.0, but not in HTTP/1.1, that'd be an excellent case for holding off until we're doing HTTP/1.1 requests.

Also, I don't think "removing the previous HEAD request code" is entirely accurate: we probably would want to detect when a server is feeding us non-new content in response to If-Modified-Since, and adjust to use the current HEAD method instead as a fallback.

-Micah

vinothkumar raman wrote:
> This mean we should remove the previous HEAD request code and use
> If-Modified-Since by default and have it to handle all the request and
> store pages if it is not returning a 304 response
>
> Is it so?
>
>
> On Fri, Aug 29, 2008 at 11:06 PM, Micah Cowan <[EMAIL PROTECTED]> wrote:
>> Follow-up Comment #4, bug #20329 (project wget):
>>
>> verbatim-mode's not all that readable.
>>
>> The gist is, we should go ahead and use If-Modified-Since, perhaps even now
>> before there's true HTTP/1.1 support (provided it works in a reasonable
>> percentage of cases); and just ensure that any Last-Modified header is sane.
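The fallback idea discussed in this message can be sketched as a small decision function. This is an illustration in Python, not Wget's C code: the function name and the returned labels are hypothetical, and treating a missing or unparseable Last-Modified header as "store" is an assumption, since the thread only says the header should be checked for sanity.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def plan_after_conditional_get(status, last_modified, local_mtime):
    """Decide what to do with the reply to a GET that sent If-Modified-Since.

    status        -- HTTP status code of the reply
    last_modified -- value of the Last-Modified header, or None
    local_mtime   -- mtime of the local copy, in seconds since the epoch

    The returned labels are illustrative, not Wget identifiers.
    """
    if status == 304:
        # The server honoured the condition: the local copy is current.
        return "keep-local"
    if status != 200:
        return "error"
    if last_modified is None:
        # Assumption: with nothing to compare against, store the body.
        return "store"
    try:
        remote = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        # Unparseable header: same assumption as a missing one.
        return "store"
    if remote.tzinfo is None:
        remote = remote.replace(tzinfo=timezone.utc)
    if remote.timestamp() <= local_mtime:
        # A 200 carrying content no newer than ours suggests the server
        # ignored If-Modified-Since; use the HEAD-based method instead.
        return "fallback-to-head"
    return "store"
```

Running this against replies from a range of servers, once with HTTP/1.0 requests and once with HTTP/1.1, would be one way to collect the comparison asked for above.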
Re: [BUG:#20329] If-Modified-Since support
vinothkumar raman wrote:
> We need to give out the time stamp the local file in the Request
> header for that we need to pass on the local file's time stamp from
> http_loop() to get_http() . The only way to pass on this without
> altering the signature of the function is to add a field to struct url
> in url.h
>
> Could we go for it?

That is acceptable.

--
Micah J. Cowan
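As a rough illustration of the proposal, the Python sketch below carries the local file's timestamp on a URL record and turns it into a request header. The real change would be made in C, in struct url; the `UrlInfo` class, its `local_mtime` field, and `conditional_headers` are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass
from email.utils import formatdate
from typing import Optional


@dataclass
class UrlInfo:
    """Illustrative stand-in for Wget's struct url (not its real layout)."""
    url: str
    # Hypothetical extra field: mtime of the local copy, seconds since epoch.
    local_mtime: Optional[float] = None


def conditional_headers(info):
    """Build the extra request headers for a timestamped retrieval."""
    headers = {}
    if info.local_mtime is not None:
        # HTTP dates are always expressed in GMT.
        headers["If-Modified-Since"] = formatdate(info.local_mtime, usegmt=True)
    return headers
```

For instance, `conditional_headers(UrlInfo("http://example.com/", 0.0))` yields `{"If-Modified-Since": "Thu, 01 Jan 1970 00:00:00 GMT"}`, while a record without a local timestamp yields no extra headers.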
Re: Checking out Wget
vinothkumar raman wrote:
> Hi all,
>
> I need to checkout the complete source into my local hard disk. I am using
> WinCVS when i searched for the module its saying that there is no module
> information out there. Could any one help me out i am a complete novice in
> this regard.

WinCVS won't work, because there _is_ in fact no CVS module for Wget. Wget uses Mercurial as the source repository (and was using Subversion prior to that). For more information about the Wget source repository and its use, see http://wget.addictivecode.org/RepositoryAccess

That page focuses on using the "hg" command-line tool; you may prefer to use TortoiseHg instead (http://tortoisehg.sourceforge.net/). The page also offers additional information about the repository and what is required to build from those sources.

--
Micah J. Cowan
Re: [wget-notify] add a new option
houda hocine wrote:
> Hi,

Hi houda.

This message was sent to the wget-notify list, which was not the proper forum. Wget-notify is reserved for bug-change and (previously) commit notifications, and is not intended for discussion (though I obviously haven't blocked discussions; the original intent was to be able to discuss commits, but I'm not sure I need to allow discussions any more, so it may be disallowed soon).

The appropriate list would be wget@sunsite.dk, to which this discussion has been redirected.

> we create a new format for archiviving (. warc), and we want to ensure
> that wget generate directly this format from the input url .
> You can help me by some ideas to achieve this new option?
> The format is (warc -wget url)
> I am in the process of trying to understand the source code to add this
> new option. Which .c file fallows me to do this?

Doing this is not likely to be a trivial undertaking: the current file-output interface isn't really abstracted enough to allow this, so basically you'll need to modify most of the existing .c files. We are hoping at some future point to allow for a more generic output format, for direct output to (for instance) tarballs and .mhtml archives. At that point, it'd probably be fairly easy to write extensions to do what you want.

In the meantime, though, it'll be a pain in the butt. I can't really offer much help; the best way to understand the source is to read and explore it. However, on the general topic of adding new options to Wget, Tony Lewis has written the excellent guide at http://wget.addictivecode.org/OptionsHowto. Hope that helps!

Please note that I won't likely be entertaining patches to Wget to make it output to non-mainstream archive formats, and even once generic output mechanisms are supported, the mainstream archive formats will most likely be supported as extension plugins or similar, and not as built-in support within Wget.

--
Micah J. Cowan