Re: spaces and other special characters in directories
On Wednesday 10 April 2002 at 23:55:16 +0200, Hrvoje Niksic wrote:
> Loic Le Loarer <[EMAIL PROTECTED]> writes:
>
> > When I fetch a whole subtree with wget and directories contain
> > spaces or other special characters, these characters are
> > urlencoded in the local version, while this is not the case for files.
> >
> > For example, if I mirror with wget -m the directory "to to" which
> > contains the file "to to", I get locally the directory "to%20to" and
> > the file "to to". Is there an option to have the directory "to to"?
>
> The inconsistency you're seeing is a bug, but the "intended" behavior
> goes in rather the opposite direction. The code was supposed to
> url-encode *both* the file and the directory without the option to
> suppress it.
>
> I will try to fix this for the next release, preferably by uncoupling
> the url encoding from the protection of file names from invalid
> characters. Ideally, the latter would be configurable.

Thank you for your answer.

You are right, the url-encoding should be consistent: either all url-encoded or not, with the option to suppress it.

In fact, I'm using wget to copy files from one computer to another, and in this case I don't want the files and directories to become url-encoded, so I need the option to suppress url-encoding.

I'm waiting for the next release. Thanks again for this great tool.

--
Loïc
"heaven is not a place, it's a feeling"
Re: feature wish: switch to disable robots.txt usage
Hi!

Just to be complete: thanks to Hrvoje's tip, I was able to find

    -e command
    --execute command
        Execute command as if it were a part of .wgetrc (see Startup File.).
        A command thus invoked will be executed after the commands in
        .wgetrc, thus taking precedence over them.

I always wondered about that. *sigh*
I can now think about changing my wgetgui in this aspect :)

Thanks again
Jens

Hrvoje Niksic wrote:
>
> Noel Koethe <[EMAIL PROTECTED]> writes:
>
> > Ok got it. But is it possible to get this option as a switch for
> > using it on the command line?
>
> Yes, like this:
>
>     wget -erobots=off ...
Re: timeout does not work if connect hangs
Thanks for the report. There are plans to fix that problem for the next release.
Re: images referenced by javascript
"Thomas C. Meggs" <[EMAIL PROTECTED]> writes: > Does wget have any plans to implement downloading images referenced > from javascript inside an html document? Thanks. I'm afraid not in the near future. In general, it is impossible to process JS from Wget because it is a full-featured language, and one with interactive constructs. I suppose something could be hacked to snag the usual, "rollover"-type images manipulated by JavaScript, but I've never felt particularly inspired to code it.
Re: feature wish: switch to disable robots.txt usage
Noel Koethe <[EMAIL PROTECTED]> writes:

> Ok got it. But is it possible to get this option as a switch for
> using it on the command line?

Yes, like this:

    wget -erobots=off ...
Re: feature wish: switch to disable robots.txt usage
On Wed, 10 Apr 2002, Jens Rösner wrote:

Hello Jens,

> > is it possible to get a new option to disable the usage
> > of robots.txt (--norobots)?

> I think at least since 1.7, probably even longer.
> Cut from the doc:
> robots = on/off
> Use (or not) /robots.txt file (see Robots.). Be sure to know what you
> are doing before changing the default (which is on).

> Please note:
> This is a (.)wgetrc-only command.
> You cannot use it on the command line, if I am not mistaken.

Ok got it. But is it possible to get this option as a switch for using it on the command line?

thx.

--
Noèl Köthe
Re: spaces and other special characters in directories
Loic Le Loarer <[EMAIL PROTECTED]> writes:

> When I fetch a whole subtree with wget and directories contain
> spaces or other special characters, these characters are
> urlencoded in the local version, while this is not the case for files.
>
> For example, if I mirror with wget -m the directory "to to" which
> contains the file "to to", I get locally the directory "to%20to" and
> the file "to to". Is there an option to have the directory "to to"?

The inconsistency you're seeing is a bug, but the "intended" behavior goes in rather the opposite direction. The code was supposed to url-encode *both* the file and the directory without the option to suppress it.

I will try to fix this for the next release, preferably by uncoupling the url encoding from the protection of file names from invalid characters. Ideally, the latter would be configurable.
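The inconsistency discussed above can be illustrated with a small sketch. This is not Wget's code; `local_path` and its two flags are hypothetical names used only to show how encoding directory and file components independently produces the reported "to%20to/to to" result.

```python
from urllib.parse import quote, unquote


def local_path(url_path: str, encode_dirs: bool, encode_file: bool) -> str:
    """Map a URL path to a local relative path (illustrative only).

    Each component is first decoded, then optionally percent-encoded again.
    The two flags are hypothetical; they model the directory/file asymmetry.
    """
    parts = [unquote(p) for p in url_path.strip("/").split("/")]
    *dirs, name = parts
    dirs = [quote(d, safe="") if encode_dirs else d for d in dirs]
    name = quote(name, safe="") if encode_file else name
    return "/".join(dirs + [name])


# Reported behaviour: directory encoded, file not.
print(local_path("/to%20to/to%20to", encode_dirs=True, encode_file=False))
# Consistent alternatives: encode both, or encode neither.
print(local_path("/to%20to/to%20to", encode_dirs=True, encode_file=True))
print(local_path("/to%20to/to%20to", encode_dirs=False, encode_file=False))
```

Separating the two flags mirrors the suggestion of uncoupling URL encoding from file-name protection, so that either could be configured on its own.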
Re: Debian bug 106391 - documentation doesn't warn about passwords in urls
On Wed, 10 Apr 2002, Hrvoje Niksic wrote:

> You're right. I'll apply this patch, which I think should add enough
> warnings to educate the unwary.

Thanks for this patch. I will merge it into version 1.8.1 for Debian.

--
Noèl Köthe
Re: Debian bug 21344 - the total bytes downloaded is counted as signed int
Guillaume Morin <[EMAIL PROTECTED]> writes:

> I am forwarding Debian wishlist bug 21344
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=21344&repeatmerged=yes
>
> In 1.8.1, the result is different. You get an overflow notice...

Yes, the negative amount has been fixed by using `long long' for storing the quota.

I dislike the idea of using "unsigned" types to gain larger values because the gain is only temporary, and introducing unsigneds into a program that uses signed integers consistently is tedious and error-prone.

I believe SuSE maintains a patch that modifies Wget to use unsigned integers for all download-related storage. Someone really bent on that kind of optimization could take a look at that.
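A minimal sketch of the arithmetic behind this argument, assuming 32-bit and 64-bit two's-complement counters. The helper names are hypothetical and the code only simulates the wrap-around; it is not taken from Wget.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Interpret value modulo 2**bits as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def wrap_unsigned(value: int, bits: int) -> int:
    """Interpret value modulo 2**bits as an unsigned integer."""
    return value & ((1 << bits) - 1)


GIB = 1024 ** 3

# A 3 GiB total goes negative in a signed 32-bit counter ...
print(wrap_signed(3 * GIB, 32))
# ... fits in an unsigned 32-bit counter, but only until 4 GiB ...
print(wrap_unsigned(3 * GIB, 32), wrap_unsigned(5 * GIB, 32))
# ... while a signed 64-bit counter holds both totals unchanged.
print(wrap_signed(5 * GIB, 64))
```

The unsigned variant buys a single extra bit, which is why the gain is described as temporary; the 64-bit signed type moves the limit far beyond realistic download sizes.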
Re: Debian wishlist bug 21148 - wget doesn't allow selectivity based on mime type
I believe this is already on the todo list. However, this is made harder by the fact that, to implement this kind of reject, you have to start downloading the file. This is very different from the filename-based rejection, where the decision can be made at a very early point in the download process.
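To make the timing difference concrete, here is a hedged sketch with two hypothetical helpers: one decides from the URL alone, the other needs the response headers, which only exist once the request has been made. Neither function is part of Wget.

```python
import posixpath
from urllib.parse import urlparse


def rejected_by_name(url, rejected_suffixes):
    """Decide from the URL alone, before any network traffic."""
    name = posixpath.basename(urlparse(url).path)
    return name.lower().endswith(tuple(rejected_suffixes))


def rejected_by_type(response_headers, rejected_types):
    """Decide from the Content-Type header, i.e. after the server has replied."""
    content_type = response_headers.get("Content-Type", "")
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in rejected_types


# Known up front:
print(rejected_by_name("http://example.com/pics/photo.JPG", [".jpg", ".gif"]))
# Known only once headers have arrived:
print(rejected_by_type({"Content-Type": "image/jpeg; q=1"}, {"image/jpeg"}))
```

The URL and header values are made up for illustration.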
Re: Debian bug 131851 - cwd during ftp causes download to fail
Guillaume Morin <[EMAIL PROTECTED]> writes: > When getting a file in a non-root directory from FTP with wget, wget > always tries CWD to that directory before getting the > file. Unfortunately sometimes you're not allowed to CWD to a > directory, but you're all allowed to list or download files from it > (taken that you know the filename). I believe this breaks rfc959. I think this is quite rare, so I don't plan to add this to Wget in the near future. If someone implements it cleanly, the functionality can go in.
Re: feature wish: switch to disable robots.txt usage
Hi Noel!

Actually, this is possible.
I think at least since 1.7, probably even longer.

Cut from the doc:
robots = on/off
Use (or not) /robots.txt file (see Robots.). Be sure to know what you are doing before changing the default (which is on).

Please note:
This is a (.)wgetrc-only command.
You cannot use it on the command line, if I am not mistaken.

CU
Jens

Noel Koethe wrote:
>
> Hello,
>
> is it possible to get a new option to disable the usage
> of robots.txt (--norobots)?
>
> for example:
> I want to mirror some parts of http://ftp.de.debian.org/debian/
> but the admin has a robots.txt
>
> http://ftp.de.debian.org/robots.txt
> User-agent: *
> Disallow: /
>
> I think he wants to protect his machine from search engine
> spiders and not from users who want to download files. :)
>
> It would be great if I could use wget for this task, but
> right now it's not possible. :(
>
> Thanks a lot.
>
> --
> Noèl Köthe
Re: Debian bug 106391 - documentation doesn't warn about passwords in urls
[ Cc'ing to [EMAIL PROTECTED], as requested by Guillaume. ]

Guillaume Morin <[EMAIL PROTECTED]> writes:

> this is from the "advanced usage" section of examples (info docs):
>
>> * If you want to encode your own username and password to HTTP or
>> FTP, use the appropriate URL syntax (*note URL Format::).
>>
>> wget ftp://hniksic:[EMAIL PROTECTED]/.emacs
>
> this would let other users on the system to see your password using
> "ps". it should have a big disclaimer.

You're right. I'll apply this patch, which I think should add enough warnings to educate the unwary.

2002-04-10  Hrvoje Niksic  <[EMAIL PROTECTED]>

	* wget.texi: Warn about the dangers of specifying passwords on the
	command line and in unencrypted files.

Index: doc/wget.texi
===================================================================
RCS file: /pack/anoncvs/wget/doc/wget.texi,v
retrieving revision 1.62
diff -u -r1.62 wget.texi
--- doc/wget.texi 2001/12/16 18:05:34 1.62
+++ doc/wget.texi 2002/04/10 21:40:32
@@ -285,6 +285,13 @@
 @file{.netrc} file in your home directory, password will also be
 searched for there.}
 
+@strong{Important Note}: if you specify a password-containing @sc{url}
+on the command line, the username and password will be plainly visible
+to all users on the system, by way of @code{ps}. On multi-user systems,
+this is a big security risk. To work around it, use @code{wget -i -}
+and feed the @sc{url}s to Wget's standard input, each on a separate
+line, terminated by @kbd{C-d}.
+
 You can encode unsafe characters in a @sc{url} as @samp{%xy}, @code{xy}
 being the hexadecimal representation of the character's @sc{ascii}
 value. Some common unsafe characters include @samp{%} (quoted as
@@ -849,8 +856,15 @@
 @code{digest} authentication scheme.
 
 Another way to specify username and password is in the @sc{url} itself
-(@pxref{URL Format}). For more information about security issues with
-Wget, @xref{Security Considerations}.
+(@pxref{URL Format}). Either method reveals your password to anyone who
+bothers to run @code{ps}. To prevent the passwords from being seen,
+store them in @file{.wgetrc} or @file{.netrc}, and make sure to protect
+those files from other users with @code{chmod}. If the passwords are
+really important, do not leave them lying in those files either---edit
+the files and delete them after Wget has started the download.
+
+For more information about security issues with Wget, @xref{Security
+Considerations}.
 
 @cindex proxy
 @cindex cache
@@ -975,6 +989,9 @@
 authentication on a proxy server. Wget will encode them using the
 @code{basic} authentication scheme.
 
+Security considerations similar to those with @samp{--http-passwd}
+pertain here as well.
+
 @cindex http referer
 @cindex referer, http
 @item --referer=@var{url}
@@ -2409,6 +2426,10 @@
 wget ftp://hniksic:mypassword@@unix.server.com/.emacs
 @end example
 
+Note, however, that this usage is not advisable on multi-user systems
+because it reveals your password to anyone who looks at the output of
+@code{ps}.
+
 @cindex redirecting output
 @item
 You would like the output documents to go to standard output instead of
@@ -2773,10 +2794,12 @@
 main issues, and some solutions.
 
 @enumerate
-@item
-The passwords on the command line are visible using @code{ps}. If this
-is a problem, avoid putting passwords from the command line---e.g. you
-can use @file{.netrc} for this.
+@item The passwords on the command line are visible using @code{ps}.
+The best way around it is to use @code{wget -i -} and feed the @sc{url}s
+to Wget's standard input, each on a separate line, terminated by
+@kbd{C-d}. Another workaround is to use @file{.netrc} to store
+passwords; however, storing unencrypted passwords is also considered a
+security risk.
 
 @item
 Using the insecure @dfn{basic} authentication scheme, unencrypted
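The `wget -i -` workaround from the patch can also be driven from a script. The sketch below is one possible way to do that, assuming `wget` is on the PATH; the helper names and the example URL are hypothetical. The point is that the password travels over standard input and never appears in the argument vector that `ps` shows.

```python
import subprocess


def wget_stdin_invocation(urls):
    """Return (argv, stdin_text) for a wget run that reads URLs from stdin.

    Only the option `-i -` comes from the patch above; everything else
    here is an illustrative assumption.
    """
    argv = ["wget", "-i", "-"]
    stdin_text = "".join(url + "\n" for url in urls)
    return argv, stdin_text


def fetch(urls):
    """Run wget, feeding the URLs on standard input (not called in this sketch)."""
    argv, stdin_text = wget_stdin_invocation(urls)
    subprocess.run(argv, input=stdin_text, text=True, check=True)


argv, stdin_text = wget_stdin_invocation(
    ["ftp://user:secret@ftp.example.com/.emacs"]  # hypothetical URL
)
print(argv)  # the command line contains no credentials
```

As the patch notes, this only addresses visibility through `ps`; credentials stored in files still need to be protected separately.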
feature wish: switch to disable robots.txt usage
Hello,

is it possible to get a new option to disable the usage
of robots.txt (--norobots)?

for example:
I want to mirror some parts of http://ftp.de.debian.org/debian/
but the admin has a robots.txt

http://ftp.de.debian.org/robots.txt
User-agent: *
Disallow: /

I think he wants to protect his machine from search engine
spiders and not from users who want to download files. :)

It would be great if I could use wget for this task, but
right now it's not possible. :(

Thanks a lot.

--
Noèl Köthe
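Why this robots.txt blocks every recursive fetch can be checked with Python's standard `urllib.robotparser`. This is only an illustration of the exclusion rules quoted above, not of Wget's own parser; the `allowed` helper and the second rule set are assumptions added for contrast.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# The file quoted in the message: everything is off limits to every robot.
debian_robots = ["User-agent: *", "Disallow: /"]
print(allowed(debian_robots, "Wget", "http://ftp.de.debian.org/debian/"))

# A narrower rule (hypothetical) would leave the mirror path reachable.
narrow_robots = ["User-agent: *", "Disallow: /private/"]
print(allowed(narrow_robots, "Wget", "http://ftp.de.debian.org/debian/"))
```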
Re: Debian bug 88176 - timestamping is wrong with -O
Unfortunately, this bug is not easy to fix. The problem is that `-O' was originally invented for streaming, i.e. for `-O -'. As a result, many places in Wget's code assume that they can freely operate on the file names, and -O seems more like an afterthought. On the other hand, many people (reasonably) expect `-O x' to simply override the file name from whatever was specified in the URL to "x". But the code doesn't work that way. I plan to change the handling of file names to make this work, but that will take some time. Unless someone takes time to fix this in the existing code base, the bug will remain open until the said reorganization. Until then, the workaround is to avoid the `-O -N' combination.
Re: Current download speed in progress bar
"Roger L. Beeman" <[EMAIL PROTECTED]> writes: > On Wed, 10 Apr 2002, Hrvoje Niksic wrote: > >> Agreed wholeheartedly, but how would you *implement* a non-jittering >> ETA? Do you think it makes sense the way 1.8.1 does it, i.e. to >> calculate the ETA from the average speed? > > One common programming technique is the exponential decay model. Sounds cool. Do you have a pseudocode or, failing that, a reference easy enough that even a programmer of Unix command-line utilities can follow it? :-) (I must admit that your email address adds a certain weight to whatever you have to say about measuring bandwidth.) > I believe that the method is chosen for its simplicity and that > justifications of its validity are completely after the fact. The > simplicity is that one keeps a previously calculated value and > averages that value with the current measurement and saves the > result for the next iteration, i.e. add and shift right. I thought about calculating the average between the "average" and the "current" speed, and use that for ETA, but it sounded too arbitrary and I didn't have time to gather empirical evidence that it was any better than just using average. Again, I'd be grateful if you could provide some code. > You must chose how to normalize the measurement based on > irregularity in the measurement interval, however. I'm afraid I can't parse this without understanding the algorithm.
Re: Current download speed in progress bar
Andre Majorel <[EMAIL PROTECTED]> writes:

>> Agreed wholeheartedly, but how would you *implement* a
>> non-jittering ETA?
>
> I'm not sure you can, but using the average speed will at least
> low pass filter out most of the jittering.
>
>> Do you think it makes sense the way 1.8.1 does it, i.e. to
>> calculate the ETA from the average speed?
>
> Yes.

That's how it works in the latest CVS. If you wish, take a peek and see if you like it.
Re: Current download speed in progress bar
On Wed, 10 Apr 2002, Hrvoje Niksic wrote:

> Agreed wholeheartedly, but how would you *implement* a non-jittering
> ETA? Do you think it makes sense the way 1.8.1 does it, i.e. to
> calculate the ETA from the average speed?

One common programming technique is the exponential decay model. I believe that the method is chosen for its simplicity and that justifications of its validity are completely after the fact. The simplicity is that one keeps a previously calculated value and averages that value with the current measurement and saves the result for the next iteration, i.e. add and shift right. Justifications are that the current measurement is the most likely predictor of future behavior and that previous measurements are weighted diminishingly over time. You must choose how to normalize the measurement based on irregularity in the measurement interval, however.

Roger L. Beeman
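The description above can be sketched as follows. `half_average` is the literal "add and shift right" step for integer samples taken at regular intervals. `SmoothedSpeed` is one possible way to normalize for irregular intervals, by deriving the weight from the elapsed time; the class name, the time constant, and that particular normalization are assumptions, not something specified in the message.

```python
import math


def half_average(previous: int, sample: int) -> int:
    """Average the stored value with the new sample: add and shift right."""
    return (previous + sample) >> 1


class SmoothedSpeed:
    """Exponentially decayed download speed (illustrative sketch)."""

    def __init__(self, time_constant: float = 5.0):
        self.tau = time_constant  # seconds over which old samples fade (assumed)
        self.speed = None         # bytes per second, None until the first sample

    def update(self, nbytes: int, dt: float) -> float:
        """Fold in `nbytes` received over `dt` seconds (dt must be > 0)."""
        sample = nbytes / dt
        if self.speed is None:
            self.speed = sample
        else:
            # A longer interval gives the new sample more weight.
            alpha = 1.0 - math.exp(-dt / self.tau)
            self.speed += alpha * (sample - self.speed)
        return self.speed

    def eta(self, remaining_bytes: int):
        """Estimated seconds left, or None while the speed is unknown or zero."""
        return remaining_bytes / self.speed if self.speed else None


meter = SmoothedSpeed(time_constant=1.0)
meter.update(1000, 1.0)
print(meter.update(0, 1.0))  # a one-second stall decays the estimate, not zeroes it
```

With regular intervals and a weight of one half, `SmoothedSpeed` reduces to the same recurrence as `half_average`.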
Re: Current download speed in progress bar
For what it's worth, I have sitting here a publication that provides the digital filter weights for various low-pass filters. Pick the number of points and the spectrum you want and there's a close fit.

Rob Lake
[EMAIL PROTECTED]

> On 2002-04-10 01:14 +0200, Hrvoje Niksic wrote:
> > Andre Majorel <[EMAIL PROTECTED]> writes:
> >
> > > I find it very annoying when a downloader plays yoyo with the
> > > remaining time. IMHO, remaining time is by nature a long term thing
> > > and short term jitter should not cause it to go up and down.
> >
> > Agreed wholeheartedly, but how would you *implement* a non-jittering
> > ETA?
>
> I'm not sure you can, but using the average speed will at least
> low pass filter out most of the jittering.
>
> > Do you think it makes sense the way 1.8.1 does it, i.e. to
> > calculate the ETA from the average speed?
>
> Yes.
>
> --
> André Majorel http://www.teaser.fr/~amajorel/
> std::disclaimer ("Not speaking for my employer");
Re: Current download speed in progress bar
On 2002-04-10 01:14 +0200, Hrvoje Niksic wrote:
> Andre Majorel <[EMAIL PROTECTED]> writes:
>
> > I find it very annoying when a downloader plays yoyo with the
> > remaining time. IMHO, remaining time is by nature a long term thing
> > and short term jitter should not cause it to go up and down.
>
> Agreed wholeheartedly, but how would you *implement* a non-jittering
> ETA?

I'm not sure you can, but using the average speed will at least low pass filter out most of the jittering.

> Do you think it makes sense the way 1.8.1 does it, i.e. to
> calculate the ETA from the average speed?

Yes.

--
André Majorel http://www.teaser.fr/~amajorel/
std::disclaimer ("Not speaking for my employer");
Re: New suggestion.
On Monday 08 April 2002 19:18, you wrote:
> Ivan Buttinoni <[EMAIL PROTECTED]> writes:
> > Again I send a suggestion, this time quite easy. I hope it's not
> > already implemented, else I'm sorry in advance. It would be nice if
> > wget could use regexps to decide what to accept/refuse to download.
> > The regexp has to work on the whole URL and/or filename and/or hostname
> > and/or CGI argument. Sometimes I find the Apache directory sorting
> > links, which are useless, e.g.:
> > .../?N=A
> > .../?M=D
> >
> > Here follows a hypothetical command for the above example:
> > wget -r -l0 --reg-exclude '[A-Z]=[AD]$' http://
>
> The problem with regexps is that their use would make Wget dependent
> on a regexp library. To make matters worse, regexp libraries come in
> all shapes and sizes, with incompatible APIs and implementing
> incompatible dialects of regexps.
>
> I'm staying away from regexps as long as I possibly can.

OK, there are many regexp implementations and, as a consequence, many dialects, but don't forget _GNU regexp_ (http://www.gnu.org/directory/rx.html)! And how difficult would it be to enable regexps at compile time (e.g. ./configure --with-gnuregexp)?

Ciao
Ivan

--
BWARE TECHNOLOGIES - http://www.bware.it/
Via S.Gregorio, 3, Milano 20124 Italy - Phone: +39 02 2779181 Fax: +39 02 27791828 GSM: +39 335 1280432
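What the proposed exclusion would do to a list of candidate links can be sketched with Python's `re` module. Note that `--reg-exclude` is the poster's suggested option, not an existing Wget switch, and that Python's regexp dialect is just one of the many dialects mentioned in the reply; `filter_urls` and the example URLs are hypothetical.

```python
import re


def filter_urls(urls, exclude_pattern):
    """Drop every URL in which the pattern matches anywhere."""
    rx = re.compile(exclude_pattern)
    return [u for u in urls if not rx.search(u)]


candidates = [
    "http://example.com/pub/",
    "http://example.com/pub/?N=A",   # Apache "sort by name" link
    "http://example.com/pub/?M=D",   # Apache "sort by date" link
    "http://example.com/pub/file.tar.gz",
]

# Same pattern as in the suggested command line.
for url in filter_urls(candidates, r"[A-Z]=[AD]$"):
    print(url)
```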
Re: Current download speed in progress bar
"Tony Lewis" <[EMAIL PROTECTED]> writes: > Hrvoje Niksic wrote: > >> > The meter is updated maximum once per second, I don't think it makes >> > sense to update the screen faster than that. >> >> Maybe not, but I sort of like it. Wget's progress bar refreshes the >> screen (not more than) five times per second, and I like the idea of >> refreshing the download speed along with the amount. However, I've >> added the code to limit the ETA change to once per second. > > As long as it's going to be configurable, why not make it available > on the command line and in .wgetrc? Exactly what would you like to be able to configure? The refresh frequency of the progress bar? Or of the ETA display? I don't see the need for that. The granularity of ETA is 1 second, so refreshes shorter than that don't make sense. Likewise, max. five refreshes per second for the whole bar is about as good a value as any. I've never heard anyone complain. IMHO there *is* such a thing as too much configurability. Wget already has a huge number of options, and I'd like it to just do the right thing whenever possible. I think the latest incarnation of the code comes close to that. I'm open to criticisms, though. But before you criticize, at least try it out.
Re: Current download speed in progress bar
Hrvoje Niksic wrote:

> > The meter is updated maximum once per second, I don't think it makes
> > sense to update the screen faster than that.
>
> Maybe not, but I sort of like it. Wget's progress bar refreshes the
> screen (not more than) five times per second, and I like the idea of
> refreshing the download speed along with the amount. However, I've
> added the code to limit the ETA change to once per second.

As long as it's going to be configurable, why not make it available on the command line and in .wgetrc?

(I don't have time to write my own enhancements, but I can help you design yours!)

Tony
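The two refresh limits mentioned in this exchange (the bar at most five times per second, the ETA at most once per second) amount to two independent throttles. Below is a minimal sketch with a hypothetical `Throttle` class; times are in integer milliseconds to keep the arithmetic exact, and nothing here is taken from Wget's source.

```python
class Throttle:
    """Allow an action at most once per `min_interval_ms` milliseconds."""

    def __init__(self, min_interval_ms: int):
        self.min_interval_ms = min_interval_ms
        self.last = None  # time of the last permitted action, in ms

    def ready(self, now_ms: int) -> bool:
        """Return True, and record the time, if enough time has passed."""
        if self.last is None or now_ms - self.last >= self.min_interval_ms:
            self.last = now_ms
            return True
        return False


bar = Throttle(200)    # progress bar: up to five redraws per second
eta = Throttle(1000)   # ETA text: up to one change per second

for now in range(0, 1001, 100):  # data arrives every 100 ms
    print(now, bar.ready(now), eta.ready(now))
```

Keeping the limits as plain constants rather than options matches the preference for fewer knobs expressed elsewhere in this thread.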
RE: windows binary
Sorry. Correct link is http://space.tin.it/computer/hherold/

Heiko Herold

--
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY

> -----Original Message-----
> From: Herold Heiko [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 10, 2002 2:46 PM
> To: List Wget (E-mail)
> Subject: windows binary
>
>
> for 1.8.1+cvs 2001 04 10
> http://space.tin.it/compute/hherold
>
> Heiko Herold
>
> --
> -- PREVINET S.p.A.            [EMAIL PROTECTED]
> -- Via Ferretto, 1            ph  x39-041-5907073
> -- I-31021 Mogliano V.to (TV) fax x39-041-5907472
> -- ITALY
windows binary
for 1.8.1+cvs 2001 04 10
http://space.tin.it/compute/hherold

Heiko Herold

--
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY
Windows wgetrc files (was Re: LAN with Proxy, no Router)
Hello Jens,

On 10 Apr 2002 at 12:31, Jens Rösner wrote:

> > > wgetrc works fine under windows (always has)
> > > however, .wgetrc is not possible, but
> > > maybe . does mean "in root dir" under Unix?
> >
> > The code does different stuff for Windows. Instead of looking for
> > '.wgetrc' in the user's home directory, it looks for a file called
> > 'wget.ini' in the directory that contains the executable. This does
> > not seem to be mentioned anywhere in the documentation.
>
> From my own experience, you are right concerning the location wget searches
> for wgetrc on Windows.
> However, a file called "wgetrc" is sufficient.

There are two wgetrc files it may use, a "system" wgetrc and a "user" wgetrc. For Windows, this distinction is somewhat obscured, as it does not use a concept of user home directory (it could be made to do so on some versions of Windows, but it doesn't at the moment).

The "system" wgetrc file is only used if support for this has been conditionally compiled into the program (a preprocessor macro called SYSTEM_WGETRC should be defined if support is required and the value of the macro is the name of the actual wgetrc file to look for). For Windows, the SYSTEM_WGETRC macro is set to "wgetrc", which means that Wget will look for a system wgetrc file called wgetrc in the current working directory when it starts.

You mentioned I was right about the location of the file, but this should not be the case for the file called "wgetrc" - it should be looking for this in the current directory, not the directory containing the executable (unless these are the same!). I don't think it makes sense for a "system" wgetrc file to be in the current directory, but that is what it does!

Also note that Wget supports Microsoft, Borland and Watcom C compilers, all with different makefiles, but support for the "system" wgetrc file is only currently included in the makefile for the Microsoft compiler.

> In fact, wgetrc.ini will not be found and thus
> its options ignored.

The file is called wget.ini (not wgetrc.ini) and this is what Wget uses as the "user" wgetrc for the Windows versions, even though it looks for it in the directory containing the wget.exe program, so is not really user specific (however, the user can override it by setting the WGETRC environment variable to point to a different wgetrc file).

Looking at the comments in the source code, it looks as if this "wget.ini" file in the directory containing wget.exe is supposed to be the only file that Wget reads for Windows. The comment in the source code (src/init.c) says "SYSTEM_WGETRC should not be defined under WINDOWS", even though it is defined in current builds, at least when building with the Microsoft compiler.

To sum up what the current Windows version does:

* If built with the Microsoft compiler, looks for a file named "wgetrc" in the current directory and reads the file if it is present.

* If "WGETRC" environment variable is set, reads the file named by the environment variable (errors out if not there); otherwise looks for a "wget.ini" file in the directory containing the wget.exe program being run and reads the file if it is present.

The above information was gleaned from examination of the source, rather than from actually running the program! Does yours do something different?
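The lookup order summarized above can be restated as a small function. This is a paraphrase of the message in Python, not a translation of src/init.c; the function name, its parameters, and the example paths are hypothetical, and `exists` is passed in so the logic can be tried without touching a real file system.

```python
import ntpath


def windows_rc_files(cwd, exe_dir, environ, exists, system_wgetrc="wgetrc"):
    """Return the startup files read on Windows, in order, per the summary above.

    `system_wgetrc` models the SYSTEM_WGETRC macro; pass None for builds
    (e.g. non-Microsoft makefiles) where it is not defined.
    """
    files = []
    if system_wgetrc:
        candidate = ntpath.join(cwd, system_wgetrc)
        if exists(candidate):
            files.append(candidate)
    user_rc = environ.get("WGETRC")
    if user_rc:
        if not exists(user_rc):
            raise FileNotFoundError(user_rc)  # "errors out if not there"
        files.append(user_rc)
    else:
        candidate = ntpath.join(exe_dir, "wget.ini")
        if exists(candidate):
            files.append(candidate)
    return files


present = {r"C:\work\wgetrc", r"C:\bin\wget.ini", r"D:\my.rc"}
print(windows_rc_files(r"C:\work", r"C:\bin", {}, present.__contains__))
print(windows_rc_files(r"C:\work", r"C:\bin", {"WGETRC": r"D:\my.rc"},
                       present.__contains__))
```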
Re: LAN with Proxy, no Router
Hi Ian!

> > wgetrc works fine under windows (always has)
> > however, .wgetrc is not possible, but
> > maybe . does mean "in root dir" under Unix?
>
> The code does different stuff for Windows. Instead of looking for
> '.wgetrc' in the user's home directory, it looks for a file called
> 'wget.ini' in the directory that contains the executable. This does
> not seem to be mentioned anywhere in the documentation.

From my own experience, you are right concerning the location wget searches for wgetrc on Windows.
However, a file called "wgetrc" is sufficient.
In fact, wgetrc.ini will not be found and thus its options ignored.

CU
Jens

--
GMX - Die Kommunikationsplattform im Internet. http://www.gmx.net
Re: LAN with Proxy, no Router
On 10 Apr 2002 at 3:09, Jens Rösner wrote:

> wgetrc works fine under windows (always has)
> however, .wgetrc is not possible, but
> maybe . does mean "in root dir" under Unix?

The code does different stuff for Windows. Instead of looking for '.wgetrc' in the user's home directory, it looks for a file called 'wget.ini' in the directory that contains the executable. This does not seem to be mentioned anywhere in the documentation.
Re: Current download speed in progress bar
Daniel Stenberg <[EMAIL PROTECTED]> writes:

> The meter is updated maximum once per second, I don't think it makes
> sense to update the screen faster than that.

Maybe not, but I sort of like it. Wget's progress bar refreshes the screen (not more than) five times per second, and I like the idea of refreshing the download speed along with the amount. However, I've added the code to limit the ETA change to once per second.

I've come up with a scheme similar to the one you are describing, except I now use smaller subintervals. In other words, at compile-time you can independently choose how far back in the past you're going, and into how many chunks that's divided. I've defaulted it to 3 seconds and 30 intervals, respectively.

> This basically explains what curl does, not saying it is any
> particularly scientific way or anything, I've just found this info
> interesting.

Thanks for the info; I appreciate it.
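A rough sketch of the windowed scheme described above: the recent past is split into a fixed number of chunks and the speed is computed over whatever the ring of chunks currently holds. Only the defaults (3 seconds, 30 intervals) come from the message; the class and method names and the bookkeeping details are assumptions and may differ from the code in CVS.

```python
from collections import deque


class WindowedSpeed:
    """Download speed over a sliding window made of fixed-length chunks."""

    def __init__(self, window: float = 3.0, chunks: int = 30):
        self.chunk_len = window / chunks
        self.history = deque(maxlen=chunks)  # (bytes, seconds) per closed chunk
        self.cur_bytes = 0
        self.cur_time = 0.0

    def update(self, nbytes: int, dt: float) -> None:
        """Account for `nbytes` received during the last `dt` seconds."""
        self.cur_bytes += nbytes
        self.cur_time += dt
        if self.cur_time >= self.chunk_len:
            # Close the chunk; the deque silently drops the oldest one.
            self.history.append((self.cur_bytes, self.cur_time))
            self.cur_bytes, self.cur_time = 0, 0.0

    def speed(self) -> float:
        """Bytes per second over the window, 0.0 before any data arrives."""
        total_bytes = sum(b for b, _ in self.history) + self.cur_bytes
        total_time = sum(t for _, t in self.history) + self.cur_time
        return total_bytes / total_time if total_time > 0 else 0.0


meter = WindowedSpeed(window=3.0, chunks=3)
for _ in range(3):
    meter.update(100, 1.0)
print(meter.speed())
for _ in range(3):
    meter.update(400, 1.0)
print(meter.speed())  # the slow early chunks have left the window
```

More chunks make the window slide more smoothly, while a longer window damps jitter at the cost of reacting more slowly to real changes in speed.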
RE: LAN with Proxy, no Router
.something in unix is a normal file, it just won't be displayed with a normal ls (similar to dir) without additional options (-a/-A). On most unixes, at least.
The directory separator on unix is /, while \ is an escape character, meaning "treat the next character as special/normal" (opposite to the default way).

Heiko

--
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY

> -----Original Message-----
> From: Jens Rösner [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 10, 2002 3:09 AM
> To: wget
> Subject: Re: LAN with Proxy, no Router
>
>
> Hi!
>
> Someone please slap me with a gigantic sledgehammer?!
> *whump*
> Thanks!
> Oh man, how could I not see it?
> I mean, I used the "index" search function in the wget.hlp file.
> I should have searched the whole text.
> Even with index search "proxies" is just one line above "proxy".
> Oh well.
>
> Here is the report:
> Result:
> Works fine under windows with firewall and proxy over LAN
> into the www.
>
> How:
> Just put
> http_proxy = http://proxy.server.com:1234/
> into the wgetrc file.
>
> Addition:
> wgetrc works fine under windows (always has)
> however, .wgetrc is not possible, but
> maybe . does mean "in root dir" under Unix?
>
> Thanks anyway, I think I'll go to bed now, oh boy...
>
> CU
> Jens
>
>
> Hrvoje Niksic wrote:
> >
> > Jens Rösner <[EMAIL PROTECTED]> writes:
> >
> > > Could someone please tell me, what
> > > "the appropriate environmental variable" is
> > > and how do I change it in Windows
> > > or what else I need to do?
> >
> > The variables are listed in the manual under Various->Proxies. Here
> > is the relevant part:
> >
> > `http_proxy'
> >      This variable should contain the URL of the proxy for HTTP
> >      connections.
> >
> > `ftp_proxy'
> >      This variable should contain the URL of the proxy for FTP
> >      connections. It is quite common that HTTP_PROXY and FTP_PROXY are
> >      set to the same URL.
> >
> > `no_proxy'
> >      This variable should contain a comma-separated list of domain
> >      extensions proxy should _not_ be used for. For instance, if the
> >      value of `no_proxy' is `.mit.edu', proxy will not be used to
> >      retrieve documents from MIT.
> >
> > I'm no Windows expert, so someone else will need to explain how to set
> > them up.
> >
> > Another way is to tell Wget where the proxies are in its own config
> > file, `.wgetrc'. I'm not entirely sure how that works under Windows,
> > but you should be able to create a `.wgetrc' file in your home
> > directory and insert something like this:
> >
> > use_proxy = on
> > http_proxy = http://proxy.server.com:1234/
> > ftp_proxy = http://proxy.server.com:1234/
> > proxy_user = USER
> > proxy_passwd = PASSWD