Re: [Feature Request] Add a short option for --content-disposition

2023-11-04 Thread Tim Rühsen
On 10/29/23 21:08, No-Reply-Wolfietech via Primary discussion list for 
GNU Wget wrote:

Nowadays it seems increasingly common to find a file that is not being hosted 
where it's actually stored, presumably for access control, and it seems to make 
no sense to have to type --content-disposition when a single-letter flag is all 
that is needed.


Well, we can't simply change the default behavior. That would break lots 
of workflows.


And enabling it is also a matter of trusting the server, which is not 
always the case.


Maybe you can enable this "by default" in your environments by adding the 
flag to ~/.wgetrc or /etc/wgetrc. Or specify a different config file in 
$WGETRC.
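For example, a single line in ~/.wgetrc (or /etc/wgetrc) should do it -- a 
minimal sketch, assuming the standard wgetrc syntax:

  # honor server-supplied Content-Disposition filenames by default
  content_disposition = on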


Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


[Feature Request] Add a short option for --content-disposition

2023-10-29 Thread No-Reply-Wolfietech via Primary discussion list for GNU Wget
Nowadays it seems increasingly common to find a file that is not being hosted 
where it's actually stored, presumably for access control, and it seems to make 
no sense to have to type --content-disposition when a single-letter flag is all 
that is needed.

Sent with [Proton Mail](https://proton.me/) secure email.

How can you only show User-Agent GET request without rest of the Debug output?

2023-06-24 Thread Sultanbek Umkhayev
Is there a way to show only the User-Agent: header of the GET request when 
using debug (-d)? I would like to see just the ---request begin--- GET output 
showing which user agent is being used, not all the other debug info.


The reason is that if I set --user-agent="Agent 007", I would like to know it 
is working and actually being sent.
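For what it's worth, a minimal sketch of one way to check this today (untested; 
www.example.com is just a stand-in, and -d writes its trace to stderr so it can 
be filtered with grep):

  wget -d --user-agent="Agent 007" -O /dev/null https://www.example.com/ 2>&1 | grep '^User-Agent:'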

--
*Fingerprint:* 5599 5AEA 9022 E287 C77E 71C8 F0FF CCE6 0789 F592



[bug #44674] Add an option that will send the HTTP request to stderr or a file

2023-03-13 Thread kaleemseo
Follow-up Comment #11, bug #44674 (project wget):


[bug #44674] Add an option that will send the HTTP request to stderr or a file

2023-03-13 Thread kaleemseo
Follow-up Comment #10, bug #44674 (project wget):


[bug #44674] Add an option that will send the HTTP request to stderr or a file

2023-03-13 Thread kaleemseo
Follow-up Comment #9, bug #44674 (project wget):


[comment #3 comment #3:]
> Just open a second console and start
>   nc -l -p 
> 
> Start wget in your first console
>   http_proxy=localhost: wget http://www.example.com
> 
> nc will now dump everything that Wget sends. You could even generate an
answer (e.g. with copy & paste).
> 
> Wget just adds a Proxy-Connection header which will not be sent on non-proxy
connections.
> 
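Concretely, the quoted recipe might look like this; the port numbers were lost 
in the archive, so 8080 below is just a hypothetical choice (untested sketch):

  # console 1: listen on port 8080 and dump whatever arrives
  nc -l -p 8080

  # console 2: point wget at that listener as if it were an HTTP proxy
  http_proxy=localhost:8080 wget http://www.example.com/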

(file #54477)

___

Additional Item Attachment:

File name: Main.txt   Size:13 KB




___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #44674] Add an option that will send the HTTP request to stderr or a file

2023-03-02 Thread kaleemseo
Follow-up Comment #8, bug #44674 (project wget):


[comment #3 comment #3:]
> Just open a second console and start
>   nc -l -p 
> 
> Start wget in your first console
>   http_proxy=localhost: wget http://www.example.com
> 
> nc will now dump everything that Wget sends. You could even generate an
answer (e.g. with copy & paste).
> 
> Wget just adds a Proxy-Connection header which will not be sent on non-proxy
connections.
> 

libressl TLS 1.3 feature request

2022-03-21 Thread Nam Nguyen
Can TLS 1.3 be enabled for libressl? I tested the latest wget 1.21.3 on
OpenBSD and wanted to test "Add option to select TLS 1.3 on the command
line."

According to this, TLS 1.3 support should be ready. related:
https://github.com/libressl-portable/portable/issues/228#issuecomment-712913693

$ openssl version
LibreSSL 3.5.2

$ wget --secure-protocol=TLSv1_3 
https://cdn.openbsd.org/pub/OpenBSD/7.0/amd64/install70.img
--2022-03-21 19:54:02--  
https://cdn.openbsd.org/pub/OpenBSD/7.0/amd64/install70.img
Your OpenSSL version is too old to support TLS 1.3
Disabling SSL due to encountered errors.

$ wget --version
GNU Wget 1.21.3 built on openbsd7.1.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls 
+ntlm +opie +psl +ssl/openssl 

Wgetrc: 
/etc/wgetrc (system)
Locale: 
/usr/local/share/locale 
Compile: 
cc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc" 
-DLOCALEDIR="/usr/local/share/locale" -I. -I../lib -I../lib 
-I/usr/local/include -I/usr/local/include -I/usr/local/include 
-DHAVE_LIBSSL -I/usr/local/include -DNDEBUG -O2 -pipe 
Link: 
cc -I/usr/local/include -I/usr/local/include -DHAVE_LIBSSL 
-I/usr/local/include -DNDEBUG -O2 -pipe -L/usr/local/lib 
-L/usr/local/lib -lpcre2-8 -L/usr/local/lib -lidn2 -lssl -lcrypto 
-lz -L/usr/local/lib -lpsl ../lib/libgnu.a 
/usr/local/lib/libiconv.so.7.0 /usr/local/lib/libintl.so.7.0 
/usr/local/lib/libiconv.so.7.0 -Wl,-rpath,/usr/local/lib 
/usr/local/lib/libunistring.so.0.1 /usr/local/lib/libiconv.so.7.0 
-Wl,-rpath,/usr/local/lib



request

2021-06-26 Thread Hari Bahadur K.C
Sir/ Madam Namaskar
I wanted to download multiple files using wget. But there was bug report.
Please help me to fix this bug.
Problem was like this " Email bug reports, questions, discussions to <
bug-wget@gnu.org>
and/or open issues at https://savannah.gnu.org/bugs/?func=additem=wget
."


Request

2021-06-26 Thread Hari K . C .
I have problems downloading files through wget. Please help me.


Sent from Mail for Windows 10



Re: How to send a POST request by wget same to a httpie request?

2020-07-21 Thread Ander Juaristi

You can send POST requests with --method=POST.

Multipart bodies are not, and will probably never be, supported in wget.

There is ongoing work to implement them in wget2, however [0]. You 
might want to check it out.


Having said this, in wget you can send the contents of a file as a body 
with the --body-file option. This is not the same as a multipart body, 
but might fit your use case. What --body-file does is send the file's 
contents directly in the body of the HTTP request. There is also 
--post-file, which is basically --method=POST + --body-file.
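A rough sketch of what that looks like for the original example (localhost:9000 
and 1.txt are taken from the question; untested, and note the server receives a 
raw body rather than a multipart form, so it is not byte-for-byte what the 
httpie command sends):

  # send the raw contents of 1.txt as the POST body
  wget --method=POST --body-file=1.txt --header='Content-Type: text/plain' -O - http://localhost:9000/

  # roughly the same thing with the older shorthand
  wget --post-file=1.txt -O - http://localhost:9000/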


Cheers,
- AJ

[0] https://gitlab.com/gnuwget/wget2


El 2020-07-02 00:32, Peng Yu escribió:

$ http --form POST localhost:9000 file@1.txt

The above httpie (https://httpie.org/) command will send the following
POST request. Could anybody let me know what is the equivalent wget
command to achieve the same HTTP request? Thanks.


POST / HTTP/1.1
Host: localhost:9000
User-Agent: HTTPie/2.2.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 171
Content-Type: multipart/form-data; 
boundary=36922889709f11dcba960da4b9d51a2e


--36922889709f11dcba960da4b9d51a2e
Content-Disposition: form-data; name="file"; filename="1.txt"
Content-Type: text/plain

abc

--36922889709f11dcba960da4b9d51a2e--




How to send a POST request by wget same to a httpie request?

2020-07-01 Thread Peng Yu
$ http --form POST localhost:9000 file@1.txt

The above httpie (https://httpie.org/) command will send the following
POST request. Could anybody let me know what is the equivalent wget
command to achieve the same HTTP request? Thanks.


POST / HTTP/1.1
Host: localhost:9000
User-Agent: HTTPie/2.2.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 171
Content-Type: multipart/form-data; boundary=36922889709f11dcba960da4b9d51a2e

--36922889709f11dcba960da4b9d51a2e
Content-Disposition: form-data; name="file"; filename="1.txt"
Content-Type: text/plain

abc

--36922889709f11dcba960da4b9d51a2e--


-- 
Regards,
Peng



[bug #58525] HTTP request sent, awaiting response... Read error (Bad file descriptor) in headers.

2020-06-07 Thread Ulrich Windl
Follow-up Comment #1, bug #58525 (project wget):

Running in --debug mode I got some more details:

...
Initiating SSL handshake.
seconds 900,00, Winsock error: 0
Handshake successful; connected socket 3 to SSL handle 0x03106e00
certificate:
  subject: CN=orc.amd.com,O=Advanced Micro Devices,L=Austin,ST=Texas,C=US
  issuer:  CN=GeoTrust RSA CA 2018,OU=www.digicert.com,O=DigiCert Inc,C=US
X509 certificate successfully verified and matches host drivers.amd.com
...
HTTP request sent, awaiting response... seconds 900,00, Winsock error: 0

---response begin---
HTTP/1.1 302 Moved Temporarily
Server: AkamaiGHost
Content-Length: 0
Location: https://www.amd.com/de/support/kb/faq/download-incomplete
Date: Sun, 07 Jun 2020 20:51:20 GMT
Connection: keep-alive

---response end---
...
Converted file name
'Win10-Radeon-Software-Adrenalin-2020-Edition-20.5.1-May27.exe' (UTF-8) ->
'Win10-Radeon-Software-Adrenalin-2020-Edition-20.5.1-May27.exe' (CP1252)
--2020-06-07 22:50:34-- 
https://www.amd.com/de/support/kb/faq/download-incomplete
Resolving www.amd.com (www.amd.com)... seconds 0,00, 92.123.252.55
Caching www.amd.com => 92.123.252.55
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... seconds 0,00,
connected.
Created socket 4.
Releasing 0x0311ae90 (new refcount 1).
Initiating SSL handshake.
seconds 900,00, Winsock error: 0
Handshake successful; connected socket 4 to SSL handle 0x03124870
certificate:
  subject: CN=amd.com,OU=MARKETING,O=Advanced Micro
Devices,L=Austin,ST=Texas,C=US
  issuer:  CN=GeoTrust RSA CA 2018,OU=www.digicert.com,O=DigiCert Inc,C=US
X509 certificate successfully verified and matches host www.amd.com

---request begin---
GET /de/support/kb/faq/download-incomplete HTTP/1.1
User-Agent: Wget/1.19.4 (mingw32)
Accept: */*
Accept-Encoding: identity
Host: www.amd.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... Read error (error:0B07C065:x509
certificate routines:X509_STORE_add_cert:cert already in hash table;
error:0B07C065:x509 certificate routines:X509_STORE_add_cert:cert already in
hash table) in headers.
Closed 4/SSL 0x03124870
Retrying.

--2020-06-07 22:50:54--  (try: 2) 
https://www.amd.com/de/support/kb/faq/download-incomplete
Found www.amd.com in host_name_addresses_map (0311ae90)
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... seconds 0,00,
connected.
Created socket 4.
Releasing 0x0311ae90 (new refcount 1).
Initiating SSL handshake.
seconds 900,00, Winsock error: 0
Handshake successful; connected socket 4 to SSL handle 0x03123410
certificate:
  subject: CN=amd.com,OU=MARKETING,O=Advanced Micro
Devices,L=Austin,ST=Texas,C=US
  issuer:  CN=GeoTrust RSA CA 2018,OU=www.digicert.com,O=DigiCert Inc,C=US
X509 certificate successfully verified and matches host www.amd.com

---request begin---
GET /de/support/kb/faq/download-incomplete HTTP/1.1
User-Agent: Wget/1.19.4 (mingw32)
Accept: */*
Accept-Encoding: identity
Host: www.amd.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... ^C


___

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58525>

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #58525] HTTP request sent, awaiting response... Read error (Bad file descriptor) in headers.

2020-06-07 Thread Ulrich Windl
URL:
  <https://savannah.gnu.org/bugs/?58525>

 Summary: HTTP request sent, awaiting response... Read error
(Bad file descriptor) in headers.
 Project: GNU Wget
Submitted by: gnats
Submitted on: Sun 07 Jun 2020 08:49:53 PM UTC
Category: Protocol Issue
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Release: 1.20
 Discussion Lock: Any
Operating System: Microsoft Windows
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

I have this reproducible issue with downloads from AMD (Radeon driver) in
Windows 10 (64-bit) (binary from https://eternallybored.org/misc/wget/):

wget-1.20.3-win32>wget -N
https://drivers.amd.com/drivers/beta/Win10-Radeon-Software-Adrenalin-2020-Edition-20.5.1-May27.exe
--2020-06-07 22:33:11-- 
https://drivers.amd.com/drivers/beta/Win10-Radeon-Software-Adrenalin-2020-Edition-20.5.1-May27.exe
Resolving drivers.amd.com (drivers.amd.com)... 2.22.69.4
Connecting to drivers.amd.com (drivers.amd.com)|2.22.69.4|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://www.amd.com/de/support/kb/faq/download-incomplete
[following]
--2020-06-07 22:33:11-- 
https://www.amd.com/de/support/kb/faq/download-incomplete
Resolving www.amd.com (www.amd.com)... 92.123.252.55
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... connected.
HTTP request sent, awaiting response... Read error (error:0B07C065:x509
certificate routines:X509_STORE_add_cert:cert already in hash table;
error:0B07C065:x509 certificate routines:X509_STORE_add_cert:cert already in
hash table) in headers.
Retrying.

--2020-06-07 22:33:31--  (try: 2) 
https://www.amd.com/de/support/kb/faq/download-incomplete
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... connected.
HTTP request sent, awaiting response... Read error (Bad file descriptor) in
headers.
Retrying.

--2020-06-07 22:33:53--  (try: 3) 
https://www.amd.com/de/support/kb/faq/download-incomplete
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... connected.
HTTP request sent, awaiting response... Read error (Bad file descriptor) in
headers.
Retrying.

--2020-06-07 22:34:15--  (try: 4) 
https://www.amd.com/de/support/kb/faq/download-incomplete
Connecting to www.amd.com (www.amd.com)|92.123.252.55|:443... connected.
HTTP request sent, awaiting response... Read error (Bad file descriptor) in
headers.
Retrying.

^C





___

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58525>

___
  Message sent via Savannah
  https://savannah.gnu.org/




[Bug-wget] [bug #56648] Add HTTP request header to file

2019-07-19 Thread Tim Ruehsen
Follow-up Comment #3, bug #56648 (project wget):

Good that you can work on...

On Debian GNU/Linux:
apt-get install xattr
wget --xattr www.example.com
xattr -p user.xdg.origin.url index.html >out.txt
cat out.txt


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[Bug-wget] [bug #56648] Add HTTP request header to file

2019-07-19 Thread anonymous
Follow-up Comment #2, bug #56648 (project wget):

Thanks a lot for the fast reply. 

xattr does save the URL to a file attribute, but I couldn't find a way to get
the attributes when using cat to merge all the files into one. I didn't even
find a way to copy the attribute and append it to the file.
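(One possible workaround, sketched with the xattr tool from Tim's example and a
hypothetical merged.txt output file; untested: prepend each file's saved URL
while concatenating.)

  for f in ./*.html; do
    printf '== %s ==\n' "$(xattr -p user.xdg.origin.url "$f")" >> merged.txt
    cat "$f" >> merged.txt
  done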

WARC - seems to be helpful.
The manual doesn't explain what it is:
https://www.gnu.org/software/wget/manual/wget.html
Maybe add some explanation based on this:
https://www.archiveteam.org/index.php?title=Wget_with_WARC_output

-o also works, but I saved to the same file as -O, so I get all the data in one
file and don't need to look for the log file and try to merge data.

Thanks again.

P.S. I still think it might be easier to save the original URL through
‘--save-headers’ 

___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[Bug-wget] [bug #56648] Add HTTP request header to file

2019-07-19 Thread Tim Ruehsen
Follow-up Comment #1, bug #56648 (project wget):

If you copy all files and headers into one file, did you play with the WARC
options?

With --xattr the original URL is saved as extended file attribute, if your
file system supports it.

You could use -d -olog and let a script extract URL and filename from it.
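For the WARC route, a minimal sketch (untested; www.example.com is a stand-in)
would be:

  # writes example.warc.gz next to the normal download, keeping request and response headers
  wget --mirror --warc-file=example https://www.example.com/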

___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[Bug-wget] [bug #56648] Add HTTP request header to file

2019-07-19 Thread anonymous
URL:
  <https://savannah.gnu.org/bugs/?56648>

 Summary: Add HTTP request header to file
 Project: GNU Wget
Submitted by: None
Submitted on: Fri 19 Jul 2019 01:26:19 PM UTC
Category: Feature Request
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: Asaf
Originator Email: 3023023...@gmail.com
 Open/Closed: Open
 Discussion Lock: Any
 Release: None
Operating System: GNU/Linux
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

I use wget to download a website to one folder, copy all the files into one
file, and process the full text of the website.
I need to know from which link each part of the text was taken. I thought
‘--save-headers’ would help me, but it only contains the HTTP response
headers, and I guess the link appears in the HTTP request.

I am planning to move to httrack due to this issue. There I get the original
link in a note on each file by default.

Thanks. 




___

Reply to this item at:

  <https://savannah.gnu.org/bugs/?56648>

___
  Message sent via Savannah
  https://savannah.gnu.org/




Re: [Bug-wget] [Secunia Research] GNU wget Vulnerability Report - Request for Details

2019-04-04 Thread Tim Rühsen
On 4/4/19 4:42 PM, Josef Moellers wrote:
> On 04.04.19 09:27, Tim Rühsen wrote:
>> On 4/4/19 3:14 AM, Secunia Research wrote:
>>> Hello,
>>>
>>> We are currently processing a report published by a third-party [1] for GNU
>>> wget and are currently evaluating it to publish a Secunia Advisory for this.
>>> Please see the original report for details.
>>>
>>> We would appreciate to receive your comments on those issues before we
>>> publish our advisory based on this information.
>>>
>>> * Can you confirm the vulnerability?
>>
>> Yes
> 
> Can you please elaborate what EXACTLY the vulnerability is? I have
> searched through the (quite hefty) diff between 1.20.1 and 1.20.2 and
> have found only 4 differences that may be viewed as these, but the
> changes in
> src/ftp-ls.c and
> src/http.c
> do not fix a vulnerability.
> The CVE-entry is not quite helpful, to say the least ;-)

Well, I could tell you details since I have a PoC and I made the fix.
But maybe there is a reason why the JVN people don't include the PoC
within their report. I am asking them...

Regards, Tim



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] [Secunia Research] GNU wget Vulnerability Report - Request for Details

2019-04-04 Thread Josef Moellers
On 04.04.19 09:27, Tim Rühsen wrote:
> On 4/4/19 3:14 AM, Secunia Research wrote:
>> Hello,
>>
>> We are currently processing a report published by a third-party [1] for GNU
>> wget and are currently evaluating it to publish a Secunia Advisory for this.
>> Please see the original report for details.
>>
>> We would appreciate to receive your comments on those issues before we
>> publish our advisory based on this information.
>>
>> * Can you confirm the vulnerability?
> 
> Yes

Can you please elaborate what EXACTLY the vulnerability is? I have
searched through the (quite hefty) diff between 1.20.1 and 1.20.2 and
have found only 4 differences that may be viewed as these, but the
changes in
src/ftp-ls.c and
src/http.c
do not fix a vulnerability.
The CVE-entry is not quite helpful, to say the least ;-)

Thanks,

Josef
-- 
SUSE Linux GmbH
Maxfeldstrasse 5
90409 Nuernberg
Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] [Secunia Research] GNU wget Vulnerability Report - Request for Details

2019-04-04 Thread Tim Rühsen
On 4/4/19 3:14 AM, Secunia Research wrote:
> Hello,
> 
> We are currently processing a report published by a third-party [1] for GNU
> wget and are currently evaluating it to publish a Secunia Advisory for this.
> Please see the original report for details.
> 
> We would appreciate to receive your comments on those issues before we
> publish our advisory based on this information.
> 
> * Can you confirm the vulnerability?

Yes

> * Which products and versions are affected by the vulnerability?

GNU Wget < 1.20.2

> * When do you expect to release fixed versions?

1.20.2 has been released on 1st April 2019

> * Are there any mitigating factors or recommended workarounds?

Mitigate by updating to GNU Wget 1.20.2.

If updating is not possible, as far as I can say:
Use only trusted IRIs as input, do not *recursively* download from
untrusted servers.

Regards, Tim



signature.asc
Description: OpenPGP digital signature


[Bug-wget] [Secunia Research] GNU wget Vulnerability Report - Request for Details

2019-04-03 Thread Secunia Research
Hello,

 

We are currently processing a report published by a third-party [1] for GNU
wget and are currently evaluating it to publish a Secunia Advisory for this.
Please see the original report for details.

 

We would appreciate to receive your comments on those issues before we
publish our advisory based on this information.

 

* Can you confirm the vulnerability?

* Which products and versions are affected by the vulnerability?

* When do you expect to release fixed versions?

* Are there any mitigating factors or recommended workarounds?


References:
[1] http://jvn.jp/en/jp/JVN25261088/index.html

 

---

Kind Regards,

Laurent Delosieres
Security Specialist

Secunia Research at Flexera

Arne Jacobsens Allé 7, 5th floor
2300 Copenhagen S
Denmark


Phone +45 7020 5144
Fax +45 7020 5145

http://www.flexera.com  

 



[Bug-wget] Request: Handle -np better when URL omits final slash character

2019-01-29 Thread Andrew Pennebaker
Looks like -np is ignored for URL "directories" that omit the trailing
slash. At least this is what I see happen with Homebrew wget. Could we get
-np to act more flexibly, continuing to prohibit upwards navigation when
the original URL leaves off the slash?
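For illustration, the two invocations being compared (www.example.com as a
stand-in) would be:

  wget -r -np https://www.example.com/files/   # trailing slash: stays below /files/
  wget -r -np https://www.example.com/files    # no trailing slash: -np reportedly no longer restricts the crawl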

-- 
Cheers,
Andrew


Re: [Bug-wget] Feature request

2018-10-07 Thread Ander Juaristi

Hi,

Most of the new features are now better shipped in wget2 and libwget.

Development is done in GitLab: https://gitlab.com/gnuwget/wget2
Docs: https://gnuwget.gitlab.io/wget2/reference/modules.html

I don't know if that particular feature would fit into wget2, but a 
generalisation of it might be a good addition. I'm thinking of some functions 
in libwget to set the max/min download speed, or something along those lines. 
There aren't currently any such controls, and they'd be a good addition IMO.

In any case, we're always pleased to receive MRs in the GitLab repository 
for discussion.

- AJ


Hello maintainers,
I wrote this script in python which restarts the wget process if the 
speed

hits a particular minimum set by the user.
https://github.com/plant99/better-wget
Though it needs a little tidying up, I would love to add this as a 
feature

to original wget code. Please guide me on this.
Best Regards,
Shivashis Padhi




[Bug-wget] Feature request

2018-10-06 Thread Shivashis Padhi
Hello maintainers,
I wrote this script in python which restarts the wget process if the speed
hits a particular minimum set by the user.
https://github.com/plant99/better-wget
Though it needs a little tidying up, I would love to add this as a feature
to original wget code. Please guide me on this.
Best Regards,
Shivashis Padhi


Re: [Bug-wget] request to change retry default

2018-07-13 Thread Tim Rühsen
On 07/08/2018 02:59 AM, John Roman wrote:
> Greetings,
> I wish to discuss a formal change of the default retry for wget from 20
> to something more pragmatic such as two or three.
> 
> While I believe 20 retries may have been the correct default many years
> ago, it seems overkill for the modern "cloud based" internet, where most 
> sites are
> backed by one or more load balancers.  Geolocateable A records further
> reduce the necessity for retries by providing a second or third option
> for browsers to try.  To a lesser extent, GTM and GSLB technologies
> (however maligned they may be) are sufficient as well to
> properly handle failures for significant amounts of traffic.  BGP
> network technology for large hosting providers has also further reduced
> the need to perform several retries to a site.  Finally, for better or
> worse, environments such as Kubernetes and other container orchestration
> tools seem to afford sites an unlimited uptime should the marketing be
> trusted.

Solution: Just add 'tries = 3' to /etc/wgetrc or to ~/.wgetrc and never
care for it again.

But I do wonder a bit about your request... if 3 tries would always
be enough to fetch a file safely, then it doesn't matter whether tries is set
to 20, 20,000 or even unlimited. Is there something you might have
forgotten to write!?

Regards, Tim



signature.asc
Description: OpenPGP digital signature


[Bug-wget] request to change retry default

2018-07-07 Thread John Roman
Greetings,
I wish to discuss a formal change of the default retry for wget from 20
to something more pragmatic such as two or three.

While I believe 20 retries may have been the correct default many years
ago, it seems overkill for the modern "cloud based" internet, where most sites 
are
backed by one or more load balancers.  Geolocateable A records further
reduce the necessity for retries by providing a second or third option
for browsers to try.  To a lesser extent, GTM and GSLB technologies
(however maligned they may be) are sufficient as well to
properly handle failures for significant amounts of traffic.  BGP
network technology for large hosting providers has also further reduced
the need to perform several retries to a site.  Finally, for better or
worse, environments such as Kubernetes and other container orchestration
tools seem to afford sites an unlimited uptime should the marketing be
trusted.






Re: [Bug-wget] Feature request: option to not download rejected files

2018-07-03 Thread Tim Rühsen
On 07/03/2018 12:48 PM, Zoe Blade wrote:
>> In Wget2 there is an extra option for this, --filter-urls.
> 
> Thank you Tim, this sounds like exactly what I was after!  (It's especially 
> important when you have wget logged in as a user, to be able to tell it not 
> to go to the logout page.)  Though if that feature could be ported to the 
> original wget, with its WARC support etc, that'd be useful.  I guess I'll 
> stick with my hacked version for now.

WARC for wget2 is on the list, maybe as an extra library project.

Thanks for your feedback - I wasn't aware of WARC users out there ;-)

Regards, Tim



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] Feature request: option to not download rejected files

2018-07-03 Thread Zoe Blade
> In Wget2 there is an extra option for this, --filter-urls.

Thank you Tim, this sounds like exactly what I was after!  (It's especially 
important when you have wget logged in as a user, to be able to tell it not to 
go to the logout page.)  Though if that feature could be ported to the original 
wget, with its WARC support etc, that'd be useful.  I guess I'll stick with my 
hacked version for now.

Thanks,
Zoë.


Re: [Bug-wget] Feature request: option to not download rejected files

2018-06-29 Thread Tim Rühsen
On 06/29/2018 03:20 PM, Zoe Blade wrote:
> For anyone else who needs to do this, I adapted Sergey Svishchev's 1.8-era 
> patch for 1.19.1 (one of the few versions I managed to get to compile in OS X; 
> I'm on a Mac, and not the best programmer):
> 
> recur.c:578
> -  if (blacklist_contains (blacklist, url))
> +  if (blacklist_contains (blacklist, url) || !acceptable (url))
> 
> It's not ideal, but it seems to solve the problem as a temporary fix.  
> Hopefully it might help someone else who needs this functionality.

Hi Zoë,

we recently had a discussion (20.6.2018 "Why does -A not work") where I
confirmed that --reject-regex works like a filter for detected URLs.
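So for the question-mark case from the original request, something along these
lines should keep such URLs from being fetched at all during recursion (a
sketch, untested; www.example.com is a stand-in):

  wget -r --reject-regex '[?]' https://www.example.com/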

BTW, the OP wanted --reject-regex to download+parse HTML (and delete
thereafter if matching the rejected regex) - so the opposite from your
request.

In Wget2 there is an extra option for this, --filter-urls. Maybe
--filter-mime-type is also worth a look.

Best would be if you can provide a small example / reproducer. It can
also be a hand-crafted HTML file.

Regards, Tim



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] Feature request: option to not download rejected files

2018-06-29 Thread Zoe Blade
For anyone else who needs to do this, I adapted Sergey Svishchev's 1.8-era 
patch for 1.19.1 (one of the few versions I managed to get to compile in OS X; 
I'm on a Mac, and not the best programmer):

recur.c:578
-  if (blacklist_contains (blacklist, url))
+  if (blacklist_contains (blacklist, url) || !acceptable (url))

It's not ideal, but it seems to solve the problem as a temporary fix.  
Hopefully it might help someone else who needs this functionality.

Cheers,
Zoë.


Re: [Bug-wget] Feature request: option to not download rejected files

2018-06-29 Thread Zoe Blade
> ...it would be more useful to avoid downloading rejected files altogether...

Hmm, after a bit more digging, I see this isn't a new request: 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=217243  Is anyone working on 
this?


[Bug-wget] Feature request: option to not download rejected files

2018-06-29 Thread Zoe Blade
Hi!

First of all, I find wget very useful, so thank you to everyone who has 
contributed to it!

I gather that the rejection list (--reject and --reject-regex) is used to 
determine which downloaded files to permanently save or not.  While that's 
sometimes useful, there are other times it would be more useful to avoid 
downloading rejected files altogether.

For example, rejecting any file with a question mark in it, to avoid 
duplication due to endless combinations of parameters.  It would put far less 
strain on the server to be able to just download the main version of each page 
and not its various iterations.

Someone even went as far as to write a quick hack to add this functionality for 
themselves: 
https://stackoverflow.com/questions/12704197/wget-reject-still-downloads-file  
It would be much nicer if it was built in, in a more robust and extensible 
manner.

Thanks,
Zoë.


[Bug-wget] Help or feature request - WGET -N option preventing file overwrite

2018-06-15 Thread John Murrell
Hello,

 

I have been trying to use WGET to download some files but have been unable
to find a combination of options to achieve what I need to do.

 

The site I am downloading from publishes a new version of a file every hour
but the exact time it is available is not specified. The replacement has the
same name.

 

I am running a script using cron to check the site every 10 minutes using
wget -N to download only the new versions of the file.

 

The problem is that WGET -N clobbers the previous version, and what I want
is to keep a copy of each version. I can't find any way of achieving this;
can anyone advise, please?

 

I have tried both renaming and moving the file when WGET -N completes, but
then WGET -N 'forgets' that the file has already been downloaded and downloads
it again.
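(One workaround, sketched with hypothetical names and untested: copy rather
than move, so the original file stays in place for -N's timestamp comparison,
and archive the copies under dated names.)

  wget -N https://www.example.com/feed/data.csv
  cp -p data.csv "archive/data-$(date +%Y%m%d-%H%M).csv"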

 

-nc only keeps a copy of the original version of the source file - I need
all the versions of the file.

 

If this is not possible I would like to suggest a new option that either:

 

* Prevents clobbering files with different contents but duplicate
names; this is a combination of the -nc option and -N option behaviour

* Alternatively WGET should have an option to remember the filename
(+ timestamp and any other comparison information it requires) even though
the file is no longer in the destination directory - I call this the Ghost
option as the ghostly memory of the file exists.

The 2nd option would be useful if people want to rename or move files on the
fly, but I can see problems in storing this information between WGET
invocations.

 

Thanks

 

John Murrell

 



Re: [Bug-wget] Submitted a merge request

2018-05-06 Thread Darshit Shah
* sameeran joshi <gsocsamee...@gmail.com> [180506 11:24]:
> Can anyone verify the merge request by me

Hi,

Thanks for the contribution!

However, once you have submitted a merge request on gitlab, you don't need to
send another mail to the mailing list. Someone will review and merge it as soon
as possible.

Sometimes that may take time (think days), since we are all volunteers and
aren't always available. Please wait at least 48 hours for a response.

-- 
Thanking You,
Darshit Shah
PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6


signature.asc
Description: PGP signature


[Bug-wget] Submitted a merge request

2018-05-06 Thread sameeran joshi
Can anyone verify the merge request by me


[Bug-wget] [bug #51155] Page requisite requests should use GET method irrespective of original request method

2017-07-31 Thread Darshit Shah
Update of bug #51155 (project wget):

  Status: None => Wont Fix
 Open/Closed: Open => Closed


___

Reply to this item at:

  

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #50579] wget --continue: Download fails when server does not support HEAD request

2017-03-24 Thread Tim Ruehsen
Update of bug #50579 (project wget):

  Status: Needs Discussion => Wont Fix
 Open/Closed: Open => Closed


___

Reply to this item at:

  

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #50579] wget --continue: Download fails when server does not support HEAD request

2017-03-24 Thread Darshit Shah
Follow-up Comment #3, bug #50579 (project wget):

As Dale mentions, what Wget does is exactly according to the RFC. In fact it
is the server that breaks the spec. And a workaround for this server may cause
breakages for other servers that actually follow the specifications. As a
result, I vote that we close this as a non-issue

___

Reply to this item at:

  

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #50579] wget --continue: Download fails when server does not support HEAD request

2017-03-23 Thread Dale Worley
Follow-up Comment #2, bug #50579 (project wget):

RFC 2616 section 9.4 says that an HTTP server should return the same headers
for a HEAD request as it does for a GET of the same URL.  So I see nothing
incorrect with wget relying on that behavior.

___

Reply to this item at:

  <http://savannah.gnu.org/bugs/?50579>

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #50579] wget --continue: Download fails when server does not support HEAD request

2017-03-18 Thread Tim Ruehsen
Update of bug #50579 (project wget):

Category: Program Logic => Feature Request
  Status: None => Needs Discussion
 Summary: wget --continue: URL with trailing slash '/' (but
Content-Disposition) => wget --continue: Download fails when server does not
support HEAD request

___

Follow-up Comment #1:

As you already stated, the server should have answered the HEAD request
properly. If a server refuses to do so, --continue currently doesn't work.

Especially with --content-disposition, Wget doesn't know about the filename to
look/save at, so it simply gives up here.

If Wget now tries to be 'clever' and (silently) falls back to the filename
derived from the URL, it could be that the wrong file gets saved, clobbered,
or destroyed (depending on additional options).

So IMO, we better leave it up to the user to say *exactly* what (s)he wants.

The title is misleading; the problem is the same whether with or without a
trailing slash.

I leave this bug open for discussion and mark it feature request.

___

Reply to this item at:

  <http://savannah.gnu.org/bugs/?50579>

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




Re: [Bug-wget] CVE Request - Gnu Wget 1.17 - Design Error Vulnerability

2016-08-14 Thread Tim Rühsen
Hi,

here is a patch to limit the file modes to u+rw for temp. downloaded files.

Not sure if your proof of concept still works or not - but it seems a good
thing anyways.

Regards, Tim
From 5de996a94f74a31132660238e3b11fd0e29c18fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim Rühsen?= 
Date: Sun, 14 Aug 2016 21:04:58 +0200
Subject: [PATCH] Limit file mode to u=rw on temp. downloaded files

* bootstrap.conf: Add gnulib modules fopen, open.
* src/http.c (open_output_stream): Limit file mode to u=rw
  on temp. downloaded files.

Reported-by: "Misra, Deapesh" 
---
 bootstrap.conf |  2 ++
 src/http.c | 13 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/bootstrap.conf b/bootstrap.conf
index 2b225b7..d9a5f90 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -40,6 +40,7 @@ dirname
 fcntl
 flock
 fnmatch
+fopen
 futimens
 ftello
 getaddrinfo
@@ -71,6 +72,7 @@ crypto/md5
 crypto/sha1
 crypto/sha256
 crypto/sha512
+open
 quote
 quotearg
 recv
diff --git a/src/http.c b/src/http.c
index 56b8669..d463f29 100644
--- a/src/http.c
+++ b/src/http.c
@@ -39,6 +39,7 @@ as that of the covered work.  */
 #include 
 #include 
 #include 
+#include 

 #include "hash.h"
 #include "http.h"
@@ -2471,7 +2472,17 @@ open_output_stream (struct http_stat *hs, int count, FILE **fp)
   open_id = 22;
   *fp = fopen (hs->local_file, "wb", FOPEN_OPT_ARGS);
 #else /* def __VMS */
-  *fp = fopen (hs->local_file, "wb");
+  if (opt.delete_after
+|| opt.spider /* opt.recursive is implicitely true */
+|| !acceptable (hs->local_file))
+{
+  *fp = fdopen (open (hs->local_file, O_CREAT | O_TRUNC | O_WRONLY, S_IRUSR | S_IWUSR), "wb");
+}
+  else
+{
+  *fp = fopen (hs->local_file, "wb");
+}
+
 #endif /* def __VMS [else] */
 }
   else
--
2.8.1



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [oss-security] CVE Request - Gnu Wget 1.17 - Design Error Vulnerability

2016-08-12 Thread Kurt Seifried
On Thu, Aug 11, 2016 at 3:11 PM, Misra, Deapesh  wrote:

> Hi,
>
> --
> - Background -
> --
>
> Here at iDefense, Verisign Inc, we have a Vulnerability Contributor
> Program (VCP) where we buy vulnerabilities.
>
> Recently, security researcher Dawid Golunski sold us an interesting
> vulnerability within Wget. We asked Red Hat (secalert at redhat dot com) if
> they would help us with the co-ordination (patching, disclosure, etc) of
> this vulnerability. Once they graciously accepted, we discussed the
> vulnerability with them. After their initial triage, Red Hat recommended
> that we publicly post the details of this vulnerability to this mailing
> list for further discussion and hence this email.
>
>
That would have been me =).


> It is very easy for an attacker to win this race as the file only gets
> deleted after the HTTP connection is terminated. He can therefore keep the
> connection open as long as necessary to make use of the uploaded file.
> Below is proof of concept exploit that demonstrates this technique.
>

Please note that the attacker would also have to have access to the local
file system, either shell access or by some additional exploit,
additionally they would have to have read access to the file wget is
downloading (so same security context, or really poor permissions).


> it is evident that the accept/reject rule is applied only after the
> download. This seems to be a design decision which has a security aspect to
> it. As discussed above,
>

It has to be. A PHP script can serve any file type, for example. Filtering
on the URI is not what is being asked; the downloaded file is what is being
filtered.


>- an attacker can ensure that the files which were not meant to be
> downloaded are downloaded to the location on the victim server (which
> should be a publicly accessible location)
>- the attacker can keep the connection open, even if the file/s have
> been downloaded on the victim server
>- the attacker can then access these files OR use them in a separate
> attack
>- the victim server's security is impacted since the
> developer/administrator was never warned explicitly that 'rejected files'
> can have a transient life on the victim server
>
>
> It looks like the design for wget needs to be changed so that the file it
> downloads to 'recursively search' through is not saved in a location which
> is accessible by the attacker. Additionally the documentation needs to be
> enhanced with the explicit mention of the 'transient nature' of the files
> which are to be rejected.
>

This is easily accomplished using a safe umask for the file.

Please note again that to exploit this you would need a situation where the
attacker can control what wget is fetching, or execute a man in the middle
attack, AND has local access to the system downloading the file AND has
permissions to read the file AND some sort of additional vulnerability that
requires being able to read a file in order to escalate privileges.

Wget is simply doing exactly what is asked of it, downloading files, and
once downloaded checking if you wanted to keep them or not. Same as any
HTTP(S) library that has a mirror function and filter function.

We welcome your comments/suggestions.
>
> thanks,
>
> Deapesh.
> iDefense Labs, Verisign Inc.
> http://www.verisign.com/en_US/security-services/security-
> intelligence/vulnerability-reports/index.xhtml
>
> PS: I hope the maintainer Giuseppe Scrivano gets to see this via the
> bug-wget list I have CC-ed.
>
>


-- 

--
Kurt Seifried -- Red Hat -- Product Security -- Cloud
PGP A90B F995 7350 148F 66BF 7554 160D 4553 5E26 7993
Red Hat Product Security contact: secal...@redhat.com


[Bug-wget] CVE Request - Gnu Wget 1.17 - Design Error Vulnerability

2016-08-11 Thread Misra, Deapesh
Hi,

--
- Background -
--

Here at iDefense, Verisign Inc, we have a Vulnerability Contributor Program 
(VCP) where we buy vulnerabilities. 

Recently, security researcher Dawid Golunski sold us an interesting 
vulnerability within Wget. We asked Red Hat (secalert at redhat dot com) if 
they would help us with the co-ordination (patching, disclosure, etc) of this 
vulnerability. Once they graciously accepted, we discussed the vulnerability 
with them. After their initial triage, Red Hat recommended that we publicly 
post the details of this vulnerability to this mailing list for further 
discussion and hence this email.

--
- Title -
--

Wget Recursive Download Accept/Reject List Race Condition Vulnerability


-  Vulnerable Version  -


GNU Wget <= 1.17   Race Condition / Access-list Bypass

---
-  Vulnerability  -
---

When wget is used in recursive/mirroring mode, according to the manual it can 
take the following access list options:

"Recursive Accept/Reject Options:
  -A acclist --accept acclist
  -R rejlist --reject rejlist

Specify comma-separated lists of file name suffixes or patterns to accept or 
reject. Note that if any of the wildcard characters, *, ?, [ or ], appear in an 
element of acclist or rejlist, it will be treated as a pattern, rather than a 
suffix."

These can for example be used to only download JPG images. 

The vulnerability surfaces when wget is used to download a single file with a 
recursive option (-r / -m) and an access list (-A): wget only applies the 
list at the end of the download process. 

This can be observed on the output below:

# wget -r -nH -A '*.jpg' http://attackers-server/test.php
Resolving attackers-server... 192.168.57.1
Connecting to attackers-server|192.168.57.1|:80... connected.
    HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: 'test.php'

15:05:46 (27.3 B/s) - 'test.php' saved [52]

Removing test.php since it should be rejected.

FINISHED


Although the file gets successfully deleted in the end, this creates a race 
condition, as an attacker who has control over the URL could slow down the 
download process so that he has a chance to make use of the malicious 
file before it gets deleted.


It is very easy for an attacker to win this race as the file only gets deleted 
after the HTTP connection is terminated. He can therefore keep the connection 
open as long as necessary to make use of the uploaded file.  Below is proof of 
concept exploit that demonstrates this technique.  


--
-  Proof of Concept  -
--

< REDACTED BY iDefense FOR THE TIME BEING >

---
-  Discussion  -
---

From the wget manual:

https://access.redhat.com/security/team/contact

> Finally, it's worth noting that the accept/reject lists are matched twice 
> against downloaded files: once against the URL's filename portion, to 
> determine if the file should be downloaded in the first place; then, after it 
> has been accepted and successfully downloaded, the local file's name is also 
> checked against the accept/reject lists to see if it should be removed. The 
> rationale was that, since '.htm' and '.html' files are always downloaded 
> regardless of accept/reject rules, they should be removed after being 
> downloaded and scanned for links, if they did match the accept/reject lists. 
> However, this can lead to unexpected results, since the local filenames can 
> differ from the original URL filenames in the following ways, all of which 
> can change whether an accept/reject rule matches: 


and from the source code, in file recur.c:

  if (file
  && (opt.delete_after
  || opt.spider /* opt.recursive is implicitely true */
  || !acceptable (file)))
{
  /* Either --delete-after was specified, or we loaded this
 (otherwise unneeded because of --spider or rejected by -R)
 HTML file just to harvest its hyperlinks -- in either case,
 delete the local file. */
  DEBUGP (("Removing file due to %s in recursive_retrieve():\n",
   opt.delete_after ? "--delete-after" :
   (opt.spider ? "--spider" :
"recursive rejection criteria")));
  logprintf (LOG_VERBOSE,
 (opt.delete_after || opt.spider
  ? _("Removing %s.\n")
  : _("Removing %s since it should be rejected.\n")),
 file);
  if (unlink (file))
logprintf (LOG_NOTQUIET, "unlink: %s\n", strerror (errno));
  logputs (L

Re: [Bug-wget] feature request: automatically check OpenPGP signatures

2016-06-22 Thread Neal H. Walfield
Hi Tim,

On Wed, 22 Jun 2016 10:33:50 +0200,
Tim Ruehsen wrote:
> there already is a standard for such things, called Metalink, supported by 
> wget (and most other download tools). The standard also contains support for 
> OpenPGP signatures.

I wasn't aware of this.  Thanks for pointing it out!

:) Neal



Re: [Bug-wget] feature request: automatically check OpenPGP signatures

2016-06-22 Thread Tim Ruehsen
Hello Neal,

there already is a standard for such things, called Metalink, supported by 
wget (and most other download tools). The standard also contains support for 
OpenPGP signatures.

[1]https://en.wikipedia.org/wiki/Metalink
[2]https://tools.ietf.org/html/rfc5854
[3]https://tools.ietf.org/html/rfc6249
[4]http://www.metalinker.org/

Tim

On Tuesday 21 June 2016 12:15:44 Neal H. Walfield wrote:
> Hi wget developers,
> 
> It is unfortunately increasingly common that tutorials, howtos and
> installation programs do something like:
> 
>   wget --no-check-certificate https://some.server/path/install.sh
>   chmod a+x install.sh
>   ./install.sh
> 
> Ouch!
> 
> It would be great if wget had an option to specify an OpenPGP
> fingerprint that should be used to check a signature.  I imagine
> something like this:
> 
>   wget --check-sig 8F1118A33DDA9BA48E62AACB3243630052D9 http://...
> 
> (The signature could either be inline, which would prevent the use of
> the file until the signature is verified, which is arguably good, or
> automatically looked for in a separate file called, say, filename.sig,
> by default.)
> 
> For users who are just copying and pasting, this represents no
> additional work while adding a fair amount of protection.  For
> developers, it is a bit more work, but they should be providing
> signatures anyways.  For those who already provide signatures, this
> would help ensure that people actually check them and it would
> simplify the installation guides.  See, for instance, tails:
> 
>   https://tails.boum.org/install/expert/usb/
> 
> Thanks for considering this feature request!
> 
> :) Neal

signature.asc
Description: This is a digitally signed message part.


[Bug-wget] feature request: automatically check OpenPGP signatures

2016-06-21 Thread Neal H. Walfield
Hi wget developers,

It is unfortunately increasingly common that tutorials, howtos and
installation programs do something like:

  wget --no-check-certificate https://some.server/path/install.sh
  chmod a+x install.sh
  ./install.sh

Ouch!

It would be great if wget had an option to specify an OpenPGP
fingerprint that should be used to check a signature.  I imagine
something like this:

  wget --check-sig 8F1118A33DDA9BA48E62AACB3243630052D9 http://...

(The signature could either be inline, which would prevent the use of
the file until the signature is verified, which is arguably good, or
automatically looked for in a separate file called, say, filename.sig,
by default.)
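
Today this has to be done by hand, roughly as below (a sketch; it assumes the
signature is published next to the script and that the signing key is already
in the local keyring, so it does not yet pin a specific fingerprint):

  wget https://some.server/path/install.sh
  wget https://some.server/path/install.sh.sig
  gpg --verify install.sh.sig install.sh && sh ./install.sh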

For users who are just copying and pasting, this represents no
additional work while adding a fair amount of protection.  For
developers, it is a bit more work, but they should be providing
signatures anyways.  For those who already provide signatures, this
would help ensure that people actually check them and it would
simplify the installation guides.  See, for instance, tails:

  https://tails.boum.org/install/expert/usb/

Thanks for considering this feature request!

:) Neal



Re: [Bug-wget] short option for --content-disposition (feature request)

2016-01-12 Thread Ángel González

On 05/01/16 14:34, Giuseppe Scrivano wrote:

Hanno Böck  writes:


Hi,

I quite often use the --content-disposition command line option of wget.
It's a bit annoying to type in, but currently it seems there is no
short option for it.
Could such a short option be added?

c and d are already taken and I think also all other characters in
content disposition. So I'd like to propose to use -z or -y (just
because they're not used yet and easy to remember), but I'd be okay
with any other char.

you can shorten long command line options specifying only a prefix if it
doesn't collide with another one, in your case you can specify
--content-d (since wget has --content-on-error as well).

Regards,
Giuseppe

You can also set an alias in your shell for wget --content-disposition, or 
even set that option in your wgetrc so it applies by default, which is 
probably what you want.
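
For example (a sketch; the wgetrc spelling follows the manual's wgetrc command
list, as far as I can tell):

  alias wget='wget --content-disposition'        # per-shell alias
  echo 'content_disposition = on' >> ~/.wgetrc   # or make it the default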





Re: [Bug-wget] short option for --content-disposition (feature request)

2016-01-05 Thread Giuseppe Scrivano
Hanno Böck  writes:

> Hi,
>
> I quite often use the --content-disposition command line option of wget.
> It's a bit annoying to type in, but currently it seems there is no
> short option for it.
> Could such a short option be added?
>
> c and d are already taken and I think also all other characters in
> content disposition. So I'd like to propose to use -z or -y (just
> because they're not used yet and easy to remember), but I'd be okay
> with any other char.

you can shorten long command line options specifying only a prefix if it
doesn't collide with another one, in your case you can specify
--content-d (since wget has --content-on-error as well).

Regards,
Giuseppe



[Bug-wget] short option for --content-disposition (feature request)

2016-01-03 Thread Hanno Böck
Hi,

I quite often use the --content-disposition command line option of wget.
It's a bit annoying to type in, but currently it seems there is no
short option for it.
Could such a short option be added?

c and d are already taken and I think also all other characters in
content disposition. So I'd like to propose to use -z or -y (just
because they're not used yet and easy to remember), but I'd be okay
with any other char.

-- 
Hanno Böck
http://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42


pgpBM8_Gu6vOU.pgp
Description: OpenPGP digital signature


Re: [Bug-wget] Feature Request

2015-09-01 Thread Darshit Shah
There already exists a similar option called --limit-rate. It's
documented in the man and info pages for Wget.
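
For example (a sketch; URL and rates are arbitrary):

  wget --limit-rate=200k http://example.com/file.iso   # cap at roughly 200 KB/s
  wget --limit-rate=2m   http://example.com/file.iso   # cap at roughly 2 MB/s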

On Tue, Sep 1, 2015 at 9:41 PM, Harvey Pwca  wrote:
> Sir/Madam;
>
> I have an idea for a new wget option.
>
> Allow the user to specify a download speed for the item(s) being retrieved.
> The speed could be set in Bytes (B), Kilobytes (K), or Megabytes (M)
> depending upon the bandwidth available to the user.
>
> While it is possible to make wget become "aware" of the bandwidth available
> —and what is being used by other applications— I think this would not be a
> good idea since it removes from the user the ability to prioritize their
> applications usage of the bandwidth. Just allow the user to state what
> bandwidth wget should use.
>
> Thank you.
>



-- 
Thanking You,
Darshit Shah



[Bug-wget] [bug #45804] Add --client-request similar to --server-response

2015-08-21 Thread grarpamp
URL:
  http://savannah.gnu.org/bugs/?45804

 Summary: Add --client-request similar to --server-response
 Project: GNU Wget
Submitted by: grarpamp
Submitted on: Fri 21 Aug 2015 07:13:49 AM GMT
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: grarpamp
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: 1.16.3
Operating System: None
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

wget -nv --client-request --server-response --save-headers --content-on-error

Consider adding --client-request to print that portion.
Each request would get saved along with --save-headers
and threaded in with the --server-response.

Second and lesser yet related...
If there was a redirecting or retrying error
that produced a --content-on-error page, that
page could optionally be threaded in after each
respective request/response pair above with new option
--include-error-page. (This would be a WARC-like
file for non WARC users with perhaps 80 chars of #
separating the sections. It might be useful for
debugging or archive or efficiency to just get
all the data in a file, where dealing with
--max-redirect 0 and a response code would be
awkward.)





___

Reply to this item at:

  http://savannah.gnu.org/bugs/?45804

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-08-15 Thread Darshit Shah
Follow-up Comment #7, bug #44674 (project wget):

Maybe we can implement a --dry-run option to allow the user to see how the
request would look like without actually sending it. 

It would still require the --debug option to see the actual request.

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] Help request to download data from http

2015-07-13 Thread Pedro LL
Hello,
I am a beginner user of wget. I wanted to use it to download data from a specific 
website onto my Mac using Unix, but I have been unable to after many, many 
attempts.
I was typing in the terminal:
 wget -q -O - http://sodaserver.tamu.edu/assim/SODA_2.2.4/ | grep _2009 | wget 
-N --wait=0.5 --random-wait --force-html -i -
but it returns this:

-: Cannot resolve incomplete link /icons/unknown.gif.
-: Cannot resolve incomplete link SODA_2.2.4_200901.cdf.
-: Cannot resolve incomplete link /icons/unknown.gif.
-: Cannot resolve incomplete link SODA_2.2.4_200902.cdf.
[the same pair of messages repeats for SODA_2.2.4_200903.cdf through
SODA_2.2.4_200911.cdf]
-: Cannot resolve incomplete link /icons/unknown.gif.
-: Cannot resolve incomplete link SODA_2.2.4_200912.cdf.
No URLs found in -.

I was wondering if someone could give me a hand with this.
Many thanks in advance,
Pedro

Re: [Bug-wget] Help request to download data from http

2015-07-13 Thread Gisle Vanem

Pedro LL wrote:


I am a beginner user of wget. I wanted to use it to download data from a
specific website onto my Mac using Unix but I have been unable to after
many attempts. I was typing in the terminal:

  wget -q -O - http://sodaserver.tamu.edu/assim/SODA_2.2.4/ | grep _2009 |
   wget -N --wait=0.5 --random-wait --force-html -i -

but it returns this: "-: Cannot resolve incomplete link /icons/unknown.gif." ...


Since the base-href was removed in the first output, you'll have
to add it yourself in the 2nd invocation of Wget. Something like:

  wget -q -O - http://sodaserver.tamu.edu/assim/SODA_2.2.4/ | grep _2009 |
   wget -N --base=http://sodaserver.tamu.edu/assim/SODA_2.2.4/ --wait=0.5
  --random-wait --force-html -i -

But I think you could just do:
  wget -rq -nd -np -A.cdf --accept-regex='.*_2009.*' \
       http://sodaserver.tamu.edu/assim/SODA_2.2.4/

directly.

--
--gv



[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-31 Thread Tim Ruehsen
Follow-up Comment #3, bug #44674 (project wget):

Just open a second console and start
  nc -l -p <port>

Start wget in your first console
  http_proxy=localhost:<port> wget http://www.example.com

nc will now dump everything that Wget sends. You could even generate an answer
(e.g. with copy & paste).

Wget just adds a Proxy-Connection header which will not be sent on non-proxy
connections.
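
Concretely, something like this (a sketch; port 8080 is an arbitrary choice):

  # console 1: listen for the request
  nc -l -p 8080
  # console 2: point wget at that port as an HTTP proxy
  http_proxy=localhost:8080 wget http://www.example.com/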


___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-31 Thread INVALID.NOREPLY
Follow-up Comment #6, bug #44674 (project wget):

Also --debug doesn't show full FORM bodies.

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Tim Rühsen
Hi Steven,

On Tuesday, 31 March 2015, at 18:11:58, Stephen Wells wrote:
 Dear all - I am currently trying to use wget to obtain mp3 files from the
 Google Translate TTS system. In principle this can be done using:
 
 wget -U Mozilla -O ${string}.mp3 
 "http://translate.google.com/translate_tts?tl=TL&q=${string}"
 
 where TL is a twoletter language code (en,fr,de and so on).
 
 However I am meeting a serious error when I try to send Russian strings
 (tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
 Cygwin) and the file system will display the cyrillic strings no problem.
 If I provide a command like this:
 
 http://translate.google.com/translate_tts?tl=ru&q=мазать
 
 wget incorrectly processes the Cyrillic characters _before_ sending the
 http request, so what it actually requests is:
 
 http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D
 1%82%D1%8C

This seems to be the correct behavior of a web client.
The URL in the GET request is transmitted UTF-8 encoded and percent escaping 
is performed for chars > 127 (not mentioning control chars here).

 This of course produces a string of gibberish in the resulting mp3 file!

This is something different. If you are talking about the file name, well 
there is --restrict-file-names=nocontrol. Did you give it a try ?

 Is there any way to make wget actually send the string it is given, instead
 of mangling it on the way out? This is really blocking me.

From what you write, I am unsure if you are talking about the resulting file 
name or about HTTP URL encoding in a GET request.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


[Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Stephen Wells
Dear all - I am currently trying to use wget to obtain mp3 files from the
Google Translate TTS system. In principle this can be done using:

wget -U Mozilla -O ${string}.mp3 
"http://translate.google.com/translate_tts?tl=TL&q=${string}"

where TL is a twoletter language code (en,fr,de and so on).

However I am meeting a serious error when I try to send Russian strings
(tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
Cygwin) and the file system will display the cyrillic strings no problem.
If I provide a command like this:

http://translate.google.com/translate_tts?tl=ru&q=мазать

wget incorrectly processes the Cyrillic characters _before_ sending the
http request, so what it actually requests is:


http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C

This of course produces a string of gibberish in the resulting mp3 file!

Is there any way to make wget actually send the string it is given, instead
of mangling it on the way out? This is really blocking me.

Cheers,
Stephen


[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-31 Thread INVALID.NOREPLY
Follow-up Comment #5, bug #44674 (project wget):

Tim: OK, please make sure that example is given near the --debug
option section of the man page.

Also, it would be good if there were a built-in way to do it, in case it is
inconvenient to install other programs or to do the extra input/output work of
starting and waiting on them on a given system.

Anonymous: the --debug part of the man page doesn't say clearly what it
will give, and --debug might not be compiled in. And in fact --debug
gives more than just the request, and --debug requires one to actually attempt
the request, without any --dry-run safety mechanism, before going out on the net...

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-31 Thread anonymous
Follow-up Comment #4, bug #44674 (project wget):

You can use the --debug flag to show the HTTP request and response headers,
including when the traffic is encrypted with SSL.
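
For example, to see only the request block (a sketch; it relies on the
---request begin--- / ---request end--- markers that -d prints, and on debug
support being compiled in):

  wget -d -O /dev/null https://example.com/ 2>&1 |
      sed -n '/---request begin---/,/---request end---/p'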

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Stephen Wells
Hi Tim,

Sorry for the ambiguity. To be more specific, the file name is fine: in the
shell script the file name $*.mp3 expands correctly to e.g. мазать.mp3 .
The audio within the file consists of the Google robot voice reading the
string of percent-escaped characters literally, not reading the Russian
word.

I will try Random Coder's suggestion of a more complete user agent string -
 apparently http://whatsmyuseragent.com/ is a handy way to find out what
your browser claims to be :)

On Tue, Mar 31, 2015 at 9:50 PM, Tim Rühsen tim.rueh...@gmx.de wrote:

 Hi Steven,

 On Tuesday, 31 March 2015, at 18:11:58, Stephen Wells wrote:
  Dear all - I am currently trying to use wget to obtain mp3 files from the
  Google Translate TTS system. In principle this can be done using:
 
  wget -U Mozilla -O ${string}.mp3 
  "http://translate.google.com/translate_tts?tl=TL&q=${string}"
 
  where TL is a twoletter language code (en,fr,de and so on).
 
  However I am meeting a serious error when I try to send Russian strings
  (tl=ru) in Cyrillic characters. I'm working in a UTF-8 environment (under
  Cygwin) and the file system will display the cyrillic strings no problem.
  If I provide a command like this:
 
  http://translate.google.com/translate_tts?tl=ru&q=мазать
 
  wget incorrectly processes the Cyrillic characters _before_ sending the
  http request, so what it actually requests is:
 
 
 http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D
  1%82%D1%8C

 This seems to be the correct behavior of a web client.
 The URL in the GET request is transmitted UTF-8 encoded and percent
 escaping
 is performed for chars > 127 (not mentioning control chars here).

  This of course produces a string of gibberish in the resulting mp3 file!

 This is something different. If you are talking about the file name, well
 there is --restrict-file-names=nocontrol. Did you give it a try ?

  Is there any way to make wget actually send the string it is given,
 instead
  of mangling it on the way out? This is really blocking me.

 From what you write, I am unsure if you are talking about the resulting
 file
 name or about HTTP URL encoding in a GET request.

 Regards, Tim



Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Stephen Wells
THANK YOU Random Coder! That did the trick. Apparently my earlier attempts
were unsuccessful because the problem I was trying to solve was not the
problem I actually had :)

Specifically I went to whatsmyuseragent.com and my browser id'd as
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2272.101 Safari/537.36 . I put that, in quotes, instead of
just Mozilla as the argument of the -U option, and now I get back an mp3
file with proper Russian audio in it. Much victory.


Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Ángel González

On 01/04/15 00:16, Stephen Wells wrote:

Hi Tim,

Sorry for the ambiguity. To be more specific, the file name is fine: in the
shell script the file name $*.mp3 expands correctly to e.g. мазать.mp3 .
The audio within the file consists of the Google robot voice reading the
string of percent-escaped characters literally, not reading the Russian
word.

I will try Random Coder's suggestion of a more complete user agent string -
  apparently http://whatsmyuseragent.com/ is a handy way to find out what
your browser claims to be :)


I remember google had a parameter for the encoding. It may be worth 
explicitly noting that it's utf-8, since it may be using a fallback based on 
the User-Agent.





Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?

2015-03-31 Thread Random Coder
On Tue, Mar 31, 2015 at 10:11 AM, Stephen Wells sawells.2...@gmail.com wrote:
 Dear all - I am currently trying to use wget to obtain mp3 files from the
 Google Translate TTS system. In principle this can be done using:

 wget -U Mozilla -O ${string}.mp3 
 "http://translate.google.com/translate_tts?tl=TL&q=${string}"

 ...

 http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C

 This of course produces a string of gibberish in the resulting mp3 file!


That URL is correct, it's what you'll see a browser send across the
wire for the same string.  Google is producing gibberish because of
some User-agent sniffing that they appear to be doing.

If you change the user agent to something that's more complete, like
Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2228.0 Safari/537.36 instead of just Mozilla, it should
work correctly.
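
The combined command would then look roughly like this (a sketch; the exact
UA string is only an example of a fuller value):

  wget -U "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36" \
       -O мазать.mp3 "http://translate.google.com/translate_tts?tl=ru&q=мазать"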



[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-30 Thread INVALID.NOREPLY
Follow-up Comment #1, bug #44674 (project wget):

Sure one could use tcpflow to see what is being sent, but what if it is an
https address, and what if we want to see it before we send it?

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-30 Thread INVALID.NOREPLY
Follow-up Comment #2, bug #44674 (project wget):

OK one could send to http://example.net/ recording with tcpflow... but still
one shouldn't need to do that.

___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #44674] Add an option that will send the HTTP request to stderr or a file

2015-03-30 Thread INVALID.NOREPLY
URL:
  http://savannah.gnu.org/bugs/?44674

 Summary: Add an option that will send the HTTP request to
stderr or a file
 Project: GNU Wget
Submitted by: jidanni
Submitted on: Tue 31 Mar 2015 04:06:57 AM GMT
Category: Testing
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: None
Operating System: None
 Reproducibility: None
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

Add an option that will send the HTTP request to stderr or a file, and not
over the network.

That way one could see what wget will send without sending it.









___

Reply to this item at:

  http://savannah.gnu.org/bugs/?44674

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [PATCH] Add Accept-Encoding to request header (fixes #40819)

2014-11-11 Thread Tim Ruehsen
This one-line patch fixes an RFC 2616 issue (see wget bug #40819 -
https://savannah.gnu.org/bugs/?40819)
.
If Accept-Encoding header is missing, the server may assume *any* type of
encoding (seems stupid to me, but the Squid proxy seems to obey that).

What do you think ?

Tim

From 624e171062e30e0c92cdd6c3e970f595d30b1572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim Rühsen?= tim.rueh...@gmx.de
Date: Tue, 11 Nov 2014 16:01:58 +0100
Subject: [PATCH] Add 'Accept-Encoding: identity' to request header
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes bug #40819
Reported-by: Noël Köthe n...@debian.org
---
 src/ChangeLog | 7 +++
 src/http.c| 1 +
 2 files changed, 8 insertions(+)

diff --git a/src/ChangeLog b/src/ChangeLog
index a40a5a6..fc5570a 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,10 @@
+2014-11-11  Tim Ruehsen  tim.rueh...@gmx.de
+
+	* http.c (gethttp): Always add Accept-Encoding: identity
+
+	Fixes bug #40819
+	Reported-by: Noël Köthe n...@debian.org
+
 2014-11-10  Tim Ruehsen  tim.rueh...@gmx.de

 	* openssl.c: Fix compile-time check for TLSv1.1 and TLSv1.2
diff --git a/src/http.c b/src/http.c
index 584f4a8..de96e32 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1801,6 +1801,7 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy,
 rel_value);
   SET_USER_AGENT (req);
   request_set_header (req, "Accept", "*/*", rel_none);
+  request_set_header (req, "Accept-Encoding", "identity", rel_none);

   /* Find the username and password for authentication. */
   user = u->user;
--
2.1.3



signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH] Add Accept-Encoding to request header (fixes #40819)

2014-11-11 Thread Darshit Shah

On 11/11, Tim Rühsen wrote:

This one-line patch fixes an RFC 2616 issue (see wget bug #40819 -
https://savannah.gnu.org/bugs/?40819)
.
If Accept-Encoding header is missing, the server may assume *any* type of
encoding (seems stupid to me, but the Squid proxy seems to obey that).

What do you think ?


Looks good. Please push it.

Also, I haven't checked, but does Wget handle a 406 Not Acceptable response 
correctly? It ought to be a fatal error.


P.S.: Unsigned email because GnuPG refused to update without killing the current 
user session



Tim



From 624e171062e30e0c92cdd6c3e970f595d30b1572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim=20R=C3=BChsen?= tim.rueh...@gmx.de
Date: Tue, 11 Nov 2014 16:01:58 +0100
Subject: [PATCH] Add 'Accept-Encoding: identity' to request header
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes bug #40819
Reported-by: Noël Köthe n...@debian.org
---
src/ChangeLog | 7 +++
src/http.c| 1 +
2 files changed, 8 insertions(+)

diff --git a/src/ChangeLog b/src/ChangeLog
index a40a5a6..fc5570a 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,10 @@
+2014-11-11  Tim Ruehsen  tim.rueh...@gmx.de
+
+   * http.c (gethttp): Always add Accept-Encoding: identity
+
+   Fixes bug #40819
+   Reported-by: Noël Köthe n...@debian.org
+
2014-11-10  Tim Ruehsen  tim.rueh...@gmx.de

* openssl.c: Fix compile-time check for TLSv1.1 and TLSv1.2
diff --git a/src/http.c b/src/http.c
index 584f4a8..de96e32 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1801,6 +1801,7 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, 
struct url *proxy,
rel_value);
  SET_USER_AGENT (req);
  request_set_header (req, "Accept", "*/*", rel_none);
+  request_set_header (req, "Accept-Encoding", "identity", rel_none);

  /* Find the username and password for authentication. */
  user = u->user;
--
2.1.3





--- end quoted text ---

--
Thanking You,
Darshit Shah



Re: [Bug-wget] [PATCH] Add Accept-Encoding to request header (fixes #40819)

2014-11-11 Thread Tim Rühsen
On Tuesday, 11 November 2014, at 23:17:06, Darshit Shah wrote:
 On 11/11, Tim Rühsen wrote:
 This one-line patch fixes an RFC 2616 issue (see wget bug #40819 -
 https://savannah.gnu.org/bugs/?40819)
 .
 If Accept-Encoding header is missing, the server may assume *any* type of
 encoding (seems stupid to me, but the Squid proxy seems to obey that).
 
 What do you think ?
 
 Looks good. Please push it.

Thanks, it is pushed.

 Also, I haven't checked, but does Wget handle a 406 Not Acceptable response
 correctly? It ought to be a fatal error.

I didn't check that. Best would be a test case.

 P.S.: Unsigned email because GnuPG refused to update without killing the
 current user session

Hehe, Kmail randomly has the 'Sign' button deactivated.
That's my reason for sending unsigned mails sometimes.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Features request

2014-11-01 Thread Darshit Shah

On 11/01, taras.malivanc...@totaldefense.com wrote:

Hi Darshit Shah ,


Darshit Shah wrote:

Hi Taras,

Thanks for your interest in Wget.

On 10/31, taras.malivanc...@totaldefense.com wrote:

Hi all,

I use wget, I think such features will be useful.Are there 
objections against the below, or something is already implemented 
and I did not find?


1) Maximal size of downloaded file (not all the batch) : do not 
download if value in header present and stop and delete otherwise 
if it is too big. I implemented it in my internal version. This 
prevents downloading huge files when downloading from list and the 
process is not controlled manually.
This might be an interesting option to consider. I'm not aware of 
any option to limit downloads based on a single file's size. 
However, you must remember that not all servers send a 
Content-Length header. In such scenarios, Wget *must* download the 
file before it realizes its actual size. In such a case, what would 
be the expected outcome? Should Wget delete the larger file? Or 
retain it?
It should delete the partial file. The full file must not be downloaded; 
the download will stop when the downloaded data reaches the limit. For example, 
/maxsize 10 while downloading a 100 MB file without a size in the header 
stops at 10 MB and deletes the downloaded chunk.


I'm not too sure if this feature is required. But if you already have it 
implemented, do share it on this list and we may consider merging it into the 
codebase.
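
In the meantime something similar can be scripted around wget (a sketch, not
an existing option; the limit and URL are examples only):

  limit=10485760    # 10 MB
  url=http://example.com/big.bin
  # --spider with -S prints the response headers to stderr without saving the body
  size=$(wget --spider --server-response "$url" 2>&1 |
         tr -d '\r' | awk 'tolower($1) == "content-length:" {print $2; exit}')
  if [ -n "$size" ] && [ "$size" -le "$limit" ]; then
      wget "$url"
  else
      echo "skipped: no Content-Length or file larger than $limit bytes" >&2
  fi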


2) Rename output to file001...bin if the URL produces junk name 
(now unable to download).Additional option -  for any non-7bit 
chars.
Could you please explain this issue further? What kind of junk name 
causes Wget to refuse downloading? A test case would be the best way 
to demonstrate this.
HTTP download with complicated URLs like 
http://someurl/foo.php?x=1?y=2jsdjdjdd..., where wget produces a file name 
that cannot be created. I am not sure; I am not using the latest wget and 
encountered such issues sometimes. In other cases it produces a 
valid, but not readable, name like 
foofdjfdklfdfdfdl;fkfkfdfkfkfkfkbad.bin


I still don't understand. I would suggest you give a concrete example with a 
real URL and the corresponding filename.


It seems like you're on Windows and Wget creates some filename which Windows 
refuses to create. We may only have to rename the output filenames for Windows 
in that case. Do provide a precise example so that we can understand the problem 
better.



3) -N should work with - O ,also for bad output names without -O.
Look at the manual page for Wget, under the --output-document 
section. It explains why -N and -O do not fit well together.


This is also related to bad names. With -N -O, wget can look at the 
timestamp of the file named as written in -O. Please explain why this is 
not OK.
I don't understand what you're asking for. Could you please be more clear on 
your position as to why -N and -O have a use-case when used together?



--- end quoted text ---

--
Thanking You,
Darshit Shah


pgpYqDq0CrsJO.pgp
Description: PGP signature


Re: [Bug-wget] Features request

2014-11-01 Thread taras.malivanc...@totaldefense.com

Darshit Shah wrote:

On 11/01, taras.malivanc...@totaldefense.com wrote:


 a case, what would be the expected outcome? Should Wget delete the 
larger file? Or retain it?
It should delete partial file.The full file must not be downloaded , 
the download will stop when downloaded stuff reaches limit.For 
example ,/maxsize 10 while downloading 100 Mb file without size 
in header - stops at 10 Mb and deletes the downloaded chunk.


I'm not too sure if this feature is required. But if you already 
have it implemented, do share it on this list and we may consider 
merging it into the codebase.
I have it, but in an old build of version 1.9; I will try to compile 
the latest one.
nderstand. I would suggest you give a concrete example with a real URL 
and the corresponding filename.


It seems like you're on Windows and Wget creates some filename which 
Windows refuses to create. We may only have to rename the output 
filenames for Windows in that case. Do provide a precise example so 
that we can understand the problem better.



I will send one as soon as I find one.
 also related to bad names. With -N -O the wget can look at timestamp 
of file named as written in -O. Please explain why this is not OK.
I don't understand what you're asking for. Could you please be more 
clear on your position as to why -N and -O have a use-case when used 
together?


 The use case is keeping a file up to date when the output name provided is bad. 
Either the file cannot be created or working with the bad name is not 
convenient. The process runs automatically and periodically. wget -O file.bin 
-N http://dsjdjfkfkfdfd... either tells that the file is not newer or 
downloads the new version. If the download fails, it keeps the
old file. Then the following batches process the data accordingly.




[Bug-wget] Features request

2014-10-31 Thread taras.malivanc...@totaldefense.com

Hi all,

I use wget, and I think such features would be useful. Are there objections 
to the below, or is something already implemented that I did not find?


1) Maximal size of a downloaded file (not of the whole batch): do not download 
if the value in the header is present and too big, and otherwise stop and 
delete the file once it grows too big. I implemented it in my internal 
version. This prevents downloading huge files when downloading from a list 
and the process is not controlled manually.

2) Rename the output to file001...bin if the URL produces a junk name (which 
currently cannot be downloaded). Additional option: do the same for any name 
with non-7-bit chars.

3) -N should work with -O, also for bad output names without -O.

Regards,Taras.



Re: [Bug-wget] Features request

2014-10-31 Thread Darshit Shah

Hi Taras,

Thanks for your interest in Wget.

On 10/31, taras.malivanc...@totaldefense.com wrote:

Hi all,

I use wget, I think such features will be useful.Are there objections 
against the below, or something is already implemented and I did not 
find?


1) Maximal size of downloaded file (not all the batch) : do not 
download if value in header present and stop and delete otherwise if 
it is too big. I implemented it in my internal version. This prevents 
downloading huge files when downloading from list and the process is 
not controlled manually.
This might be an interesting option to consider. I'm not aware of any option to 
limit downloads based on a single file's size. However, you must remember that 
not all servers send a Content-Length header. In such scenarios, Wget *must* 
download the file before it realizes its actual size. In such a case, what would 
be the expected outcome? Should Wget delete the larger file? Or retain it?


2) Rename output to file001...bin if the URL produces junk name (now 
unable to download).Additional option -  for any non-7bit chars.
Could you please explain this issue further? What kind of junk name causes Wget 
to refuse downloading? A test case would be the best way to demonstrate this.



3) -N should work with - O ,also for bad output names without -O.
Look at the manual page for Wget, under the --output-document section. It 
explains why -N and -O do not fit well together.


--
Thanking You,
Darshit Shah


pgp6KbLOgPzch.pgp
Description: PGP signature


Re: [Bug-wget] Features request

2014-10-31 Thread taras.malivanc...@totaldefense.com

Hi Darshit Shah ,


Darshit Shah wrote:

Hi Taras,

Thanks for your interest in Wget.

On 10/31, taras.malivanc...@totaldefense.com wrote:

Hi all,

I use wget, I think such features will be useful.Are there objections 
against the below, or something is already implemented and I did not 
find?


1) Maximal size of downloaded file (not all the batch) : do not 
download if value in header present and stop and delete otherwise if 
it is too big. I implemented it in my internal version. This prevents 
downloading huge files when downloading from list and the process is 
not controlled manually.
This might be an interesting option to consider. I'm not aware of any 
option to limit downloads based on a single file's size. However, you 
must remember that not all servers send a Content-Length header. In 
such scenarios, Wget *must* download the file before it realizes its 
actual size. In such a case, what would be the expected outcome? 
Should Wget delete the larger file? Or retain it?
 It should delete the partial file. The full file must not be downloaded; 
the download will stop when the downloaded data reaches the limit. For example, 
/maxsize 10 while downloading a 100 MB file without a size in the header 
stops at 10 MB and deletes the downloaded chunk.


2) Rename output to file001...bin if the URL produces junk name (now 
unable to download).Additional option -  for any non-7bit chars.
Could you please explain this issue further? What kind of junk name 
causes Wget to refuse downloading? A test case would be the best way 
to demonstrate this.
HTTP download with complicated URLs like 
http://someurl/foo.php?x=1?y=2jsdjdjdd..., where wget produces a file name 
that cannot be created. I am not sure; I am not using the latest wget and 
encountered such issues sometimes. In other cases it produces a valid, but not 
readable, name like foofdjfdklfdfdfdl;fkfkfdfkfkfkfkbad.bin



3) -N should work with - O ,also for bad output names without -O.
Look at the manual page for Wget, under the --output-document section. 
It explains why -N and -O do not fit well together.


This is also related to bad names. With -N -O, wget can look at the 
timestamp of the file named as written in -O. Please explain why this is 
not OK.





[Bug-wget] request for help

2013-12-16 Thread jennifer stone
Hello,
Greetings
I am a novice who is really interested in contributing to wget. Guidance
and suggestions are required.
Thanking you in anticipation


Re: [Bug-wget] request for help

2013-12-16 Thread Darshit Shah
Hi,

Thank you for your interest in GNU Wget. You can start with the Wget Wiki
at: http://wget.addictivecode.org/
This will give you a basic understanding of the application and will link
you to the source repository and bug tracker too.

GNU Wget uses Git. You should have a fairly working knowledge of Git and
version control systems. Clone the repository first and look at the
bug-tracker to find some bugs that you think you can help fix. Once you're
done with that, use `git format-patch` to generate a patch file and send it
over to this group. Remember to add an entry to the ChangeLog file before
creating the patch.
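
Roughly (a sketch; the anonymous clone URL below is the Savannah one and may
have changed):

  git clone git://git.savannah.gnu.org/wget.git
  cd wget
  # ...fix a bug, add a ChangeLog entry...
  git commit -a
  git format-patch -1     # writes 0001-<subject>.patch, ready to send to this list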

You could start off with simple bugs like: 40746 or 40656. A patch is
tentatively supplied with the former while the latter requires creating a
test case for replicating the results first.

There is also a simple bug in the output of Wget that you could attempt to
resolve. When invoked with -O and an extremely long URL, Wget's output is
faulty. This has been documented in Bug #40908. It should be a trivial
solution.


On Mon, Dec 16, 2013 at 8:20 AM, jennifer stone jenny.stone...@gmail.comwrote:

 Hello,
 Greetings
 I am a novice who is really interested in contributing to wget. guidance
 and suggestions are required
 Thanking you in anticipation




-- 
Thanking You,
Darshit Shah


[Bug-wget] request for help with wget (crawling search results of a website)

2013-11-03 Thread Altug Tekin
Dear mailing List members,

According to the website http://www.gnu.org/software/wget/ it is ok to
write emails with help requests to this mailing list. I have the following
problem:

I am trying to crawl the search results of a news website using *wget*.

The name of the website is *www.voanews.com http://www.voanews.com*.

After typing in my *search keyword* and clicking search on the website, it
proceeds to the results. Then I can specify a *to and a from-date* and
hit search again.

After this the URL becomes:

http://www.voanews.com/search/?st=article&k=mykeyword&df=10%2F01%2F2013&dt=09%2F20%2F2013&ob=dt#article

and the actual content of the results is what i want to download.

To achieve this I created the following wget-command:

wget --reject=js,txt,gif,jpeg,jpg \
 --accept=html \
 --user-agent=My-Browser \
 --recursive --level=2 \
 
www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article

Unfortunately, the crawler doesn't download the search results. It only
gets into the upper link bar, which contains the Home,USA,Africa,Asia,...
links and saves the articles they link to.

*It seems like the crawler doesn't check the search result links at all*.

*What am I doing wrong and how can I modify the wget command to download
the results search list links (and of course the sites they link to) only ?*

Thank you for any help...


Re: [Bug-wget] request for help with wget (crawling search results of a website)

2013-11-03 Thread Dagobert Michelsen
Hi,

Am 03.11.2013 um 09:13 schrieb Altug Tekin altugteki...@gmail.com:
 I am trying to crawl the search results of a news website using *wget*.
 
 The name of the website is *www.voanews.com http://www.voanews.com*.
 
 After typing in my *search keyword* and clicking search on the website, it
 proceeds to the results. Then i can specify a *to and a from-date* and
 hit search again.
 
 After this the URL becomes:
 
 http://www.voanews.com/search/?st=article&k=mykeyword&df=10%2F01%2F2013&dt=09%2F20%2F2013&ob=dt#article
 
 and the actual content of the results is what i want to download.
 
 To achieve this I created the following wget-command:
 
 wget --reject=js,txt,gif,jpeg,jpg \
 --accept=html \
 --user-agent=My-Browser \
 --recursive --level=2 \
 
 www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article
 
 Unfortunately, the crawler doesn't download the search results. It only
 gets into the upper link bar, which contains the Home,USA,Africa,Asia,...
 links and saves the articles they link to.
 
 *It seems like the crawler doesn't check the search result links at all*.
 
 *What am I doing wrong and how can I modify the wget command to download
 the results search list links (and of course the sites they link to) only ?*


You need to inspect the urls of the results and make sure to
only download these. Maybe a --no-parent is enough.


Best regards

  -- Dago


-- 
You don't become great by trying to be great, you become great by wanting to 
do something,
and then doing it so hard that you become great in the process. - xkcd #896



smime.p7s
Description: S/MIME cryptographic signature


Re: [Bug-wget] request for help with wget (crawling search results of a website)

2013-11-03 Thread Tony Lewis
Altug Tekin wrote:

 To achieve this I created the following wget-command:

 wget --reject=js,txt,gif,jpeg,jpg \
  --accept=html \
  --user-agent=My-Browser \
  --recursive --level=2 \

www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F
2013&ob=dt#article

You need to quote the URL since it contains characters that are interpreted
by your command shell. (Most likely nothing after the & was sent to the
web server.)

I think you might run into problems with --accept since the URL does not end
with .html so you might need to delete that argument to get the results
you want.
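
Something along these lines (a sketch, untested against the site):

  wget --user-agent="My-Browser" --recursive --level=2 \
       "http://www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article"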

Tony




Re: [Bug-wget] Help request: Limit recursion, but unconditionally include all media files

2013-10-22 Thread Tim Ruehsen
On Monday 21 October 2013 12:33:10 Alexander Tobias Heinrich wrote:
 For example, I tried:
 wget --tries=3 --retry-connrefused --no-clobber --load-cookies=cookies.txt
 --convert-links --page-requisites --adjust-extension --recursive
 --include-directories /strategy/live-poker,/download
 http://www.pokerstrategy.com/strategy/live-poker
 
 This correctly downloads only the html documents I want and also gets the
 media files from the /download folder, but:
 - does not modify the html so that img-Tags point to the downloaded files
 (however, it does modify a-Tags that link to local html documents)
 - does not get media files from other domains.
 
 If for example I add --span-hosts, it simply gets too much (all documents
 from different language versions of the website that I don't need).
 
 Note: For the example URL I provided here you won't need to log in and thus
 the  load-cookies option can be waived.

Hi Alexander,

please have a look into the 'Recursive Accept/Reject Options' docs.

You could set the domains to be followed by using --domains.
Also --include-directories and/or --exclude-directories might be a help.

I am not sure that you can achieve your goal with a single call to Wget.
Missing files / directories could be downloaded using separate calls to Wget.
--input-file combined with --force-html and/or --base might be a help.
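
As a starting point, something along these lines (purely illustrative; the
extra media host is a placeholder that would have to be taken from the page
source):

  wget --recursive --level=2 --page-requisites --convert-links \
       --adjust-extension --span-hosts \
       --domains=pokerstrategy.com,cdn.example.com \
       --include-directories=/strategy/live-poker,/download \
       http://www.pokerstrategy.com/strategy/live-poker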

Regards, Tim




Re: [Bug-wget] Review Request (Bug 39453)

2013-08-08 Thread Will Dietz
On Thu, Aug 8, 2013 at 3:07 AM, Tim Ruehsen tim.rueh...@gmx.de wrote:
 On Wednesday 07 August 2013 17:37:43 Will Dietz wrote:
 On Wed, Aug 7, 2013 at 2:54 PM, Tim Rühsen tim.rueh...@gmx.de wrote:
  Am Mittwoch, 7. August 2013, 08:24:35 schrieb Will Dietz:
  Hi all,
 
  There's a minor integer error in wget as described in the following bug
  report:
 
  https://savannah.gnu.org/bugs/?39453
 
  Patch is included, please review.
 
  Thanks!
 
  Hi Will,
 
  isn't the real problem a signed/unsigned comparison ?
 
  If remaining_chars becomes negative (due to token is longer or equal to
  line_length), the comparison
 
    if (remaining_chars <= strlen (token))
 
  is false or at least undefined.
 
  If we change it to
 
    if (remaining_chars <= (int) strlen (token))
 
  the function should work.
 
  Using gcc -Wsign-compare warns about such constructs.
 
  Isn't there another bug, when setting
 
  remaining_chars = line_length - TABULATION;
 
  ?
 
  line_length might already be without TABULATION:
    if (line_length <= 0)
 
  line_length = MAX_CHARS_PER_LINE - TABULATION;
 
  Regards, Tim

 Thanks for the response!

 Yes, this is a signed/unsigned comparison error at its core.  In my
 proposed patch I chose to avoid letting 'remaining_chars' go negative
 in the first place in order to correctly handle tokens that required
 the full size_t to represent their length.  That said your suggested
 change is simpler and would also address the comparison issue.  This
 might be the way to go since such long tokens are at best very
 unlikely to occur if not impossible due to memory limits.

 As for the second bug I'm not sure as the code would still print N
 characters for the first line and wrapped lines would be indented and
 contain TABULATION fewer characters before wrapping, which seems
 correct.  Whether or not 'MAX_CHARS_PER_LINE - TABULATION' is the
 correct value for N when line_length is zero or negative is not
 something I can comment on, but I see no reason to assume it is
 incorrect either.

 Does this make sense?

 The first line has no tabulation but a prefix. So, the length of 'prefix'
 should be taken into account instead of TABULATION.

 The patch to handle both issues (signed/unsigned comparison and prefix length)
 IMHO should be:

 diff --git a/src/main.c b/src/main.c
 index 8ce0eb3..869e5db 100644
 --- a/src/main.c
 +++ b/src/main.c
 @@ -844,11 +844,11 @@ format_and_print_line (const char *prefix, const char *line,
    line_dup = xstrdup (line);
 
    if (line_length <= 0)
 -    line_length = MAX_CHARS_PER_LINE - TABULATION;
 +    line_length = MAX_CHARS_PER_LINE;
 
    if (printf ("%s", prefix) < 0)
      return -1;
 -  remaining_chars = line_length;
 +  remaining_chars = line_length - strlen(prefix);
    /* We break on spaces. */
    token = strtok (line_dup, " ");
    while (token != NULL)
 @@ -856,7 +856,7 @@ format_and_print_line (const char *prefix, const char *line,
        /* If however a token is much larger than the maximum
           line length, all bets are off and we simply print the
           token on the next line. */
 -      if (remaining_chars <= strlen (token))
 +      if (remaining_chars <= (int) strlen (token))
         {
           if (printf ("\n%*c", TABULATION, ' ') < 0)
             return -1;

 Do you agree ?

 Regards, Tim


Expanding the scope of the fix (I originally was only attempting to
address the comparison bug), my latest suggested patch is attached,
with the following highlights

* Fix comparison bug
* No tabulation vs prefix-length bug (the issue you mention above,
that could cause wrapping at the wrong point).
* Avoid using strlen(prefix) for computing remaining characters (this
is important to ensure proper behavior on different locales such as
ja_JP.utf8).
* (Stylistic) Ensure consistent alignment by placing first line of
text on 'second' line, indented.  This matches the style used for
printing information about wgetrc and also makes reading the wrapped
lines easier.
* Replace dead code considering non-positive line_length with assert

Thoughts?

Thanks!

~Will


0001-format_and_print_line-Fix-bugs-when-wrapping-improve.patch
Description: Binary data


[Bug-wget] Review Request (Bug 39453)

2013-08-07 Thread Will Dietz
Hi all,

There's a minor integer error in wget as described in the following bug report:

https://savannah.gnu.org/bugs/?39453

Patch is included, please review.

Thanks!



Re: [Bug-wget] Review Request (Bug 39453)

2013-08-07 Thread Tim Rühsen
On Wednesday, 7 August 2013, at 08:24:35, Will Dietz wrote:
 Hi all,
 
 There's a minor integer error in wget as described in the following bug
 report:
 
 https://savannah.gnu.org/bugs/?39453
 
 Patch is included, please review.
 
 Thanks!

Hi Will,

isn't the real problem a signed/unsigned comparison ?

If remaining_chars becomes negative (because the token is longer than or equal 
in length to line_length), the comparison
  if (remaining_chars <= strlen (token))
is false or at least undefined.

If we change it to
  if (remaining_chars <= (int) strlen (token))
the function should work.

Using gcc -Wsign-compare warns about such constructs.

Isn't there another bug, when setting
remaining_chars = line_length - TABULATION;
?
line_length might already be without TABULATION:
  if (line_length <= 0)
    line_length = MAX_CHARS_PER_LINE - TABULATION;

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Review Request (Bug 39453)

2013-08-07 Thread Will Dietz
On Wed, Aug 7, 2013 at 2:54 PM, Tim Rühsen tim.rueh...@gmx.de wrote:
 On Wednesday, 7 August 2013, at 08:24:35, Will Dietz wrote:
 Hi all,

 There's a minor integer error in wget as described in the following bug
 report:

 https://savannah.gnu.org/bugs/?39453

 Patch is included, please review.

 Thanks!

 Hi Will,

 isn't the real problem a signed/unsigned comparison ?

 If remaining_chars becomes negative (due to token is longer or equal to
 line_length), the comparison
    if (remaining_chars <= strlen (token))
 is false or at least undefined.

 If we change it to
    if (remaining_chars <= (int) strlen (token))
 the function should work.

 Using gcc -Wsign-compare warns about such constructs.

 Isn't there another bug, when setting
 remaining_chars = line_length - TABULATION;
 ?
 line_length might already be without TABULATION:
    if (line_length <= 0)
      line_length = MAX_CHARS_PER_LINE - TABULATION;

 Regards, Tim

Thanks for the response!

Yes, this is a signed/unsigned comparison error at its core.  In my
proposed patch I chose to avoid letting 'remaining_chars' go negative
in the first place in order to correctly handle tokens that required
the full size_t to represent their length.  That said your suggested
change is simpler and would also address the comparison issue.  This
might be the way to go since such long tokens are at best very
unlikely to occur if not impossible due to memory limits.

As for the second bug I'm not sure as the code would still print N
characters for the first line and wrapped lines would be indented and
contain TABULATION fewer characters before wrapping, which seems
correct.  Whether or not 'MAX_CHARS_PER_LINE - TABULATION' is the
correct value for N when line_length is zero or negative is not
something I can comment on, but I see no reason to assume it is
incorrect either.

Does this make sense?

Thanks,

~Will



[Bug-wget] Feature request

2012-12-22 Thread CCC DDD
Could there be an option to convert links in memory at the time of downloading 
the file, rather than on disk after all the files are downloaded? This would 
enable piping a recursive retrieval directly to a .tar archive

 Thanks


Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-08 Thread Petr Pisar
On Sat, Jul 07, 2012 at 01:25:49PM -0600, Daniel Kahn Gillmor wrote:
 On 07/07/2012 12:50 PM, Ángel González wrote:
  On 06/07/12 01:01, pro...@secure-mail.biz wrote:
  Because SSL CA's have failed many times (Comodo, DigiNotar, ...) I wish to 
  have an option to pin a SSL certificate. The fingerprint may be optionally 
  provided through a new option.
  Have you tried using --ca-certificate option?
 
 I believe the OP wants to pin the certificate of the remote server (that
 is, the end entity certificate), whereas --ca-certificate pins the
 certificate of the issuing authority.
 
Indeed? I thought the --ca-certificate option just makes the certificate trusted,
so declaring the server certificate using --ca-certificate could be enough.

Though there can be a problem with HTTP redirects, and of course some picky TLS
libraries can insist on the CA=true X.509 attribute. Also, some TLS implementations
check the server hostname against the certificate name. 

So if the TLS library cannot be cheated with the --ca-certificate option,
overriding the root of trust in another way is a good idea.

I'm just a little worried about the digest algorithm. One can claim that MD5 is
too weak; there have been real, successful attacks exploiting MD5 collisions
of the signed object in an X.509 certificate. The ability to specify a
different algorithm is necessary.
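
To make the fingerprint idea concrete, a rough sketch (OpenSSL-flavoured, not
wget code; the file path is made up) of computing a pinnable SHA-256
fingerprint could look like this:

  /* Sketch only: print the SHA-256 fingerprint of a PEM certificate, the
     kind of value a pinning option would compare against. */
  #include <stdio.h>
  #include <openssl/evp.h>
  #include <openssl/pem.h>
  #include <openssl/x509.h>

  int main (void)
  {
    FILE *fp = fopen ("/tmp/example.cert", "r");   /* made-up path */
    if (!fp)
      return 1;
    X509 *cert = PEM_read_X509 (fp, NULL, NULL, NULL);
    fclose (fp);
    if (!cert)
      return 1;

    unsigned char md[EVP_MAX_MD_SIZE];
    unsigned int len = 0;
    /* Any digest the library knows works here, not just MD5. */
    if (!X509_digest (cert, EVP_sha256 (), md, &len))
      return 1;

    for (unsigned int i = 0; i < len; i++)
      printf ("%02X%s", md[i], i + 1 < len ? ":" : "\n");

    X509_free (cert);
    return 0;
  }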

Also remember the HTTP redirect scenario where you need to verify two (or
more) servers. It's necessary to be able to supply more pinning options.

Maybe an option pinning a certificate to a hostname would be the best choice. No
hashes; just supply the peer certificate content, like with --ca-certificate.
E.g. `--peer-certificate example.com:/tmp/example.cert'.

-- Petr


pgpsNRlttrc9I.pgp
Description: PGP signature


Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-07 Thread Daniel Kahn Gillmor
On 07/07/2012 12:50 PM, Ángel González wrote:
 On 06/07/12 01:01, pro...@secure-mail.biz wrote:
 Because SSL CA's have failed many times (Comodo, DigiNotar, ...) I wish to 
 have an option to pin a SSL certificate. The fingerprint may be optionally 
 provided through a new option.
 Have you tried using --ca-certificate option?

I believe the OP wants to pin the certificate of the remote server (that
is, the end entity certificate), whereas --ca-certificate pins the
certificate of the issuing authority.

--dkg



signature.asc
Description: OpenPGP digital signature


Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-07 Thread proper
d...@fifthhorseman.net wrote:
 On 07/07/2012 12:50 PM, Ángel González wrote:
  On 06/07/12 01:01, pro...@secure-mail.biz wrote:
  Because SSL CA's have failed many times (Comodo, DigiNotar, ...)
 I wish to have an option to pin a SSL certificate. The fingerprint may be
 optionally provided through a new option.
  Have you tried using --ca-certificate option?

 I believe the OP wants to pin the certificate of the remote server (that

 is, the end entity certificate), whereas --ca-certificate pins the
 certificate of the issuing authority.

Yes, that's what I actually wanted to say. Thanks for clarifying.

Cheers,
proper

__
powered by Secure-Mail.biz - anonymous and secure e-mail accounts.




Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-07 Thread Dagobert Michelsen
Hi,

I have a tiny comment from a downstream packager standpoint: it would be nice
if the capath were configurable at configure time instead of hardcoded to
/etc/ssl/certs as it is now - we e.g. use /etc/opt/csw/ssl/certs and need
to run perl -pi over the unpacked sources. Not a real problem, but also not
the most elegant solution.


Best regards

  -- Dago

-- 
You don't become great by trying to be great, you become great by wanting to 
do something,
and then doing it so hard that you become great in the process. - xkcd #896




Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-07 Thread Daniel Kahn Gillmor
On 07/07/2012 02:20 PM, Dagobert Michelsen wrote:
 I have a tiny comment from a downstream packager standpoint: It would be nice 
 if the
 capath would be configurable during configure time instead of hardcoding it
 to /etc/ssl/certs as it is now - we e.g. use /etc/opt/csw/ssl/certs and need
 to perl-pi in the unpacked sources. Not a real problem, but also not the most
 elegant solution.

fwiw, I agree with this, and suspect that a patch wouldn't be hard to
come up with (and would be fairly non-controversial).

If you're building against GnuTLS, look around line 88 of gnutls.c,
because i don't think GnuTLS embeds a default location for a trusted
root certificate store.

If you're building against OpenSSL, i think you might want to change
your OpenSSL configuration directly (at least on debian, libcrypto seems
to hardcode a default path to /usr/lib/ssl/certs, which is a symlink to
/etc/ssl/certs).
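
For the GnuTLS side, a compile-time default along these lines would probably
be enough (a sketch only, not the actual gnutls.c; WGET_CA_DIRECTORY is a
made-up macro name that configure could pass in via CPPFLAGS):

  /* Make the trust directory a compile-time default instead of a
     hardcoded string literal. */
  #include <stdio.h>

  #ifndef WGET_CA_DIRECTORY
  # define WGET_CA_DIRECTORY "/etc/ssl/certs"
  #endif

  int main (void)
  {
    /* A packager could build with e.g.
       CPPFLAGS='-DWGET_CA_DIRECTORY="/etc/opt/csw/ssl/certs"'
       and the certificate-scanning code would then use this value
       instead of the literal "/etc/ssl/certs". */
    printf ("default trust directory: %s\n", WGET_CA_DIRECTORY);
    return 0;
  }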

hth,

--dkg



signature.asc
Description: OpenPGP digital signature


[Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-06 Thread proper
Because SSL CAs have failed many times (Comodo, DigiNotar, ...) I wish to have 
an option to pin an SSL certificate. The fingerprint may be optionally provided 
through a new option.

__
powered by Secure-Mail.biz - anonymous and secure e-mail accounts.




[Bug-wget] extra HEAD request for dealing with redirects (302)

2012-03-22 Thread Laurent C
Hi,

Thanks a lot for writting wget, it's been very helpful!

I'm using wget to recursively retrieve certain types of documents
using the accept list (-A).  To know whether or not to retrieve a file,
wget looks at the extension of the file contained in the link (which
makes sense). However, if that link turns out to be a redirection (an
HTTP 302) to a document that does match the accept list, then I would
like to retrieve it. I understand that wget cannot know about this
unless it makes an extra HTTP request.

E.g., I'm looking for .ps files (wget -r -A ps URL). I get a link
like: serve_doc.cgi?file=foo.ps. serve_doc.cgi does not match the -A
option, but that link is really a redirect to a link with a ps
extension.

A simple solution would be to add serve_doc.cgi to my -A list. However
in my application I don't know of such filenames before I first
encounter them.

Currently is there a way to ask wget to retrieve such files? I
searched around and could not find much about this.

I could think of 2 options that would work in this case:
1) Outside of wget: look at the debug output of wget and manually
check the header of links that wget reject based on not matching the
accept list.
2) Add some code in wget that would do this. Sounds like it would be
an extra HEAD request per link that gets rejected based on not
matching the accept list.
3) Any other suggestions?

For 2), if I understood the code base and coded it up, is this
something that could be useful? Any tips or ideas on how to do it (or
not do it)?
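
To make 2) a bit more concrete, the kind of check I picture looks roughly
like this (a sketch only, not actual wget code - wget's real accept matching
is more general than a plain suffix test):

  /* Idea: when a link like serve_doc.cgi is about to be rejected, issue a
     HEAD request, read the Location header of the 302 (e.g. ".../foo.ps"),
     and run this test on the redirect target before giving up. */
  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>

  static bool
  matches_accept_list (const char *name, const char **suffixes, size_t n)
  {
    size_t len = strlen (name);
    for (size_t i = 0; i < n; i++)
      {
        size_t slen = strlen (suffixes[i]);
        if (len >= slen && strcmp (name + len - slen, suffixes[i]) == 0)
          return true;
      }
    return false;
  }

  int main (void)
  {
    const char *accepts[] = { ".ps" };
    printf ("%d\n", matches_accept_list ("serve_doc.cgi", accepts, 1)); /* 0 */
    printf ("%d\n", matches_accept_list ("foo.ps", accepts, 1));        /* 1 */
    return 0;
  }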

Thanks, Laurent



[Bug-wget] Feature request/suggestion: option to pre-allocate space for files

2012-01-24 Thread markk
Hi,

This post is to suggest a new feature for wget: an option to pre-allocate
disk space for downloaded files. (Maybe have a --pre-allocate command-line
option?)

The ability to pre-allocate space for files would be useful for a couple
of reasons:

- By pre-allocating all space before downloading, the risk of exiting due
to a disk-full error is avoided. When downloading from a server which
doesn't support resuming downloads, an accidental disk full condition
means you have to re-download the whole file after freeing up some disk
space. That wastes a lot of time and network bandwidth.

- Disk fragmentation can be reduced. Downloading large files can take many
hours. While wget is downloading, much other disk activity can be caused
by other programs (web browser cache, email client, etc.). The result is
that the wget output file can end up unnecessarily fragmented. And likewise,
files written by other programs while wget is running end up more
fragmented.

On Linux, fallocate() and posix_fallocate() can be used to pre-allocate
space. The advantage of fallocate() is that, by using the
FALLOC_FL_KEEP_SIZE flag, space is allocated but the apparent file size is
unchanged. That means resuming with --continue works as normal.
posix_fallocate(), on the other hand, sets the file length to its full
size, meaning that --continue won't work unless there were some way to
specify the byte offset that wget should continue from.
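
To illustrate, a minimal sketch of such a pre-allocation helper
(Linux-flavoured, not tested against wget's source; the function name is
made up) could be:

  /* Reserve `length` bytes for fd.  With FALLOC_FL_KEEP_SIZE the apparent
     file size is unchanged, so --continue still sees only what has really
     been downloaded.  Falls back to posix_fallocate(), which does extend
     the visible size. */
  #define _GNU_SOURCE
  #include <errno.h>
  #include <fcntl.h>

  static int
  preallocate (int fd, off_t length)
  {
  #ifdef FALLOC_FL_KEEP_SIZE
    if (fallocate (fd, FALLOC_FL_KEEP_SIZE, 0, length) == 0)
      return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)
      return -1;
  #endif
    /* posix_fallocate returns 0 on success or an error number. */
    return posix_fallocate (fd, 0, length) == 0 ? 0 : -1;
  }

wget would call something like this right after creating the output file,
once the Content-Length (or the user-supplied size) is known.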

The fallocate program (see man 1 fallocate) can be used to manually
pre-allocate space. For a single file that's a slight hassle but simple
enough. (Run wget to determine file length, break, use fallocate to
allocate space, then re-run wget.) But when using wget to download many
files in one session it's not really practical.

Of course, if the web server does not report the file size, it won't be
possible to pre-allocate space. Or would it...? Suppose the user is
downloading some CD ISO images from a server which does not report file
lengths. If the user could tell wget to pre-allocate 800MB for each file,
and then have wget call ftruncate() when each file has finished
downloading, that should achieve a result almost as good as if the server
did report file lengths.


-- Mark





[Bug-wget] progressive download feature request

2011-12-27 Thread Michal Tausk

Dear developers of wget,

If you find some free time, would you, please, implement a feature that 
would progressively download a file that is growing on the remote site?


Though it might be obvious, let me explain what I mean.

In the real world, some files might grow on the remote filesystem, e.g. files 
that are being dumped, streams, an ongoing download of a larger file, etc. If 
we want to get these, currently we need to run something like this 
several times consecutively:


wget -c http://www.mydomain.com/file.asf

...until the file has stopped growing (e.g. a certain TV stream has been dumped).

The --ignore-length option is (logically) not taken into consideration when 
using --continue, as wget needs to compare the size of the downloaded file 
with the size of the remote file. However, the file is downloaded 
only up to the Content-Length it retrieved on the last invocation of 
wget -c.


Thanks and all the best in 2012!

Michal







Re: [Bug-wget] progressive download feature request

2011-12-27 Thread Perry Smith

On Dec 27, 2011, at 4:36 PM, Keisial wrote:

 On 27/12/11 17:05, Michal Tausk wrote:
 The --ignore-length is not taken into consideration, (logically)
 when using --continue as it needs to count the difference in size
 between the downloaded file and the remote file. However, the file is
 downloaded only up to the content-length it retrieved on the last
 invocation of wget -c.
 It doesn't need to use Content-Length with -c, and in fact -c doesn't
 seem to inhibit ignore-length, so it should work.
 Are you sure it's not the server sending you only up to the length that
 was available when you requested the file?

If you can put Wireshark on it, check to see which FIN comes over first.  I 
bet Keisial is right.  I bet the server is telling wget "I'm done" by sending 
the FIN.

Hope this helps
pedz




Re: [Bug-wget] progressive download feature request

2011-12-27 Thread Keisial
Michal Tausk wrote:
 If you can put wireshark on it, check to see which FIN comes over first.  I 
 bet Keisial is right.  I bet the server is telling wget I'm done by sending 
 the FIN.

 Hope this helps
 pedz

 --

 I can try that, but you are both probably right. Even though, it should be 
 doable from the client's side(?). Server can be Apache, but might be some 
 other that does not support this progressive feature (I'm not sure which one 
 does). So it can still be pretty nice feature. What do you think?

 Michal
I think you're confusing something. How do you expect the client to do that?
(other than looping, checking whether the file has grown, and hoping not to
query the server faster than the file grows)

It may be easier to trick the server into not adding the Content-Length and
sending everything readable at that point (e.g. with a CGI).
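
For example, a toy CGI along these lines would do it (a sketch, not
production code; the path is made up):

  /* Stream whatever is currently readable in the file, without a
     Content-Length header, so the client reads until the server closes
     the connection. */
  #include <stdio.h>

  int main (void)
  {
    const char *path = "/var/tmp/growing.asf";   /* made-up path */
    printf ("Content-Type: application/octet-stream\r\n\r\n");

    FILE *in = fopen (path, "rb");
    if (!in)
      return 1;

    char buf[8192];
    size_t n;
    while ((n = fread (buf, 1, sizeof buf, in)) > 0)
      fwrite (buf, 1, n, stdout);

    fclose (in);
    return 0;
  }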




[Bug-wget] Feature Request: --progress enhancements

2011-12-18 Thread wesley
When --progress is set, it would be nice if, at the completion of the 
transfer, the transfer rate were recalculated as 
(filesize / total_download_time) rather than frozen at whatever it was 
when the transfer finished, similar to what rsync does. If someone 
is writing this information to a log, having the *actual* throughput of 
the given file would be more informative than an instantaneous rate of 
change at the time the file finished.


A further enhancement to the --progress option would be to present the 
(queue count / maxcount) as part of the progress display. After all, if 
a user sets --progress it is ostensibly to receive feedback on what wget 
is doing. At the moment I believe only the debug output presents this 
information, and unless you write some helper code to parse it out and 
present it more neatly, it can be very tedious to read as the output 
scrolls by. Having it visible in the progress line would make checking 
it much simpler and, given that it is the only indication of wget's 
global progress, seems like an appropriate thing to include.




  1   2   >