orley [mailto:wor...@alum.mit.edu]
Sent: Tuesday, November 20, 2018 7:30 AM
To: Tony Lewis
Cc: bug-wget@gnu.org
Subject: Re: [Bug-wget] Read error (Success)?
"Tony Lewis" writes:
> I'm getting the following error and don't understand what it's trying to
> tell me:
I'm getting the following error and don't understand what it's trying to
tell me:
Read error at byte 97430 (Success)
What could the server be doing to cause wget to report an error with the
details being "Success"?
For what it's worth, the page in question is coming from WordPress and the
It's nice to know that a feature I first proposed in 2002 and submitted
patches for in 2003 and again in 2005 has finally made it into wget.
Tony
-Original Message-
From: Tony Lewis [mailto:tle...@exelana.com]
Sent: Tuesday, September 30, 2003 5:44 PM
To: wget-patc...@sunsite.dk
Su
Ángel González replied to Юрий Фролов:
> > 1. This feature vget?
> > 2. Vget can not work with ntlm?
> wget does support ntlm authentication.
While it's true that wget supports an older version of NTLM, most corporate
environments require a more recent version.
Tony
On Friday, June 05, 2015 1:24 PM, Tim Rühsen wrote:
> > First, I have not dug into the source code to see how -H is implemented.
> > However, it makes sense to me that one ought to be able to specify
> > both -H and -D together.
> -H (=all domains)
> to exclude some sites use --exclude-domains do
On June 03, 2015 Tim Ruehsen wrote:
> This has already been fixed to:
>
> "Set domains to be followed. domain-list is a comma-separated list of
> domains. Note that it does not turn on -H."
First, I have not dug into the source code to see how -H is implemented.
However, it makes sense to me
On Tuesday, May 05, 2015 4:05 PM Ángel González wrote:
> That's an interesting history-digging question!
> (yes, this mailing list address is the appropriate place)
> The commit that added it (2c41d783c) says:
> Submitted by Alan Eldridge in
> <200111042106.fa4l63b75...@wwweasel.geeksrus.net>.
Alfred M. Szmidt wrote:
> wget -o /dev/null -O - "$@"
By the way, you can accomplish the same thing by:
wget -q -O - "$@"
Darshit Shah wrote:
> On the other hand, providing this configure option and making a note in the
> NEWS section, I am hoping that more maintainers will use the
> --disable-assert flag and remove assertions from the compiled code.
Would it make more sense to have an --enable-assert flag to
pedro lomas wrote:
> I'm trying to download a complete copy of a personal blog that is not
> public and therefore requires authentication blogger, by username and
> password.
>
> I used the command:
>
> -http-user=correoelectron...@gmail.com wget -http-password = password
> --mirror http://nombred
Tim Rühsen wrote:
> I guess with 'work from the distribution' you are talking about a tarball
> (created by 'make dist').
I suppose so -- since I haven't learned git yet, I didn't even know how you
created the tarball.
Tony
Giuseppe Scrivano wrote:
> changelogs are crap, not useful to anyone, that is my opinion.
That assumes that everyone interested in the wget source is actively using git.
However, many people work from the distribution and the only documentation they
receive is in the changelog.
I have no objec
Darshit Shah wrote:
> > how would you programmatically retrieve these links? Triggering
> > "onload" or other events? I wonder how many of these occurrences we
> > can cover by simply trying to parse cases like document.location='foo'
> > without involving any JS engine.
> >
> I think the only
Darshit Shah wrote:
> In case both the --config and --no-config commands are issued, the one that
> appears first on the command will be considered and the other ignored.
Given my memory of the way the parsing loop works, I would expect that it
would use the last one that appears. How do GNU comm
Darshit Shah wrote:
> This option when enabled will compile Wget with certain flags which will
> spew a lot of output to the screen on runtime in case it encounters
> potential race conditions. Hence, this option should NEVER be used on
> production systems.
It is sometimes necessary to debug iss
Altug Tekin wrote:
> To achieve this I created the following wget-command:
>
> wget --reject=js,txt,gif,jpeg,jpg \
> --accept=html \
> --user-agent=My-Browser \
> --recursive --level=2 \
>
www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#articl
Tim Ruehsen wrote:
> Wget should have taken the URL 'teståäöÅÄÖ' as ISO-8859-1 and convert it
> into UTF-8, which would fail to download.
Neither Firefox nor Internet Explorer can navigate that link. Both fail
trying to retrieve teståäöÃ ÃÃ.
I concur with Tim that this behavior of wget is acc
Tim Rühsen wrote:
> I saw so far
[snip]
> 215 UNIX MultiNet Unix Emulation V5.3(93) -> LIST, Unix output format
There appears to be a version number at the end of this string. Is that
significant in detecting the type of system or should the system detection
code ignore the version number?
Also,
The last time I updated my patch was for version 1.11.4.
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Giuseppe
Scrivano
Sent: Sunday, August 25, 2013 11:39 AM
To: Tony Lewis
Cc: bug-wget@gnu.org
Ángel González wrote:
> wget doesn't have such option (it's done internally).
I have developed an enhancement to wget to do exactly what Bin Wang has
requested. The proposed command line option for this enhancement is
--unfollowed-links used as follows:
wget http://some.site.com/ --unfollowed-li
NTLM has evolved over the years. The last time I looked at the code, wget
only supported NTLM message types 1 and 2. It very well may be that the
server you are connecting to requires type 3 messages.
In 2009, I did a major refactoring of http_ntlm.c (against the wget 1.11.4
sources) to support NT
Hrvoje Niksic wrote:
> Another possibility would be to add some more syntax to --method so as to
> allow --method=POST:file=foo, and also --method=PUT:data=abc
Since '=' is a valid (and frequent) delimiter for what follows "data="
perhaps another pair of delimiters would be better. How about
gary speaks wrote:
> I need to be able to download just the data from the page into a text file
> to parse later with a different program. I've used this syntax with other
> sites and it works perfectly, but I've run into a problem with one web site.
>
> Here's the site and the syntax I'm using:
> WGET.
Alex wrote:
> It is possible to add options something like "--local-filesystem-encoding"
> to convert filenames to given encoding? Or what is better way to do (now
> parse log to get URI-encoded name, then decode it to CP866 and rename
> files.
> Bat-file http://dl.dropbox.com/u/27457022/rus_site.z
When I tried to build wget 1.14 on Mac OS X I found that I was missing some
things that are necessary to build wget:
- GNU Transport Layer Security Library, which needs
- XZ Utils (to extract the archive) and Nettle version 2.4, which needs
- GNU Multiple Precision Arithmetic Library
I could
I think you meant:
** Accept the arguments --accept-regex and --reject-regex.
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Giuseppe
Scrivano
Sent: Monday, August 06, 2012 12:21 PM
To: info-...@gnu.or
loc wrote:
> I can't seem to get wget --post-data to work with any site, I've tried a
> few more and none work.
>
. . . . . . .
...
> Below is one I'm trying as a test and if I load the page in Firefox and
> just put "test" in for the Firs
You're looking for:
--page-requisites      get all images, etc. needed to display HTML page.
wget URL --page-requisites
should give you what you need.
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Garry
S
It seems to me that wget should not reuse a connection from one host to
access another (even if those hosts share an IP address). I suspect the
current behavior is accidental rather than intentional.
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-b
Heino Pedersen writes:
> Wget does not accept SAN when checking the certificate.
>
>
> host@host:~$ wget -O /dev/null "https://domain.com";
> --2012-01-17 00:33:58-- https://domain.com Løser kypto.dk...
> 62.141.46.191 Connecting to domain.com|127.0.10.10|:443... forbundet.
> ERROR: certificat
If the remote web page is using forms, you will need to use either the
--post-data or --post-file option. For example:
wget http://www.yoursite.com/query.aspx --post-data="q=SQL_QUERY"
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana
Vikram Narayanan wrote:
> Isn't it a waste of bandwidth?
> Is it not possible to check only the PDF files without downloading the
> whole content?
wget has to download HTML content in order to discover where the PDF files
are located,
AFAIK, my NTLM changes were never integrated with wget because we didn't have a
good mechanism to test them against various configurations. At this point, I
don't have access to any NTLM server so I can no longer test them at all.
Tony
-Original Message-
From: bug-wget-bounces+wget=exela
Micah Cowan wrote:
> (I'd be interested in knowing whether folks actually have legal
> obligations to respect TOS to an unrestricted-access site like that... I
> imagine it might even vary by location)
What terms of service? I didn't see any terms of service (perhaps because I
didn't look for t
Ángel González wrote:
> Maybe not. Consider a url like:
> http://www.example.net/download.php?file=releases/wget.exe
>
> In that case using as filename wget.exe makes more sense than
> download.php@file=releases%2Fwget.exe
> Whereas there are other cases where the basename is preferible.
> Probabl
The wget developers do not handle translation fixes; translations are managed
entirely by the Spanish translation team at translationproject.org. The
translation project asks package maintainers not to make any changes to the
translations it provides, even when it is
Patrick, while you're waiting for someone to improve the output of wget to
provide the information you want, the --debug output will tell you which page
generated the reference to the broken link. Look for:
---request begin---
GET /path/to/broken_link.html HTTP/1.0
Referer: http://mysite.com/pat
It works as I would expect in 1.11.4, with the exception of downloading this
file:
sourceforge.net/projects/biblatex-biber/files/index.html
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Micah Cowan
Se
It sounds like you were depending on a bug in version 1.9, which has since been
fixed. If you ask wget to retrieve *_20110119_File.csv and put the results into
File.csv then it's going to do its best to accomplish that task.
You might want to remove the --output-document and use a script that ru
Daum wrote:
> wget --span-hosts --convert-links --page-requisites
> http://boston.craigslist.org/gbs/abo/2028061114.html doesn't seem to be
> bringing down the images.
When I run that command, three directories are created:
- boston.craigslist.org
- images.craigslist.org
- www.craigslist.org
Micah Cowan wrote:
> Not sure the exit status is the best place to do that, though. Both
> cases really are success cases, and should be treated as such.
I consider "success" to be "wget downloaded the files on the command line" (and
possibly related files, e.g., --mirror). Even if it didn't
Kamenik, Aleksander wrote:
> So wget -N always retuns 0 on success (tested 1.12 too). Success is both
> downloading a file and not downloading it due to current timestamp.
I think it would be useful for wget to distinguish between these two use cases.
I would be happy if -N without a download al
Kamenik, Aleksander wrote:
> I periodically download a file with "wget -N someurl". I want to execute some
> external program to process the downloaded file, but only when wget actually
> downloaded a new version of it.
[snip]
> Is there a better way, something like find's --exec?
I don't have a
Leonard Ehrenfried wrote:
> in order to get familiar with the wget codebase I have grepped the code and
> found a TODO in convert.c
Searching the source code for "TODO" is probably not the best way to find
things to work on. Take a look at http://wget.addictivecode.org/HelpingWithWget
Tony
oh...@cox.net wrote:
> I have multiple environments with both Win2K and Win2K3 ADs and various
> Win2K and Win2K3 servers as domain members, etc., and I have control of all
> of them, plus Linux machines of various flavors, so I think I can help a
> little with that.
Great, I will put something t
oh...@cox.net wrote:
> So, it seems like the problem is that wget may be doing only NTLM?
The current version of wget only supports NTLM authentication. Specifically, it
sends the following flags to the server:
NEGOTIATE_OEM (0x0002)
NEGOTIATE_NTLM_KEY (0x0200)
> Does anyone know
Rahul Prasad wrote:
> I want to implement a batch download feature.
[snip]
> Plz tell me where show I add
> if(opt.batch) {
In main.c you will find the following line:
url = alloca_array (char *, nurl + 1);
You will need to change that to something like:
if (opt.batch)
{
process_ba
Rahul Prasad wrote:
> I cant understand the flow of control. If you provide me with a quick
> tutorial on how to solve the problem that I have given, I can quick start
> coding instead of spending time in analyzing 1000 lines of code.
The tutorial shows you how to add a command-line option and a
Avinash wrote:
> However, the program seems to want to waste a ton of bandwidth in
> downloading EVERYTHING linked from the website you feed it, and then as it
> finishes downloading it, it removes it if it's not an mp3. Is there any way
> to tell it, "seriously, if it doesn't end in .mp3, don't even
> Other folks were working on that too. Tony Lewis had something working,
> can't remember if we put it in, or if we were waiting on more thorough
> testing work (to verify it still worked properly with NTLMv2). But I
> wouldn't have held up a release for that, since not having NTLMv2
v...@mage.me.uk wrote:
> The recovery of wget crossed my mind too but I presumed in such an
> instance the user would install their own copy of wget or speak to the
> administrator. Perhaps asking the user how to proceed if the system
> wgetrc parse failed would suitable. e.g
>
> "Parsing system w
Micah Cowan wrote:
> For some value of "quickly". This obviously necessitates extra
> round-trips to the server. Can still be useful, but still perhaps not as
> useful as doing URL-matching properly.
I would prefer an extra round trip to avoid downloading a 2GB file that will
immediately be ignor
Micah Cowan wrote:
> Yeah, that was the original thinking. But I still hate it. For one
> thing, there are no longer any guarantees that recurse-able HTML files
> end in ".html"
There are a bunch of suffixes that are actively used for HTML plus there is
no reason that one has to include a suffix
Guillaume Turri wrote:
> In fact, why is this option treated after a download?
When mirroring, all HTML files have to be downloaded (whether or not you
ultimately want to keep them) in order to find all the interesting files. For
example:
wget http://www.somesite.com/index.html --mi
Micah Cowan wrote:
> > It may not have been a typo, but it doesn't parse in English.
>
> Sure it does. It's grammatically correct English, and I've seen it used
> in various places (though somewhat rarely).
Apparently "which see" is used as a literal translation of the Latin "quod
vide". In the p
Hrvoje Niksic wrote:
> > waiting interval specified by this function is influenced by
> > -@code{--random-wait}, which see.
> > +@code{--random-wait}.
>
> "Which see" is no typo, it suggests to the user to also look at the
> documentation of "--random-wait" command-line option.
It may not
> Voytek Eymont wrote:
>>> >> Value="Accept" title="Accept" type="button">
>
> function Al_Click_Button(frm, al_disp_id, module, template) {
> frm._module.value=module;
> frm._template.value=template;
> frm._al_action_click.value = 1;
>
Voytek Eymont wrote:
> BUT, additionally, I also need to push button(s) on new items
>
> 'selected source' on buttons I wish to push has like:
>
> Value="Accept" title="Accept" type="button">
The onclick action tells the browser to execute some JavaScript, but wget
does not have a JavaScript e
Keisial wrote:
> I also miss the new option at print_help().
David, you might find this primer on command line options useful:
http://wget.addictivecode.org/OptionsHowto
Tony
Norman Khine wrote:
> There is the line:
>
>
LoginId=myusername&Password=mypassword&__sgx_contextKey=63397388108404021367
> 5900406_solidarmonde_X_12873096&__sgx_contextSecurity=&__sgx_script=&=&=
[snip]
> --post-data 'LoginId=myusername&Password=mypassword'
You are only sending two of the four
er 14, 2009 1:16 AM
To: Tony Lewis
Cc: bug-wget@gnu.org
Subject: Re: [Bug-wget] wget with no stdout delivers no result
Hi Tony,
Thanks for your answer.
I tried to run wget with the --debug option. It doesn't seem that wget
creates the log file as defined with the -o option. Still: invok
Ray Satiro wrote:
> On windows this is valid:
> C:\Users\Internet\Desktop>if exist c://file.txt echo hi
> hi
On windows this is valid too:
if exist c://file.txt echo hi
hi
but neither is the canonical way of expressing the location of a file in the
root directory of the C drive.
Why
Instead of running with -q, try running with --debug and report back on what
wget is reporting when run from a cron job.
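For a cron job, that might look like the following crontab line (the schedule, log path, and URL are placeholders, not from the original report):

```shell
# Capture full diagnostics to a log file instead of suppressing output with -q:
# */15 * * * * wget --debug -o /tmp/wget-debug.log http://example.com/file.csv
```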
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Maurus Frey
Sent: Saturday, Decem
The error is being reported by your command-line processor (bash), not wget.
Another alternative is to put the POST data into a file and then use
--post-file. (The POST file needs to include the 'data=' as well as the
contents of infile.txt.)
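A minimal sketch of building such a POST file, assuming the form field is named "data" and the payload lives in infile.txt as in the original report (the sample payload here is made up):

```shell
# Create a sample payload standing in for the real infile.txt.
printf 'hello world' > infile.txt

# The POST file must carry the field name ("data=") as well as the payload:
printf 'data=' > post.txt
cat infile.txt >> post.txt

# Then hand the whole file to wget instead of inlining it on the command line:
#   wget --post-file=post.txt http://example.com/submit
cat post.txt
```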
Tony
-Original Message-
From: bug-wget-bounces
Try quoting the URL.
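The underlying problem is that an unquoted `&` tells the shell to background the command, so wget only sees the URL up to the first `&`. Single quotes pass the whole URL through untouched (the URL below is a made-up example):

```shell
# Without quotes, the shell splits this command at '&' and the remaining
# parameters are lost. Single quotes keep the URL intact:
url='http://example.com/search?q=wget&page=2'
echo "$url"
```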
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Amit Walia
Sent: Tuesday, November 24, 2009 3:48 PM
To: bug-wget@gnu.org
Subject: RE: [Bug-wget] How to pass arguments in URL
Hi,
My webs
eful.
Tony
From: Dan Yamins [mailto:dyam...@gmail.com]
Sent: Sunday, November 22, 2009 6:30 AM
To: Tony Lewis
Cc: bug-wget@gnu.org
Subject: Re: [Bug-wget] Problem, no getting any response
Hi Tony, thanks for your email.
{"searchQueryString":"p+9-n+12-c+287464-s+0-r+-t+-ri+
There are several things about the request you're asking wget to send that
don't match the browser's request.
Let's start with the most obvious: your posted data looks nothing like what
the browser is sending. According to your Firebug output, the data posted
is:
{"searchQueryString":"p+9-n+12-c+2
KARR, DAVID (ATTCINW) wrote:
> I must be missing something simple.
wget translates file names based on the valid file name characters for your
local file system.
Tony
Hatwágner Ferenc Miklós wrote:
> I think I have found a bug in GNU Wget 1.9.1. I use this software as a
> part of my Ubuntu 9.04 distribution. If I try to download big files
> (over 2 GiB) over http, only a small part of the file is written to the
> disk. I think this can be caused by some kind
Mohan, there is a simple way for you to test the second case: Unplug your
network cable and run wget as you normally would.
Tony
-Original Message-
From: bug-wget-bounces+wget=exelana@gnu.org
[mailto:bug-wget-bounces+wget=exelana@gnu.org] On Behalf Of Mohan gupta
Sent: Thursday, Oc
Micah Cowan wrote:
> Have you thought about trying a more recent release? We're at 1.12 right
> now (though it doesn't have support for MinGW builds: try 1.11.4).
Or Andy could try 1.12 and tell Micah what needs to be done for it to build
under MinGW. :-)
Tony
Wayne Glover wrote:
> can i modify wgetrc.txt to vary the save
> locations based on the url?
The scenario you described is essentially two independent wget sessions.
(Session one's output to directory A and session two's output to directory
B.) The only way to accomplish your goal is to wrap wg
Micah Cowan wrote:
>Matthew Woehlke wrote:
>> Thought: is it possible to alter the syntax of -A/-R to tell these that
>> you are matching a regex rather than a glob? Maybe by requiring the '::'?
>
>I'm not crazy about that. It would save us the consumption of new
>short-options, but...
>
>I'm not
Micah Cowan wrote:
>Tony Lewis wrote:
>"hash" doesn't apply to URIs that wget would handle (it's called the
>"fragment" portion in relevant RFCs), as that's not normally part of
>what gets sent to the server.
But it can appear in the links withi
Micah Cowan wrote:
> - It should use extended regular expressions
Agreed
> PCREs are less important
I have a very strong preference for \s over [[:space:]]
> - It should be possible to match against just certain components of an
> URL
Agreed. In your exchange with Matthew some possible labels w
If you are successfully using NTLM authentication with wget 1.11.4, know how
to build a development version of wget, and are willing to help with testing
some changes for 1.12, please let me know.
Tony
Wayne Glover wrote:
> So, where can i put this file.
As Todd Pattist wrote earlier, you can inform wget where to find the wgetrc
file by setting the WGETRC environment variable. You can do this globally
using the advanced tab of the system control panel, in a batch file (as Todd
explained), o
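A minimal sketch of the per-session approach (the file name is a made-up example):

```shell
# Point wget at an alternate startup file for this shell session only.
# Any wget run from this shell now reads this file instead of ~/.wgetrc.
export WGETRC="$HOME/wgetrc-siteB"
echo "$WGETRC"
```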
Micah Cowan wrote:
> I disagree with Tony's statement: unless you're having wget spit out
> information about what the certificate _is_, how can you claim to
> "trust" it?
I think it is likely that people will be using wget to retrieve content from
web servers in which they already have some pre-exis
Anamika Jindal wrote:
> Issue is , I can not automate this.
What you want (at least in part) is the session database, which is not yet
implemented.
http://wget.addictivecode.org/FeatureSpecifications/SessionInfoDataBase
I took a stab at something of the sort when I needed to automate a script.
I
Pratap Kumar Das wrote:
> I am not being able to download the files which would be
> linked with dynamic urls (urls containing "*?*" in them which is probably
> used for redirection).
The "?" is not the problem. See these lines in the output you included:
> Cannot write to `download.html?product
"UID" --http-password="PWD" --debug
The debug output is attached and the wget version is 1.11.4.
(Command line and debug output have been sanitized to remove confidential
information.)
I would appreciate any suggestions.
Tony Lewis
DEBUG output created by Wget 1.1
I agree with Diego that it would be useful to rewrite file extensions; just
today I was mirroring a site with .aspx and I will have to manually rename
everything to .html so I could open the files locally in my browser. It is
important to do this within wget so that the links in the downloaded mirr
Run your command with the -p option and you will notice that it gets three
pages in a new subdirectory named www.google.com. (If not, try adding -e
"robots=off"; I always run with that option.)
Once you have that, look at the file that starts with translate_p and you
will notice that it is downloa
I'm not sure what protocol is used for submitting translation corrections
and suggestions, but according to
http://translationproject.org/team/index.html the French translation team
can be reached at tra...@traduc.org.
The entire French translation is found in
http://translationproject.org/latest/