[Bug-wget] Implementation suggestion for JavaScript execution

2014-05-25 Thread Andrew Pennebaker
Purpose:

Tumblr and other websites delay loading some of their content (images)
through JavaScript events like *onload*. It would be nice if wget supported
a *-j* flag for executing these scripts, in order to access the dynamically
loaded resources. Execution may add some time to downloads, but for users
who really want the content, having the option is better than not having it.

Possible solutions:

The HtmlUnit <http://htmlunit.sourceforge.net/> library can already do
this, but it's written in Java and I believe wget is written in C?

Another option for adding JS execution to wget is Node
<http://nodejs.org/>, which is written in C++, though we probably only want
its core, the V8 <https://code.google.com/p/v8/> JavaScript engine itself.

Other possibilities include SpiderMonkey
<http://en.wikipedia.org/wiki/SpiderMonkey_(JavaScript_engine)>, the JS
engine for Firefox, or JavaScriptCore
<http://www.webkit.org/projects/javascript/>, Safari's JS engine.

-- 
Cheers,

Andrew Pennebaker
www.yellosoft.us


[Bug-wget] Support for HDFS:// URLs

2015-02-14 Thread Andrew Pennebaker
Could we add support for downloading HDFS:// files with wget?

As a workaround, users can work out the HTTP:// URL that points to the
corresponding WebHDFS location themselves, but I'd prefer that wget learn
how to do this automatically on behalf of the user.
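
To make the workaround concrete, here is a minimal sketch of the kind of
translation I have in mind, written in Python for brevity rather than as a
wget patch. The function name hdfs_to_webhdfs and the default HTTP port are
assumptions for illustration: the port in an HDFS:// URL is the NameNode's
RPC port, so the WebHDFS HTTP port has to come from somewhere else (9870 is
the Hadoop 3 default, but a given cluster may differ).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the NameNode's HTTP port. It is NOT the RPC port that appears
# in the hdfs:// URL and depends on the cluster's configuration.
ASSUMED_HTTP_PORT = 9870


def hdfs_to_webhdfs(url, http_port=ASSUMED_HTTP_PORT):
    """Translate hdfs://host[:rpc_port]/path into a WebHDFS OPEN URL.

    Hypothetical helper for illustration only; not part of wget.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "hdfs":
        raise ValueError("not an hdfs:// URL: " + url)
    if not parts.hostname:
        raise ValueError("hdfs:// URL has no host: " + url)
    # Drop the RPC port and substitute the HTTP port.
    netloc = "{}:{}".format(parts.hostname, http_port)
    # WebHDFS exposes the file system under the /webhdfs/v1 prefix.
    path = "/webhdfs/v1" + parts.path
    return urlunsplit(("http", netloc, path, "op=OPEN", ""))


if __name__ == "__main__":
    print(hdfs_to_webhdfs("hdfs://namenode:8020/user/andrew/data.csv"))
```

The resulting URL can be handed to wget as-is. As far as I know, the
NameNode answers an OPEN request with a redirect to a DataNode, which wget
follows by default.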

Cheers,
Andrew Pennebaker


[Bug-wget] Request: Handle -np better when URL omits final slash character

2019-01-29 Thread Andrew Pennebaker
Looks like -np is ignored for URL "directories" that omit the trailing
slash. At least this is what I see happen with Homebrew wget. Could we get
-np to act more flexibly, continuing to prohibit upwards navigation when
the original URL leaves off the slash?
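
To illustrate what I think is going on, here is a small Python sketch. It
is not wget's actual code; it assumes that the "directory" of the starting
URL is found by stripping the last path segment, so a URL without the final
slash resolves to its parent. The treat_as_directory switch is a
hypothetical name for the more flexible behavior I'm asking about.

```python
from urllib.parse import urlsplit


def base_directory(url, treat_as_directory=False):
    """Return the path prefix that a no-parent rule would confine us to."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        return path
    if treat_as_directory:
        # Requested behavior: treat the last segment as a directory.
        return path + "/"
    # Assumed current behavior: the last segment is a file name, so the
    # base directory is everything up to and including the last slash.
    return path[: path.rfind("/") + 1]


def allowed_by_no_parent(start_url, link_url, treat_as_directory=False):
    """True if link_url stays at or below the start URL's directory."""
    base = base_directory(start_url, treat_as_directory)
    link_path = urlsplit(link_url).path or "/"
    return link_path.startswith(base)


if __name__ == "__main__":
    start = "http://example.com/pub/docs"
    sibling = "http://example.com/pub/other/index.html"
    print(allowed_by_no_parent(start, sibling))
    print(allowed_by_no_parent(start, sibling, treat_as_directory=True))
```

Under these assumptions, a sibling such as /pub/other/ is still allowed
when the start URL is /pub/docs, but is rejected once the final segment is
treated as a directory. A client cannot tell a file from a directory from
the URL alone, so any such switch would be a guess or an explicit opt-in.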

-- 
Cheers,
Andrew