Re: [Bug-wget] Fwd: [GSoC] Extend concurrency support in Wget

2014-04-05 Thread Yousong Zhou
On 5 April 2014 17:28, Jure Grabnar grabna...@gmail.com wrote:
 Hi,


 
  That's true. type is currently only used to filter out types which
  Wget doesn't support.
  Do you think parsing it (type) is irrelevant?

 IMHO, if it will not be used in the near future, then better document
 or remove it.


 I tried removing elect_resources() (essentially removing type attribute)
 and it mostly works.
 It fails when a bittorrent URL resource has top priority. In this case it
 downloads over HTTP what looks to me like tracker info.
 Since the checksum differs from that of the original file (extracted from
 the metalink file), the download fails.

 I also merged two metalink files (header from the first file and resources
 from the second file) and Wget crashes. I found out there are some issues
 with temporary files.

 I do believe checking types is more fail-safe since these issues do not
 occur there. At least bittorrent resources have to be eliminated
 beforehand, or Wget has to be made aware of them somehow.

Yes, I agree that the parsing is needed for filtering out schemes like
ed2k and bittorrent.

yousong
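
A minimal sketch of the filtering discussed above, not Wget's actual code. It assumes each metalink resource has already been parsed into a dict with hypothetical keys 'url', 'type' and 'preference', and that the supported set is http, https and ftp. Note that the first resource has an http URL but a bittorrent type, which is the case where a scheme-only check would fetch the wrong thing.

```python
from urllib.parse import urlsplit

# Assumption: the transports the downloader can actually fetch.
SUPPORTED_TYPES = {"http", "https", "ftp"}


def filter_resources(resources):
    """Keep only resources whose declared type is supported, best first.

    Each resource is a dict with hypothetical keys 'url', 'type' and
    'preference' (higher is better). When 'type' is missing, the URL
    scheme is used instead.
    """
    usable = []
    for res in resources:
        rtype = (res.get("type") or urlsplit(res["url"]).scheme).lower()
        if rtype in SUPPORTED_TYPES:
            usable.append(res)
    return sorted(usable, key=lambda r: r.get("preference", 0), reverse=True)


resources = [
    {"url": "http://example.com/file.iso.torrent", "type": "bittorrent", "preference": 100},
    {"url": "ed2k://|file|file.iso|1|abc|/", "type": "ed2k", "preference": 90},
    {"url": "http://mirror.example.com/file.iso", "type": "http", "preference": 80},
    {"url": "ftp://mirror.example.org/file.iso", "preference": 70},
]
usable = filter_resources(resources)
print([r["url"] for r in usable])
```

Filtering before sorting means an unsupported resource can never win on priority alone.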



Re: [Bug-wget] warning about unknown .wgetrc directives

2014-04-05 Thread Micah Cowan
Anyone have thoughts on a designated prefix (say, make-style -) that
indicates a line that can be safely ignored if not understood?

Might also work to have a pragma thingie in the .wgetrc, to turn
fail-on-error on and off.
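
To make the two ideas concrete, here is a rough sketch of a parser that honours both. Everything in it is hypothetical: the "-" prefix, the "pragma fail-on-error on|off" spelling, and the small set of known commands are illustrations only, not existing Wget syntax.

```python
# Hypothetical subset of commands this imaginary Wget version understands.
KNOWN_COMMANDS = {"tries", "timeout", "user_agent"}


def parse_wgetrc(text, known=KNOWN_COMMANDS):
    """Return (settings, ignored) for a .wgetrc-like text.

    A line starting with '-' is skipped when it is not understood.
    'pragma fail-on-error on|off' toggles whether other unknown
    lines raise an error or are skipped.
    """
    settings = {}
    ignored = []
    fail_on_error = True
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "pragma":
            if len(words) == 3 and words[1] == "fail-on-error" and words[2] in ("on", "off"):
                fail_on_error = words[2] == "on"
                continue
            raise ValueError(f"line {lineno}: malformed pragma: {line!r}")
        optional = line.startswith("-")
        if optional:
            line = line[1:].lstrip()
        name, sep, value = line.partition("=")
        name = name.strip().lower()
        if sep and name in known:
            settings[name] = value.strip()
        elif optional or not fail_on_error:
            ignored.append((lineno, raw))
        else:
            raise ValueError(f"line {lineno}: unknown directive: {raw!r}")
    return settings, ignored


sample = """\
tries = 3
-http2 = on
pragma fail-on-error off
shiny_new_option = 1
pragma fail-on-error on
timeout = 30
"""
settings, ignored = parse_wgetrc(sample)
print(settings)
print(ignored)
```

An older Wget that predates the prefix would still reject the "-" lines, which is the version-threshold caveat raised below.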

Naturally, the value of such a thing wouldn't be seen until wgets installed
around the world are mostly above a certain version threshold...

(Sorry not to include context/disturb threading... I'm using gmail, which
isn't the greatest at dealing with digests, and I don't have mutt set up on
this laptop, and am feeling lazy.)

-mjc


Re: [Bug-wget] [Bug-Wget] Issues with Metalink support

2014-04-05 Thread L Walsh



Darshit Shah wrote:

I was trying to download a large ISO (4GB) through a metalink file.

The first thing that struck me was: The file is first downloaded to
/tmp and then moved to the location.

Is there any specific reason for this?


Sorry for the long delay answering this but I thought
I would mention a specific reason that such is done
on windows (that may apply to linux in various degrees
depending on filesystem type used and file-system activity).

To answer the question, there is a reason, but
its importance would be specific to each user's use case.

It is consistent with how some files from the internet are
downloaded, copied or extracted on windows.

I.e. IE will download things to a tmp dir (usually
under the user's home dir on windows), then
move it into place when it is done.  This prevents partly
transferred files from appearing in the destination.

Downloading this way can, also, *allow* for allocating
sufficient contiguous space at the destination in 1
allocation, and then copying the file
into place -- this allows for less fragmentation at the
final destination.  This is more true with larger
files and slower downloads that might stretch over several
or more minutes.  Other activity on the disk
is likely and if writes occur, they might happen in the
middle of where the downloaded file _could_ have had
contiguous space.

So putting a file that is likely to be fragmented as it
is downloaded due to other processes running, into
a 'tmp' location, can allow for knowing the full size
and allocating the full amount for the file so it can
be contiguous on disk.

It can't allocate the full amount for the file at
the destination until it has the whole thing locally, since
if the download is interrupted, the destination would contain
a file that looks to be the right size, but would have
an incomplete download in it.
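
A rough sketch of that sequence, under stated assumptions: the function name and parameters are made up for illustration, os.posix_fallocate is only available on some Unix systems, and whether a single up-front reservation actually reduces fragmentation depends on the filesystem.

```python
import os
import shutil
import tempfile


def stage_then_place(chunks, dest, tmpdir=None):
    """Spool chunks to a temp file, then copy into a preallocated destination."""
    fd, tmp_path = tempfile.mkstemp(dir=tmpdir)
    try:
        # Phase 1: the size is unknown, so just append to the temp file.
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        size = os.path.getsize(tmp_path)
        # Phase 2: the size is now known, so reserve it all in one request.
        with open(tmp_path, "rb") as src, open(dest, "wb") as dst:
            if size and hasattr(os, "posix_fallocate"):
                try:
                    os.posix_fallocate(dst.fileno(), 0, size)
                except OSError:
                    pass  # filesystem cannot preallocate; fall back to a plain copy
            shutil.copyfileobj(src, dst, 1024 * 1024)
    finally:
        os.remove(tmp_path)
    return size
```

The destination only comes into existence once the complete data is available locally, at the cost of needing roughly twice the space while the copy runs.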

Anyway -- the behavior of copying it to a tmp is a useful
feature to have -- IF you have the space.  It would be
a nice (not required) feature if there was an option on
how to do this (i.e. store the file directly on download, or
use a tmpdir and then move (or copy) the file into the
final location).

Always going direct is safest if the user is tight on disk space,
but has the drawback of often causing more disk fragmentation.

(FWIW, I don't really care one way or the other, but wanted
to tell you why it might be useful)...

Cheers!
Linda



Re: [Bug-wget] [Bug-Wget] Issues with Metalink support

2014-04-05 Thread Random Coder
On Sat, Apr 5, 2014 at 4:09 PM, L Walsh w...@tlinx.org wrote:

 I.e. IE will download things to a tmp dir (usually
 under the user's home dir on windows), then
 move it into place when it is done.  This prevents partly
 transferred files from appearing in the destination.


IE does not download to a tmp folder.  For instance, I just downloaded a
file to a folder, and I can watch the file grow in the destination folder.
 IE uses a .partial extension for the file as it downloads it, renaming
the file to the proper file when it's done.  Chrome and Firefox behave
similarly, just using a different extension for the partial file.

] dir ubuntu-12.04.4-desktop-amd64.iso.scwpnys.partial
04/05/2014  04:57 PM        13,115,224 ubuntu-12.04.4-desktop-amd64.iso.scwpnys.partial

] dir ubuntu-12.04.4-desktop-amd64.iso.scwpnys.partial
 Volume in drive C is OS
04/05/2014  04:57 PM        14,163,800 ubuntu-12.04.4-desktop-amd64.iso.scwpnys.partial
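
For comparison, the write-next-to-the-destination-then-rename pattern described above can be sketched as follows. This is an illustration only: the function name is made up, and it uses a fixed ".partial" suffix, whereas the listing above shows that IE also inserts a random-looking token before the suffix.

```python
import os


def download_with_partial(chunks, dest, suffix=".partial"):
    """Write chunks to dest + suffix, then rename to dest once complete."""
    partial = dest + suffix
    with open(partial, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
    # Rename within the same directory, so no data is copied.
    os.replace(partial, dest)
    return dest
```

Because the partial file lives in the destination directory, the final step is a rename on the same filesystem rather than a copy from a tmp location.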

I'm not convinced trying to pre-optimize for disk fragmentation is useful
here.  If the user is concerned about such things, they're free to copy the
download after it's done and delete the original.  Or run a defragmenter.


Re: [Bug-wget] [Bug-Wget] Issues with Metalink support

2014-04-05 Thread L Walsh



Random Coder wrote:
On Sat, Apr 5, 2014 at 4:09 PM, L Walsh w...@tlinx.org wrote:


I.e. IE will download things to a tmp dir (usually
under the user's home dir on windows), then
move it into place when it is done.  This prevents partly
transferred files from appearing in the destination.


IE does not download to a tmp folder.

---

It depends on timing, what version of IE, and probably
the phase of the moon, but here's an abbreviated trace of me downloading
the linux kernel into C:\tmp\download.  I annotate what's going on in
the left column... you can see almost 50% of the file was downloaded
into a tmp file, then switched to final destination and only
wrote 1M chunks instead of previous 4-12K chunks.

6:17:13,IEXPLORE,CreateFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N,Desired Access: Read Attributes, OpenResult: Opened
6:17:13,IEXPLORE,CreateFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Desired Access: Generic Write, Read Attributes, OpenResult: Created
6:17:13,IEXPLORE,SetAllocationInformationFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,AllocationSize: 78,399,152
6:17:13,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 0, Length: 704, Priority: Normal
6:17:13,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 704, Length: 1,944
6:17:13,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 2,648, Length: 8,192
6:17:13,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 10,840, Length: 4,096

...
6:17:23,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,207,192, Length: 4,096
6:17:23,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,211,288, Length: 4,096
6:17:23,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,215,384, Length: 16,384


I've typed in the save pathname now:
6:17:23,explorer,808,CreateFile,OK ,C:\tmp\download\linux-3.14.tar.xz,Desired Access: Read Attributes, OpenResult: Opened

6:17:23,explorer,808,CloseFile,OK ,C:\tmp\download\linux-3.14.tar.xz,
6:17:23,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,231,768, Length: 4,096

...
6:17:23,IEXPLORE,WriteFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,461,144, Length: 4,096

...

opens partial file in same directory:

6:17:23,IEXPLORE,CreateFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Desired Access: Generic Write, OpenResult: Opened


copies from 1st tmp to final location tmp, but in 1MB increments
6:17:23,IEXPLORE,ReadFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 0, Length: 1,048,576, Priority: Normal
6:17:23,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 0, Length: 1,048,576, Priority: Normal

...
6:17:23,IEXPLORE,ReadFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 35,651,584, Length: 817,752
6:17:23,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 35,651,584, Length: 817,752
6:17:23,IEXPLORE,ReadFile,END OF FILE,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Offset: 36,469,336, Length: 1,048,576
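
As a cross-check of the numbers in the trace up to this point, the arithmetic can be worked through as below. It takes the reported AllocationSize as the expected size of the whole file, which is an assumption; the variable names are only for illustration.

```python
MIB = 1024 * 1024        # 1,048,576 bytes, the read size seen in the copy step
allocated = 78_399_152   # AllocationSize reported for the temp file
spooled = 36_469_336     # offset at which ReadFile returned END OF FILE

# Number of full 1 MiB reads, and the size of the final short read.
full_reads, last_read = divmod(spooled, MIB)
print(full_reads, last_read)                # 34 817752
print(full_reads * MIB)                     # 35651584
print(round(100 * spooled / allocated, 1))  # 46.5
```

So the copy consisted of 34 full 1 MiB reads plus one read of 817,752 bytes starting at offset 35,651,584, and about 46.5% of the allocated size had been spooled to the temp file by then, which matches the "almost 50%" estimate.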


deletes first tmp, and now saves directly to partial at destination:
6:17:23,IEXPLORE,SetDispositionInformationFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,Delete: True
6:17:23,IEXPLORE,CloseFile,OK ,C:userAppData\Local\MS\Win\tmp-internet-IEfiles\BNZE234N\linux-3.14.tar[1].xz,


only 1M writes:
6:17:23,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 36,469,336, Length: 1,048,576
6:17:24,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 37,517,912, Length: 1,048,576
6:17:24,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 38,566,488, Length: 1,048,576
6:17:24,explorer,QueryDirectory,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Filter: linux-3.14.tar.xz.w5aj0r5.partial, 1: linux-3.14.tar.xz.w5aj0r5.partial



final output being created:
6:17:24,explorer,CreateFile,OK ,C:\tmp\download\linux-3.14.tar.xz,Desired Access: Read Attributes, Read Control, OpenResult: Opened


more writes to partial:
6:17:25,IEXPLORE,WriteFile,OK ,C:\tmp\download\linux-3.14.tar.xz.w5aj0r5.partial,Offset: 40,663,640, Length: 1,048,576
6:17:25,IEXPLORE,WriteFile,OK