RE: Patch: --range switch implemented
>From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>
>Herold Heiko <[EMAIL PROTECTED]> writes:
>
>> Don't forget you need a symbol for the start->size syntax, too ... +
>> would be perfect,
>
>Yes. That's +, as implemented in the original patch. No one is
>disputing that one.
>
>> --range 4096+1k
>> or --range 4095+1k (shudder)
>
>Did you mean 4097 here?
>

Yes in fact, for the 1-based syntax.

Heiko

--
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907087
-- ITALY
Re: Patch: --range switch implemented
Herold Heiko <[EMAIL PROTECTED]> writes:

> Don't forget you need a symbol for the start->size syntax, too ... +
> would be perfect,

Yes. That's +, as implemented in the original patch. No one is
disputing that one.

> --range 4096+1k
> or --range 4095+1k (shudder)

Did you mean 4097 here?
RE: Patch: --range switch implemented
>From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>
>Vladi Belperchinov-Shabanski <[EMAIL PROTECTED]> writes:
>
>> Here is my IMO (in case someone is really interested in:))
...
>So what would be a nice alternative syntax for closed-open? 0:1024?
>Hyphen is easier to type, though. Damn, sometimes it's so hard to
>win. :-)

Don't forget you need a symbol for the start->size syntax, too ... +
would be perfect,

    --range 4096+1k
    or --range 4095+1k (shudder)

Maybe what we really need is not a different syntax for every kind of
range definition but a default syntax (whichever symbol you use) and
number modifiers... for example, for the first kb, and suppose we want
to accommodate end users:

    default 1-1024 or 1:1024 or 1..1024
    or ]0-1023] (same with : or ..)
    or [1-1024] (same with : or ..)
    or [0-1024[ (same with : or ..)
    or 0+1024 or 0+1k
    or [1+1024 or [1+1k
    or ]0+1024 or ]0+1k

You get the point; sorry, but I'm in a hurry, and possibly I got some
braces wrong.

Heiko
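The "number modifier" idea (writing 1k for 1024) together with the start+size form can be sketched as follows. This is only an illustration, not code from the patch: the names `parse_size` and `parse_start_plus_size` are hypothetical, the start is assumed to be 0-based, and `k` is assumed to mean 1024 bytes, as the 0-1024 / 0-1k examples elsewhere in the thread suggest.

```python
import re


def parse_size(text):
    """Parse a non-negative byte count with an optional 'k' suffix.

    Hypothetical helper: 'k' is assumed to mean 1024 bytes.
    """
    match = re.fullmatch(r"(\d+)([kK]?)", text)
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    number, suffix = match.groups()
    return int(number) * (1024 if suffix else 1)


def parse_start_plus_size(spec):
    """Parse 'start+size' into a 0-based closed-open pair (start, end)."""
    start_text, separator, size_text = spec.partition("+")
    if not separator:
        raise ValueError(f"expected start+size, got {spec!r}")
    start = parse_size(start_text)
    return start, start + parse_size(size_text)


print(parse_start_plus_size("4096+1k"))
```

With a 0-based start, 4096+1k names the 1024 bytes beginning at offset 4096; under a 1-based convention the same bytes would have to be written as 4097+1k, which is the discrepancy discussed above.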
Re: Patch: --range switch implemented
Vladi Belperchinov-Shabanski <[EMAIL PROTECTED]> writes:

> Here is my IMO (in case someone is really interested in:))
>
> all ranges 0-based,
> support a few syntaxes:
>
> --range=0..1024   -- closed-closed
> --range=0-1024    -- closed-open
> --range=1024+2048 -- take 3..4 K's :) i.e. get 2k starting on pos 1024

I agree with you completely; my preferences also lie in that direction.

Except for one thing: HTTP/1.1 `Range' header uses x-y to mean
closed-closed, 0-based. We are of course not required to use the same
syntax, but it would be nice not to confuse things.

So what would be a nice alternative syntax for closed-open? 0:1024?
Hyphen is easier to type, though. Damn, sometimes it's so hard to
win. :-)

> (well last one could be like --range=2048@1024 just for fun)

:-)
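To make the HTTP point concrete, here is a minimal sketch of how a 0-based closed-open interval would map onto the closed-closed byte range that the HTTP/1.1 `Range' header expects. The function name `range_header` is hypothetical and this is not how the patch is known to work; it only shows the off-by-one that has to happen somewhere.

```python
def range_header(start, end_exclusive):
    """Build an HTTP/1.1 Range header value from a 0-based closed-open interval.

    The header names the first and the last byte, both inclusive, so the
    exclusive end has to be decremented by one.
    """
    if start < 0 or end_exclusive <= start:
        raise ValueError("interval must be non-negative and non-empty")
    return f"bytes={start}-{end_exclusive - 1}"


# Second kilobyte of a file, written closed-open as 1024-2048.
print(range_header(1024, 2048))  # bytes=1024-2047
```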
Re: Patch: --range switch implemented
hi!

Here is my IMO (in case someone is really interested in:))

all ranges 0-based,
support a few syntaxes:

    --range=0..1024   -- closed-closed
    --range=0-1024    -- closed-open
    --range=1024+2048 -- take 3..4 K's :) i.e. get 2k starting on pos 1024

(well last one could be like --range=2048@1024 just for fun)

Implementation of all cases is trivial, and I cannot see why we
shouldn't have them all.

P! Vladi.

Hrvoje Niksic wrote:
>
> Herold Heiko <[EMAIL PROTECTED]> writes:
>
> > However, off the top of my head I can't remember many occasions where
> > 0-n means closed-open
>
> There are. (And note that it's n-m in the general case, not just
> 0-n.) Off the top of my head, the Java string subscripts, Lisp
> array-related functions, Python slices, various Emacs functions, etc.
> The Python example is easy to demonstrate:
>
>     >>> range(0, 10)
>     [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> Also:
>
>     >>> a = [0, 1, 2, 3, 4, 5]
>     >>> a[2:4]
>     [2, 3]
>
> This makes perfect sense to me, but not everyone would agree. The
> nicest thing about it is that it allows this:
>
>     >>> a[0:3] + a[3:]
>     [0, 1, 2, 3, 4, 5]
>
> I.e. you can construct the original interval by appending the
> "touching" subintervals. This is nice for downloads because it allows
> you to download 0-2k, 2k-5k, etc., without the one overlapping byte.
>
> Common Lisp:
>
>     [1]> (setq a '(0 1 2 3 4 5))
>     (0 1 2 3 4 5)
>     [2]> (subseq a 2 4)
>     (2 3)
>
> Perl avoids the potential confusion by having its SUBSTR take offset
> (0-based) and length, which is clear to everyone.
>
> > while there are at least Pascal and Perl where 0..n
>
> The Pascal reference is to 1..n, not 0..n. Which is one point you
> seem to have missed: IMHO [start, end] makes more sense with intervals
> that start with 1, and [start, end) makes more sense with intervals
> that start with 0.
>
> > 0..10 #11 bytes including first one, like Perl, Pascal
> > 1-10 #10 bytes including first one
>
> 1-10 is what I meant by "the Pascal way" because most Pascal programs
> use 1-based arrays. Again, assuming we want to download 16 bytes, the
> three options are, in my order of preference:
>
>     1 .. 16 # end-closed 1-based, Pascal-like
>     0 .. 15 # end-closed 0-based, Perl-like
>     0 .. 16 # end-open 0-based, Python-like
>
> Maybe we should support all 3, but document only one in --help? That
> way most users will not notice the "complexity". Also, the first
> option could well be ignored since 1-based arrays are for wimps. :-)

--
Vladi Belperchinov-Shabanski <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
Personal home page at http://www.biscom.net/~cade
DataMax Ltd. http://www.datamax.bg
Too many hopes and dreams won't see the light...
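The claim that handling all three spellings is trivial can be illustrated with a short sketch. It assumes exactly the proposal above (all ranges 0-based; `..` closed-closed, `-` closed-open, `+` start plus length) and normalizes everything to a closed-open pair. The function name `parse_range` is hypothetical, and this is not the code from the patch.

```python
def parse_range(spec):
    """Return (start, end_exclusive) for 'a..b', 'a-b' or 'a+n', all 0-based.

    '..' is checked first so that its dots are never mistaken for
    another separator.
    """
    if ".." in spec:
        first, last = spec.split("..", 1)
        start, end = int(first), int(last) + 1      # closed-closed
    elif "+" in spec:
        first, length = spec.split("+", 1)
        start = int(first)
        end = start + int(length)                   # start plus length
    elif "-" in spec:
        first, stop = spec.split("-", 1)
        start, end = int(first), int(stop)          # closed-open
    else:
        raise ValueError(f"unrecognized range: {spec!r}")
    if start < 0 or end < start:
        raise ValueError(f"invalid range: {spec!r}")
    return start, end


for spec in ("0..1024", "0-1024", "1024+2048"):
    start, end = parse_range(spec)
    print(spec, "->", start, end, "size", end - start)
```

Normalizing to closed-open internally means the size is always `end - start`, whichever spelling the user typed.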
Re: Patch: --range switch implemented
Andre Pang <[EMAIL PROTECTED]> writes:

>> >> --range=1025..2048
>> >> --range=1024..2047
>> >
>> > I haven't been following that closely, but how are you going to
>> > tell what the user really wants to do if he gives either of those
>> > two statements?
>>
>> Only one of those statements will be a valid way of downloading the
>> second kilobyte of a file. The question is, which one.
>
> Oh! That makes it a bit easier :). I vote for 1025..2048, which
> I believe is standard across all functional programming
> languages. The [ and ] also correlate nicely to the mathematical
> interpretations. (Compare [5..10] with (6..9), for instance).

This is again a misunderstanding. Both of the above are inclusive;
they only disagree on whether the first byte is 0 or 1. You have
elided parts of my posting that explain that.

Note that I never proposed an interval open on both sides, but only on
the right side. And you deleted that proposal. Again, the
alternatives are:

    1025..2048 starting with one, end-closed
    1024..2047 starting with zero, end-closed
    1024..2048 starting with zero, end-open

> The majority of end-user utilities seem to like counting from 1,
> whereas programming languages tend to start counting from 0. A
> quick look at the GNU text utilities (e.g. tail -c) seems to use
> 1-counting rather than 0-counting.

That makes sense for counting lines. Bytes are almost always indexed
from zero.

> Basically, I think that from an end-user perspective, they're used
> to dealing with 1-counting, not 0-counting.

If you're *counting* something, sure. But we're not implementing
`wc -c' here -- we are referring to specific bytes, or rather to an
interval. 0-based indexing makes much more sense to me.
Re: Patch: --range switch implemented
On Mon, Nov 19, 2001 at 08:33:15PM +0100, Hrvoje Niksic wrote:

> Compatibility with rfc2616 is a good point, though. Maybe it's best
> to simply stick to 1024-2047 then.

Compatibility with curl is even more important :). In light of that,
I vote for 1024-2047. No point having two file retrieval utilities do
something different, just because "it's more correct".

--
#ozone/algorithm <[EMAIL PROTECTED]> - trust.in.love.to.save
Re: Patch: --range switch implemented
On Mon, Nov 19, 2001 at 08:19:08PM +0100, Hrvoje Niksic wrote:

> >> --range=1025..2048
> >> --range=1024..2047
> >
> > I haven't been following that closely, but how are you going to
> > tell what the user really wants to do if he gives either of those
> > two statements?
>
> Only one of those statements will be a valid way of downloading the
> second kilobyte of a file. The question is, which one.

Oh! That makes it a bit easier :). I vote for 1025..2048, which
I believe is standard across all functional programming
languages. The [ and ] also correlate nicely to the mathematical
interpretations. (Compare [5..10] with (6..9), for instance).

The majority of end-user utilities seem to like counting from 1,
whereas programming languages tend to start counting from 0. A
quick look at the GNU text utilities (e.g. tail -c) seems to use
1-counting rather than 0-counting.

Basically, I think that from an end-user perspective, they're used
to dealing with 1-counting, not 0-counting.

--
#ozone/algorithm <[EMAIL PROTECTED]> - trust.in.love.to.save
Re: Patch: --range switch implemented
Daniel Stenberg <[EMAIL PROTECTED]> writes:

> Then again, both versions could be supported if they just use
> different syntaxes.

Please note that there is a third version which Andre elided. We're
deciding on one or more of:

    --range=1025..2048
    --range=1024..2047
    --range=1024..2048 # my preferred version

On the one hand, there's no harm in supporting them all, but there's
enough overengineering in Wget as it is. I'd like to avoid more.

Compatibility with rfc2616 is a good point, though. Maybe it's best
to simply stick to 1024-2047 then.
Re: Patch: --range switch implemented
On Mon, 19 Nov 2001, Hrvoje Niksic wrote:

> >> --range=1025..2048
> >> --range=1024..2047
>
> Only one of those statements will be a valid way of downloading the
> second kilobyte of a file. The question is, which one.
>
> The first one assumes the first byte in the file is "1", the second one
> assumes it's "0". Both are inclusive.

I vote for the second alternative. Being a (non-Python) programmer, a
reader of RFC2616, and the user/author of the curl --range option that
uses the HTTP header syntax...

Then again, both versions could be supported if they just use different
syntaxes.

--
Daniel Stenberg - http://daniel.haxx.se - +46-705-44 31 77
ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
Re: Patch: --range switch implemented
Andre Pang <[EMAIL PROTECTED]> writes:

> On Mon, Nov 19, 2001 at 05:07:31PM +0100, Hrvoje Niksic wrote:
>
>> Or, to pick another example, say you want to download the second
>> kilobyte of a file:
>>
>> --range=1025..2048
>> --range=1024..2047
>
> I haven't been following that closely, but how are you going to
> tell what the user really wants to do if he gives either of those
> two statements?

Only one of those statements will be a valid way of downloading the
second kilobyte of a file. The question is, which one.

The first one assumes the first byte in the file is "1", the second
one assumes it's "0". Both are inclusive.
Re: Patch: --range switch implemented
On Mon, Nov 19, 2001 at 05:07:31PM +0100, Hrvoje Niksic wrote:

> Or, to pick another example, say you want to download the second
> kilobyte of a file:
>
> --range=1025..2048
> --range=1024..2047

I haven't been following that closely, but how are you going to
tell what the user really wants to do if he gives either of those
two statements?

If you define .. as being closed-closed and inclusive, I'm confused
how 1025..2048 will be interpreted the same as 1024..2047. Are we
doing implicit kb rounding?

--
#ozone/algorithm <[EMAIL PROTECTED]> - trust.in.love.to.save
Re: Patch: --range switch implemented
Herold Heiko <[EMAIL PROTECTED]> writes:

> However, off the top of my head I can't remember many occasions where
> 0-n means closed-open

There are. (And note that it's n-m in the general case, not just
0-n.) Off the top of my head, the Java string subscripts, Lisp
array-related functions, Python slices, various Emacs functions, etc.
The Python example is easy to demonstrate:

    >>> range(0, 10)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Also:

    >>> a = [0, 1, 2, 3, 4, 5]
    >>> a[2:4]
    [2, 3]

This makes perfect sense to me, but not everyone would agree. The
nicest thing about it is that it allows this:

    >>> a[0:3] + a[3:]
    [0, 1, 2, 3, 4, 5]

I.e. you can construct the original interval by appending the
"touching" subintervals. This is nice for downloads because it allows
you to download 0-2k, 2k-5k, etc., without the one overlapping byte.

Common Lisp:

    [1]> (setq a '(0 1 2 3 4 5))
    (0 1 2 3 4 5)
    [2]> (subseq a 2 4)
    (2 3)

Perl avoids the potential confusion by having its SUBSTR take offset
(0-based) and length, which is clear to everyone.

> while there are at least Pascal and Perl where 0..n

The Pascal reference is to 1..n, not 0..n. Which is one point you
seem to have missed: IMHO [start, end] makes more sense with intervals
that start with 1, and [start, end) makes more sense with intervals
that start with 0.

> 0..10 #11 bytes including first one, like Perl, Pascal
> 1-10 #10 bytes including first one

1-10 is what I meant by "the Pascal way" because most Pascal programs
use 1-based arrays. Again, assuming we want to download 16 bytes, the
three options are, in my order of preference:

    1 .. 16 # end-closed 1-based, Pascal-like
    0 .. 15 # end-closed 0-based, Perl-like
    0 .. 16 # end-open 0-based, Python-like

Maybe we should support all 3, but document only one in --help? That
way most users will not notice the "complexity". Also, the first
option could well be ignored since 1-based arrays are for wimps. :-)
Re: Patch: --range switch implemented
Herold Heiko <[EMAIL PROTECTED]> writes:

> Personally I'd be happy either way, but you'll never be able to make
> happy everybody. Choose what you prefer

I'd love to choose what I prefer, but I'd like to avoid my wild
preferences ruining it for everyone else. :-) Thanks for the support,
though.

> In the style of my previous post, if you choose a closed-open
> interval, something like NOTE whole file = 0..size !!! (yeah
> triple exclamation marks, sign of a diseased mind - should make
> people think fair enough).

Actually, ".." sounds more like closed-closed to me. :-) (That must
be Pascal childhood rearing its ugly head.) How about supporting
both? For example:

    1. --range=1..size   # wimps, or:
       --range=0..size-1 # slightly less wimpy, but less "consistent"
    2. --range=0-size    # real (wo)men
    3. --range=0+size    # both

Or, to pick another example, say you want to download the second
kilobyte of a file:

    --range=1025..2048
    --range=1024..2047
    --range=1024-2048 # my preferred version
    --range=1024+1024 # also cool
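As a sanity check on the four spellings of "the second kilobyte", the sketch below applies each convention to the same in-memory buffer and confirms that they all select the same 1024 bytes. The helper names are hypothetical and the buffer is just a stand-in for a file; nothing here is taken from the patch.

```python
# Stand-in for a 4 KB file; each byte holds its own offset modulo 256.
data = bytes(i % 256 for i in range(4096))


def closed_1based(first, last):
    """1025..2048 style: both ends inclusive, first byte is number 1."""
    return data[first - 1:last]


def closed_0based(first, last):
    """1024..2047 style: both ends inclusive, first byte is number 0."""
    return data[first:last + 1]


def half_open(start, end):
    """1024-2048 style: 0-based, end excluded."""
    return data[start:end]


def start_length(start, length):
    """1024+1024 style: 0-based start plus a byte count."""
    return data[start:start + length]


second_kb = half_open(1024, 2048)
assert len(second_kb) == 1024
assert closed_1based(1025, 2048) == second_kb
assert closed_0based(1024, 2047) == second_kb
assert start_length(1024, 1024) == second_kb
```

Only the closed-open and start+length forms let the size be read directly off the arguments (2048 - 1024, or the second number); the two inclusive forms need a +1.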
RE: Patch: --range switch implemented
>From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>
>[ Note for Wget list readers: this discusses the `--range' option
>  submitted to the patch list. ]
>
>Herold Heiko <[EMAIL PROTECTED]> writes:
>
>> Also, possibly I missed something, does the download start at byte 0
>
>But you've still raised an interesting question. I would actually be
>happiest with using a closed-open interval for the range, but I didn't
>
>    --range=0-1k
>    --range=1k-2k
>    --range=2k-3k
>    ...
>
>  There is no way to do the latter with closed-closed endpoints, where
>  you have to add and subtract one at a number of places for things to
>  work.
>
>It has been said (and I whole-heartedly agree) that closed-closed
>intervals are more natural in 1-based counting, whereas closed-open
>intervals are more fit for 0-based counting. I would really like Wget
>to use 0-based closed-open interval specification, but I'm still
>afraid the users would have problems understanding it, and that's why I
>didn't propose it.
>
>I would appreciate other people's comments on this.

IMHO this would make the whole thing easy for

a) people who don't read the manual and are lucky enough to do the
   correct thing for the wrong reasons
b) people who do read the manual, or at least check it out if
   something doesn't work as expected.

It would make things more difficult for people who think they know
what they are doing, don't check the manuals and complain wildly if
things don't work, in other words lazy sysadmins and programmers ;-)

Personally I'd be happy either way, but you'll never be able to make
everybody happy. Choose what you prefer, document away with good
examples, and put something on the wget --help page (I think the most
used reference source anyway) which makes people *think* and check the
manual (if it seems strange to them). In the style of my previous
post, if you choose a closed-open interval, something like

    NOTE whole file = 0..size !!!

(yeah, triple exclamation marks, sign of a diseased mind - should make
people think fair enough).

Heiko
Re: Patch: --range switch implemented
[ Note for Wget list readers: this discusses the `--range' option
  submitted to the patch list. ]

Herold Heiko <[EMAIL PROTECTED]> writes:

> Also, possibly I missed something, does the download start at byte 0
> (like most programmers etc. would expect) or at byte 1 (like most users
> would expect) ? In other words, to download the first half of a 10 byte
> file --range 0-4 or --range 1-5 ?

It's 0-4, as documented by Alex.

But you've still raised an interesting question. I would actually be
happiest with using a closed-open interval for the range, but I didn't
want to propose that because it would be confusing to most
non-programmer (and some programmer) users.

The closed-open interval means that --range=0-5 retrieves the first
five bytes, i.e. bytes numbered 0-4, or 1-5, depending on how you
count. Although closed-open intervals are confusing at first, they
have some very nice properties:

* You can get the interval size simply by subtracting the endpoints.

* You can easily specify touching but non-overlapping intervals. For
  example, if you wanted to download the file in 1k chunks, you could
  do this:

    --range=0-1024
    --range=1024-2048
    --range=2048-3072
    ...

  Or, even better:

    --range=0-1k
    --range=1k-2k
    --range=2k-3k
    ...

  There is no way to do the latter with closed-closed endpoints, where
  you have to add and subtract one at a number of places for things to
  work.

It has been said (and I whole-heartedly agree) that closed-closed
intervals are more natural in 1-based counting, whereas closed-open
intervals are more fit for 0-based counting. I would really like Wget
to use 0-based closed-open interval specification, but I'm still
afraid the users would have problems understanding it, and that's why I
didn't propose it.

I would appreciate other people's comments on this.
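The two properties listed above (size by subtraction, touching intervals without overlap) can be shown with a small sketch that splits a file into 1k chunks. The function name `chunks` and the 3000-byte file size are made up for the illustration; this is not part of the patch.

```python
def chunks(total_size, chunk_size):
    """Yield 0-based closed-open (start, end) pairs that tile the file.

    Each chunk ends exactly where the next one begins, so no byte is
    fetched twice and none is skipped.
    """
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size)


pieces = list(chunks(3000, 1024))
for start, end in pieces:
    # The size of each piece is simply end - start.
    print(f"--range={start}-{end}  ({end - start} bytes)")

# The sizes add up to the file size with no +1/-1 corrections.
assert sum(end - start for start, end in pieces) == 3000
```

With closed-closed endpoints the same loop would have to emit `end - 1` for every chunk and add one back whenever a size is computed.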