Re: Bug in ETA code on x64
On 28/03/2006, at 20:43, Tony Lewis wrote:

> Hrvoje Niksic wrote:
>> The cast to int looks like someone was trying to remove a warning and
>> botched operator precedence in the process. I can't see any good
>> reason to use , here.
>
> Why not write the line as:
>
>     eta_hrs = eta / 3600; eta %= 3600;

Because that's not equivalent. The sequence or comma operator , has two operands: first the left operand is evaluated, then the right. The result has the type and value of the right operand. Note that a comma in a list of initializations or arguments is not an operator, but simply a punctuation mark!

Cheers,
Greg
Re: Bug in ETA code on x64
Greg Hurrell [EMAIL PROTECTED] writes:

> On 28/03/2006, at 20:43, Tony Lewis wrote:
>> Hrvoje Niksic wrote:
>>> The cast to int looks like someone was trying to remove a warning
>>> and botched operator precedence in the process. I can't see any
>>> good reason to use , here.
>> Why not write the line as: eta_hrs = eta / 3600; eta %= 3600;
>
> Because that's not equivalent.

Well, it should be, because the comma operator has lower precedence than the assignment operator (see http://tinyurl.com/evo5a, http://tinyurl.com/ff4pp and numerous other locations).

I'd still like to know where Thomas got his version of progress.c, because it seems that the change is what introduced the bug.
regex support RFC
hrvoje and i have recently been talking about adding regex support to wget. we were considering adding a new --filter option which, by supporting regular expressions, would allow more powerful ways of filtering urls to download. for instance, the new option could allow the filtering of domain names, file names and url paths. in the following case --filter is used to prevent any download from the www-*.yoyodyne.com domain and to restrict downloads to .gif files only:

wget -r --filter=-domain:www-*.yoyodyne.com --filter=+file:\.gif$ http://yoyodyne.com

(notice that --filter interprets every given rule as a regex.)

i personally think the --filter option would be a great new feature for wget, and i have already started working on its implementation, but we still have a few open questions. for instance, the syntax for --filter presented above is basically the following:

--filter=[+|-][file|path|domain]:REGEXP

is it consistent? is it flawed? is there a more convenient one? please notice that supporting multiple comma-separated regexps in a single --filter option:

--filter=[+|-][file|path|domain]:REGEXP1,REGEXP2,...

would significantly complicate the implementation and usage of --filter, as it would require escaping of the , character. also notice that the current filtering options like -A/-R are somewhat broken, as they do not allow the usage of the , char in filtering rules.

we also have to reach consensus on the filtering algorithm. for instance, should we simply require that a url pass all the filtering rules to allow its download (just like the current -A/-R behaviour), or should we instead adopt a short-circuit algorithm that applies all rules in the same order in which they were given on the command line and immediately allows the download of a url as soon as it passes the first allow rule? should we also support apache-like deny-from-all and allow-from-all policies? and what would be the best syntax to trigger the usage of these policies?
i am looking forward to reading your opinions on this topic.

P.S.: the new --filter option would replace and extend the old -D, -I/-X and -A/-R options, which will be deprecated but still supported.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                            http://www.tortonesi.com
University of Ferrara - Dept. of Eng.      http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool    http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux              http://www.deepspace6.net
Ferrara Linux User Group                   http://www.ferrara.linux.it
Re: regex support RFC
What definition of regexp would you be following? Or would this be making up something new?

I'm not quite understanding the comment about the comma and needing escaping for literal commas. This is true for any character in the regexp language, so why the special concern for comma?

I do like the [file|path|domain]: approach. Very nice and flexible. (And it would be a huge help to one specific need I have!) I suggest also including an "any" option as a shortcut for putting the same pattern in all three options.

Jim

On Wed, 29 Mar 2006, Mauro Tortonesi wrote:

> hrvoje and i have recently been talking about adding regex support to
> wget. we were considering adding a new --filter option which, by
> supporting regular expressions, would allow more powerful ways of
> filtering urls to download. for instance, the new option could allow
> the filtering of domain names, file names and url paths. in the
> following case --filter is used to prevent any download from the
> www-*.yoyodyne.com domain and to restrict downloads to .gif files only:
>
> wget -r --filter=-domain:www-*.yoyodyne.com --filter=+file:\.gif$ http://yoyodyne.com
>
> (notice that --filter interprets every given rule as a regex.)
>
> i personally think the --filter option would be a great new feature for
> wget, and i have already started working on its implementation, but we
> still have a few open questions. for instance, the syntax for --filter
> presented above is basically the following:
>
> --filter=[+|-][file|path|domain]:REGEXP
>
> is it consistent? is it flawed? is there a more convenient one? please
> notice that supporting multiple comma-separated regexps in a single
> --filter option:
>
> --filter=[+|-][file|path|domain]:REGEXP1,REGEXP2,...
>
> would significantly complicate the implementation and usage of
> --filter, as it would require escaping of the , character. also notice
> that the current filtering options like -A/-R are somewhat broken, as
> they do not allow the usage of the , char in filtering rules.
Re: regex support RFC
Mauro Tortonesi [EMAIL PROTECTED] writes:

> for instance, the syntax for --filter presented above is basically the
> following:
>
> --filter=[+|-][file|path|domain]:REGEXP

I think there should also be "url" for filtering on the entire URL. People have been asking for that kind of thing a lot over the years.
Re: regex support RFC
Jim Wright [EMAIL PROTECTED] writes:

> what definition of regexp would you be following? or would this be
> making up something new?

It wouldn't be new; Mauro is definitely referring to regexps as normally understood. The regexp APIs found on today's Unix systems might be usable, but unfortunately those are not available on Windows. They also lack support for the very useful non-greedy matching quantifier (the ? modifier to the * operator) introduced by Perl 5 and supported by most of today's major regexp implementations: Python, Java, Tcl, etc.

One idea was to use PCRE, bundling it with Wget for the sake of Windows and systems without PCRE. Another (http://tinyurl.com/elp7h) was to use and bundle Emacs's regex.c, the version of GNU regex shipped with GNU Emacs. It is small (one source file) and offers Unix-compatible basic and extended regexps, but also supports the non-greedy quantifier and non-capturing groups. See the message and the related discussion at http://tinyurl.com/mdwhx for more about this topic.

> I'm not quite understanding the comment about the comma and needing
> escaping for literal commas.

Supporting PATTERN1,PATTERN2,... would require having a way to quote the comma character. But there is little reason for a comma-specific syntax, since one can always use (PATTERN1|PATTERN2|...). Being unable to have a comma in the pattern is a shortcoming of the current -R/-A options.

> I do like the [file|path|domain]: approach. very nice and flexible.

Thanks.