Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
Hello Enzo,

We never developed a patch for this issue. I believe that back in 2004, in Nutch 0.4, there was another fetcher module, which was replaced in version 0.5. That fetcher was able to throttle bandwidth, but it was also very buggy, so the wiki description would now be obsolete. I am not familiar with all the changes since version 0.7, so it would be good if somebody could update the wiki. If you are interested in seeing how this option was implemented, you may be able to find the old version in CVS.

Regards,
Matthias

Enzo Michelangeli wrote:
> Hi Matthias,
>
> I'm writing to you about the Nutch config file option fetcher.throttle.bandwidth, referenced by you at http://wiki.apache.org/nutch/FetchOptions . According to Andrzej Bialecki in the thread http://www.nabble.com/Is--fetcher.throttle.bandwidth-known-to-work--t3861057.html , that refers to a private patch that is not part of Nutch's mainline code base. Is that patch available from you for submission to the Nutch team?
>
> Thanks,
> Enzo

Enzo Michelangeli wrote:
> ----- Original Message -----
> From: Andrzej Bialecki [EMAIL PROTECTED]
> Sent: Tuesday, June 05, 2007 4:56 PM
>
> [...]
>> You can achieve a somewhat similar effect by controlling the number of fetcher threads. I realize this is not as accurate as a specific control mechanism, but so far it was sufficient for most users. If this feature is important to you, please provide a patch that implements it, and we'll consider it for inclusion.
>
> I think that for the time being I'll just channel the traffic through a Squid proxy, and use its delay pools feature to throttle the bandwidth (and also its DNS caching, which, as I mentioned a few days ago, I also need...).
>
> For Nutch, it might make sense to find the original patch. I'll try to get in touch with Matthias Jaekle, who authored the wiki page where fetcher.throttle.bandwidth was referenced.
>
> Thanks anyway,
> Enzo

___
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
----- Original Message -----
From: Andrzej Bialecki [EMAIL PROTECTED]
Sent: Monday, June 04, 2007 2:05 PM

>> Er... I saw it mentioned at http://wiki.apache.org/nutch/FetchOptions , so I thought it was for real...
>
> Sorry, this page is wrong and should be corrected - some of the options listed there were either part of an older version of Fetcher (and have been replaced), or they were part of a private patch (as was the case with throttling).

Don't you think that throttling would be a valuable feature to retain? Is there anything to prevent saturation of the link to the Internet, either in release 0.9 or in the current nightly build code?

Enzo
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
Enzo Michelangeli wrote:
> ----- Original Message -----
> From: Andrzej Bialecki [EMAIL PROTECTED]
> Sent: Monday, June 04, 2007 2:05 PM
>
>>> Er... I saw it mentioned at http://wiki.apache.org/nutch/FetchOptions , so I thought it was for real...
>>
>> Sorry, this page is wrong and should be corrected - some of the options listed there were either part of an older version of Fetcher (and have been replaced), or they were part of a private patch (as was the case with throttling).
>
> Don't you think that throttling would be a valuable feature to retain? Is there anything to prevent saturation of the link to the Internet, either in release 0.9 or in the current nightly build code?

You can achieve a somewhat similar effect by controlling the number of fetcher threads. I realize this is not as accurate as a specific control mechanism, but so far it was sufficient for most users. If this feature is important to you, please provide a patch that implements it, and we'll consider it for inclusion.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
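As a rough illustration of the thread-count approach suggested above, the following Python sketch estimates how many fetcher threads fit within a link budget. It is not part of Nutch; the function name, the `headroom` parameter, and the per-thread throughput figure are assumptions for illustration, and the per-thread figure would have to be measured on a real crawl. The result is only a starting value for the thread-count setting (believed to be `fetcher.threads.fetch` in Nutch releases of this era; check nutch-default.xml for your version).

```python
def max_fetcher_threads(link_kbit_s: float,
                        per_thread_kbit_s: float,
                        headroom: float = 0.8) -> int:
    """Estimate a thread count that keeps fetch traffic under a link budget.

    link_kbit_s       -- capacity of the Internet link (e.g. 1500 for 1.5 Mbit/s)
    per_thread_kbit_s -- measured average throughput of one fetcher thread
                         (hypothetical input; depends on the sites crawled)
    headroom          -- fraction of the link the crawler is allowed to use
    """
    if link_kbit_s <= 0 or per_thread_kbit_s <= 0:
        raise ValueError("bandwidth figures must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    budget = link_kbit_s * headroom
    # Always allow at least one thread, otherwise nothing is fetched at all.
    return max(1, int(budget // per_thread_kbit_s))


if __name__ == "__main__":
    # 1.5 Mbit/s link, threads assumed to average 200 kbit/s each.
    print(max_fetcher_threads(1500, 200, headroom=0.5))
```

Because per-thread throughput varies with the remote servers, this only bounds the average load; bursts can still saturate the link, which is why it is less accurate than a dedicated throttle.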
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
----- Original Message -----
From: Andrzej Bialecki [EMAIL PROTECTED]
Sent: Tuesday, June 05, 2007 4:56 PM

[...]
> You can achieve a somewhat similar effect by controlling the number of fetcher threads. I realize this is not as accurate as a specific control mechanism, but so far it was sufficient for most users. If this feature is important to you, please provide a patch that implements it, and we'll consider it for inclusion.

I think that for the time being I'll just channel the traffic through a Squid proxy, and use its delay pools feature to throttle the bandwidth (and also its DNS caching, which, as I mentioned a few days ago, I also need...).

For Nutch, it might make sense to find the original patch. I'll try to get in touch with Matthias Jaekle, who authored the wiki page where fetcher.throttle.bandwidth was referenced.

Thanks anyway,
Enzo
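For readers wondering what a total-bandwidth throttle of this kind does, here is a minimal Python sketch of a token bucket, the general technique behind bucket-based limiters such as Squid's delay pools. It is not Squid's or Nutch's code; the class name, method names, and parameters are all hypothetical. A single bucket shared by all fetcher threads limits the total rate, whereas one bucket per thread would limit each thread separately.

```python
import threading
import time


class TokenBucket:
    """Shared byte budget: refills at `rate` bytes/s up to `capacity` bytes."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, clock=time.monotonic):
        if rate_bytes_per_s <= 0 or capacity_bytes <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(capacity_bytes)
        self.clock = clock              # injectable so the logic can be tested
        self.tokens = self.capacity     # start with a full bucket
        self.last = clock()
        self.lock = threading.Lock()    # the bucket is shared across threads

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, nbytes):
        """Take nbytes from the bucket if available; return True on success."""
        with self.lock:
            self._refill()
            if nbytes <= self.tokens:
                self.tokens -= nbytes
                return True
            return False

    def wait_time(self, nbytes):
        """Seconds until nbytes would be available (0.0 if available now)."""
        with self.lock:
            self._refill()
            return max(0.0, (nbytes - self.tokens) / self.rate)
```

A fetcher thread would call `try_consume(len(chunk))` before reading each chunk and sleep for `wait_time(...)` when it returns False. The capacity controls how large a burst is allowed before the steady rate applies.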
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
Enzo Michelangeli wrote:
> ----- Original Message -----
> From: Andrzej Bialecki [EMAIL PROTECTED]
> Sent: Monday, June 04, 2007 1:31 AM
>
>> Enzo Michelangeli wrote:
>>> In my case (with Nutch 0.8), it seems not: I set it to 500, and the fetcher still saturates the 1.5 Mbit/s link... Is it supposed to work for the total bandwidth, or for each thread?
>>
>> There's nothing in the current code base to support this, nor is there a config property with that name ... Is this perhaps a part of your local code base?
>
> Er... I saw it mentioned at http://wiki.apache.org/nutch/FetchOptions , so I thought it was for real...

Sorry, this page is wrong and should be corrected - some of the options listed there were either part of an older version of Fetcher (and have been replaced), or they were part of a private patch (as was the case with throttling).

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
[Nutch-general] Is fetcher.throttle.bandwidth known to work?
In my case (with Nutch 0.8), it seems not: I set it to 500, and the fetcher still saturates the 1.5 Mbit/s link... Is it supposed to work for the total bandwidth, or for each thread?

Enzo
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
Enzo Michelangeli wrote:
> In my case (with Nutch 0.8), it seems not: I set it to 500, and the fetcher still saturates the 1.5 Mbit/s link... Is it supposed to work for the total bandwidth, or for each thread?

There's nothing in the current code base to support this, nor is there a config property with that name ... Is this perhaps a part of your local code base?

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: [Nutch-general] Is fetcher.throttle.bandwidth known to work?
----- Original Message -----
From: Andrzej Bialecki [EMAIL PROTECTED]
Sent: Monday, June 04, 2007 1:31 AM

> Enzo Michelangeli wrote:
>> In my case (with Nutch 0.8), it seems not: I set it to 500, and the fetcher still saturates the 1.5 Mbit/s link... Is it supposed to work for the total bandwidth, or for each thread?
>
> There's nothing in the current code base to support this, nor is there a config property with that name ... Is this perhaps a part of your local code base?

Er... I saw it mentioned at http://wiki.apache.org/nutch/FetchOptions , so I thought it was for real...

Enzo