Re: Use of our external site embedded into a Debian file
Package: liquidlnf
Priority: Normal
Version: 2.9.1-2

Please stop using sitetruth.com in your debian/watch -- I think it would be preferable to just disable the watch file until uscan gets https support, if it doesn't already have it (I think 2.10.7 does though).

John, this mail to [EMAIL PROTECTED] will open a bug on the package with the link to your site.

Ian.

On Thu, 2007-09-20 at 22:18 -0700, John Nagle wrote:
> REF: http://db.debian.net/lurker/message/20070707.195201.8e2c00a8.en.html
> Author: Varun Hiremath
> Date: 2007-07-07 12:52 -700
> To: 423669
> CC: control, Torsten Werner
> New-Topics: Processed: uscan: https support
> Subject: Bug#423669: uscan: https support
>
> We noticed a weird usage of our SiteTruth.com site mentioned in a Debian bug report. Bug report #423669 apparently patched a problem by using a link to a CGI script on our site.
>
> We have a system that rates web pages, and as a service for webmasters, we have a little utility, viewer.cgi, which is used to show users how our crawler saw a page. Somebody stuck this into a Debian watchfile because it can be used to read a HTTPS page via HTTP, something they needed.
>
> But viewer.cgi does more than that. It's not a transparent proxy. It truncates pages at 1MB, parses the HTML into a tree, converts to Unicode/UTF-8, makes all the links absolute, removes embedded content (Javascript, Flash, etc.), and outputs the result as cleaned up and properly indented HTML. What you get out isn't quite what went in. So this probably isn't what you want.
>
> SiteTruth really shouldn't be part of some Debian build procedure. We suggest finding some other way to read HTTPS pages with HTTP. Wrong tool for the job. Thanks.
>
> John Nagle
> SiteTruth
> http://www.sitetruth.com
> [EMAIL PROTECTED]

-- 
Ian Campbell

NEWS FLASH!! Today the East German pole-vault champion became the West German pole-vault champion.
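To illustrate John's point that viewer.cgi is not a transparent proxy, here is a minimal Python sketch of the kind of rewriting he describes (truncation, link absolutization, removal of embedded content). It is not SiteTruth's code: the names `CleaningParser`, `clean_page` and `MAX_BYTES` are hypothetical, the UTF-8 decoding is an assumption, and re-indentation is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

MAX_BYTES = 1_000_000  # assumed stand-in for the "1MB" truncation limit


class CleaningParser(HTMLParser):
    """Re-emit HTML with absolute links and without script/object content."""

    DROPPED = {"script", "object"}   # embedded content to strip (illustrative)
    URL_ATTRS = {"href", "src"}      # attributes whose values are made absolute

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.out = []
        self.skip_depth = 0          # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in self.DROPPED:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        parts = [tag]
        for name, value in attrs:
            if value is None:        # bare attribute such as "disabled"
                parts.append(name)
                continue
            if name in self.URL_ATTRS:
                value = urljoin(self.base_url, value)
            parts.append(f'{name}="{value}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag in self.DROPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)


def clean_page(raw, base_url):
    """Truncate, decode, and rewrite a page; the output differs from the input."""
    text = raw[:MAX_BYTES].decode("utf-8", errors="replace")
    parser = CleaningParser(base_url)
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

Even in this simplified form, a relative link such as `dl/foo-1.0.tar.gz` comes back as a full URL and anything past the size limit is lost, so a watch pattern written against the upstream page may not see the same text through such a service.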
Re: Use of our external site embedded into a Debian file
On Fri, 2007-09-21 at 15:41 +1000, Ben Finney wrote:
> > We have a system that rates web pages, and as a service for webmasters, we have a little utility, viewer.cgi, which is used to show users how our crawler saw a page. Somebody stuck this into a Debian watchfile because it can be used to read a HTTPS page via HTTP, something they needed.
>
> Yes, that was the example. It's actually unrelated to the resolution of bug #423669, which was (according to the information in the bug report) fixed by implementing HTTPS properly in the 'uscan' utility.

It wasn't just an example: liquidlnf 2.9.1-2 in unstable references sitetruth.com in debian/watch.

Ian.

-- 
Ian Campbell

You have all eternity to be cautious in when you're dead.
		-- Lois Platford
Re: Use of our external site embedded into a Debian file
[cc-ing John Nagle as context suggests he's not on this list. John, if you are subscribed, please say so and we'll stop cc-ing you.]

John, thanks very much for researching the problem before reporting it. I understand it can be alarming to see that an automated system is accessing your system in what appears to be an inappropriate fashion; thank you for coming to us with information instead of demands :-)

John Nagle [EMAIL PROTECTED] writes:
> We noticed a weird usage of our SiteTruth.com site mentioned in a Debian bug report. Bug report #423669 apparently patched a problem by using a link to a CGI script on our site.

That's not the case. Varun Hiremath is showing an example of a workaround to fetch a file; bug #423669 is unrelated to sitetruth.com.

> We have a system that rates web pages, and as a service for webmasters, we have a little utility, viewer.cgi, which is used to show users how our crawler saw a page. Somebody stuck this into a Debian watchfile because it can be used to read a HTTPS page via HTTP, something they needed.

Yes, that was the example. It's actually unrelated to the resolution of bug #423669, which was (according to the information in the bug report) fixed by implementing HTTPS properly in the 'uscan' utility.

> SiteTruth really shouldn't be part of some Debian build procedure. We suggest finding some other way to read HTTPS pages with HTTP. Wrong tool for the job. Thanks.

You're quite right that it would be foolish to do so. I believe, from reading the bug report, that it was merely being used to demonstrate the problem (lack of HTTPS support), rather than to become part of a package's build procedure.

Do you have reason to believe the sitetruth.com service is still being accessed routinely from Debian build programs?

-- 
Ben Finney

"Today, I was -- no, that wasn't me." -- Steven Wright