Re: Use of our external site embedded into a Debian file

2007-09-21 Thread Ian Campbell
Package: liquidlnf
Priority: Normal
Version: 2.9.1-2

Please stop using sitetruth.com in your debian/watch -- I think it would
be preferable to just disable the watch file until uscan gets https
support, if it doesn't already have it (I think 2.10.7 does though).

John, this mail to [EMAIL PROTECTED] will open a bug on the package with the
link to your site.

Ian.

On Thu, 2007-09-20 at 22:18 -0700, John Nagle wrote:
 REF: http://db.debian.net/lurker/message/20070707.195201.8e2c00a8.en.html
 Author: Varun Hiremath
 Date: 2007-07-07 12:52 -700
 To: 423669
 CC: control, Torsten Werner
 New-Topics: Processed: uscan: https support
 Subject: Bug#423669: uscan: https support
 
 We noticed a weird usage of our SiteTruth.com site mentioned in a
 Debian bug report.  Bug report #423669 apparently patched a problem
 by using a link to a CGI script on our site.
 
 We have a system that rates web pages, and as a service for webmasters,
 we have a little utility, viewer.cgi, which is used to show users how
 our crawler saw a page.  Somebody stuck this into a Debian watchfile
 because it can be used to read an HTTPS page via HTTP, something they needed.
 
 But viewer.cgi does more than that.  It's not a transparent proxy.
 It truncates pages at 1MB, parses the HTML into a tree, converts
 to Unicode/UTF-8, makes all the links absolute, removes embedded
 content (Javascript, Flash, etc.), and outputs the result as cleaned up
 and properly indented HTML.  What you get out isn't quite what went in.
 So this probably isn't what you want.
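 
 To make the point about absolute links concrete, here is a minimal Python
 sketch. It is not SiteTruth's or uscan's code; the page, base URL, and
 pattern are hypothetical. It only shows that a pattern anchored on a
 relative link stops matching once links are rewritten to absolute form.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, optionally resolved against a base URL."""

    def __init__(self, base=None):
        super().__init__()
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # With a base URL, mimic a rewriter that makes links absolute.
                self.links.append(urljoin(self.base, value) if self.base else value)


def collect_links(html, base=None):
    parser = LinkCollector(base)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical upstream page, base URL, and pattern, for illustration only.
PAGE = '<a href="foo-1.2.tar.gz">foo 1.2</a>'
BASE = "https://example.org/downloads/"
PATTERN = re.compile(r"foo-([\d.]+)\.tar\.gz")

original = collect_links(PAGE)          # links as the upstream page wrote them
rewritten = collect_links(PAGE, BASE)   # links after being made absolute

# A pattern that must match the whole link works on the original page only.
anchored = [bool(PATTERN.fullmatch(link)) for link in original + rewritten]
```

 Whether a real watch file breaks depends on how its pattern is written;
 the sketch only illustrates that what comes out is not what went in.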
 
 SiteTruth really shouldn't be part of some Debian build procedure.
 We suggest finding some other way to read HTTPS pages with HTTP.
 Wrong tool for the job.  Thanks.
 
   John Nagle
   SiteTruth
   http://www.sitetruth.com
   [EMAIL PROTECTED]
 
 
-- 
Ian Campbell

NEWS FLASH!!
Today the East German pole-vault champion became the West German pole-vault
champion.




Re: Use of our external site embedded into a Debian file

2007-09-21 Thread Ian Campbell
On Fri, 2007-09-21 at 15:41 +1000, Ben Finney wrote:
 
  We have a system that rates web pages, and as a service for
  webmasters, we have a little utility, viewer.cgi, which is used to
  show users how our crawler saw a page.  Somebody stuck this into a
  Debian watchfile because it can be used to read an HTTPS page via
  HTTP, something they needed.
 
 Yes, that was the example. It's actually unrelated to the resolution
 of bug #423669, which was (according to the information in the bug
 report) fixed by implementing HTTPS properly in the 'uscan' utility.

It wasn't just an example: liquidlnf 2.9.1-2 in unstable references
sitetruth.com in debian/watch.

Ian.
-- 
Ian Campbell

You have all eternity to be cautious in when you're dead.
-- Lois Platford




Re: Use of our external site embedded into a Debian file

2007-09-20 Thread Ben Finney
[cc-ing John Nagle as context suggests he's not on this list. John, if
you are subscribed, please say so and we'll stop cc-ing you.]

John, thanks very much for researching the problem before reporting
it. I understand it can be alarming to see that an automated system is
accessing your system in what appears to be an inappropriate fashion;
thank you for coming to us with information instead of demands :-)

John Nagle [EMAIL PROTECTED] writes:

 We noticed a weird usage of our SiteTruth.com site mentioned in a
 Debian bug report.  Bug report #423669 apparently patched a problem
 by using a link to a CGI script on our site.

That's not the case. Varun Hiremath is showing an example of a
workaround to fetch a file; bug #423669 is unrelated to sitetruth.com.

 We have a system that rates web pages, and as a service for
 webmasters, we have a little utility, viewer.cgi, which is used to
 show users how our crawler saw a page.  Somebody stuck this into a
 Debian watchfile because it can be used to read an HTTPS page via
 HTTP, something they needed.

Yes, that was the example. It's actually unrelated to the resolution
of bug #423669, which was (according to the information in the bug
report) fixed by implementing HTTPS properly in the 'uscan' utility.

 SiteTruth really shouldn't be part of some Debian build procedure.
 We suggest finding some other way to read HTTPS pages with HTTP.
 Wrong tool for the job.  Thanks.

You're quite right that it would be foolish to do so. I believe, from
reading the bug report, that it was merely being used to demonstrate
the problem (lack of HTTPS support), rather than to become part of a
package's build procedure.

Do you have reason to believe the sitetruth.com service is still being
accessed routinely from Debian build programs?

-- 
 \ Today, I was -- no, that wasn't me.  -- Steven Wright |
  `\   |
_o__)  |
Ben Finney

