Re: [Rd] Connections to https: URLs -- IE expert help needed

2007-01-06 Thread Duncan Temple Lang


Prof Brian Ripley wrote:
> On Mon, 1 Jan 2007, Duncan Temple Lang wrote:
> 
>> Kurt Hornik wrote:
>>> Duncan Temple Lang writes:
>>>> Prof Brian Ripley wrote:
>>>>> I've added to R-devel the ability to use download.file() and url() to
>>>>> https: URLs, *only* if --internet2 is used on Windows.
>>>>>
>>>>> This uses the Internet Explorer internals, and only works if the
>>>>> certificate is accepted (so e.g. does not work for
>>>>> https://svn.r-project.org).
>>>>>
>>>>> Now I use IE (and Windows for that matter) only when really necessary, and
>>>>> Firefox has simple ways to permanently accept non-verifiable certificates.
>>>>> I would be grateful if someone who is much more familiar with IE could
>>>>> write a note explaining how to deal with this that we could add to the
>>>>> rw-FAQ.
>>>>>
>>>>> To forestall the inevitable question: there are no plans to add https:
>>>>> support on any other platform, but it is something that would make a nice
>>>>> project for a user contribution.  The current internal code is based on
>>>>> libxml2, and that AFAICS still does not have https: support.
>>>>>
>>>> Generally (i.e. not in particular response to Brian but related to
>>>> this thread)
>>> With a similar disclaimer: Brian's efforts were triggered by me asking
>>> how to use url() to read R's mailing list archive files, such as
>>>
>>>   https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
>>>
>>> directly into R.  Turns out we cannot ... which, in a way, is a shame
>>> ("R cannot read its own web pages") :-(
>> Indeed, it is a shame.  Although, when I process mail messages,
>> I use Perl's very rich collection of modules for processing
>> mail in so many different formats. And then I use RSPerl
>> to control this and get the data into R pretty quickly.
>> So we can do it in R and probably the delegation to
>> mail-processing software is a good idea given the number of special
>> cases, etc.
>>
>> And even if we had HTTPS in R, we would still want to deal with
>> the certificate on that page, which gets us to more details.
>> Which is the reason I think leaving things to libcurl,
>> libwww, etc. will be best as they continue to evolve
>> to handle new protocols and settings.
> 
> The issue here is the same as it ever was, that of event-loops and not 
> blocking the R process.  I think that is where the missing extensibility 
> is, and it has been raised for at least 6 years now.

Of course, that is one area where extensibility is needed.
Attempts have been made to address this generally over the last 6
years, but the architecture of, and the focus on, the current numerous
R front-ends is not necessarily ideal for trying to solve this
properly.

But your sentence suggests that the extensibility of the
connection API is not an issue, and we don't agree
on that. I think both issues of extensibility are
relevant. Blocking is important, but not being able
to explore or add new facilities is fundamental
and, I believe, of immense importance. The lack of
extensibility of the R engine at the system level, rather
than in the interpreted language, is a major impediment to
the evolution of R, IMHO.

> 
> If I try to get that example URI with RCurl it
> 
> 1) blocks the R process for a long time.
> 2) fails to retrieve the URI as it is unable to handle the certificate.

2) is, as you would put it, "user error" ;-)
You need to tell libcurl what options you want in the request.
Whether to ignore certificates, where the certificates
are, etc. are query-specific options.
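
As a hedged illustration of such per-request options: ssl.verifypeer and
cainfo are the RCurl names for libcurl's CURLOPT_SSL_VERIFYPEER and
CURLOPT_CAINFO. The helper names tls_options() and fetch_text() are
hypothetical, and the CA bundle path is a placeholder, so treat this as a
sketch rather than a recommended setup.

```r
# Hypothetical helper: build the certificate-related options for one request.
tls_options <- function(ca_file = NULL, verify = TRUE) {
  opts <- list(ssl.verifypeer = verify)
  if (!is.null(ca_file)) opts$cainfo <- ca_file   # path to a CA bundle to trust
  opts
}

# Hypothetical helper: fetch a URI as text via RCurl, passing the options along.
fetch_text <- function(uri, opts = tls_options()) {
  if (!requireNamespace("RCurl", quietly = TRUE))
    stop("the RCurl package is required")
  RCurl::getURL(uri, .opts = opts)
}

# Usage (needs network access, so left commented out):
# fetch_text("https://svn.r-project.org",
#            tls_options(ca_file = "/path/to/ca-bundle.crt"))
# fetch_text("https://svn.r-project.org",
#            tls_options(verify = FALSE))   # skips certificate verification
```

Turning verification off makes the request succeed against a non-verifiable
certificate, at the cost of not checking whom one is talking to; pointing
cainfo at a bundle that contains the right certificate is the safer choice.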

> 
> Can you please point us to an extension package that behaves better?
> 

Well, as regards point 1), libcurl does have facilities for
non-blocking calls via its multi_ interface, and so does RCurl
via the function getURIAsynchronous() and the lower-level functions.
One could also merge the basic libcurl interface
into our select calls. I seem to recall libwww has features we
can also integrate manually into our event loop.

The key thing I am trying to get across is that if we
are going to include these things into R and we have
to do things manually, then we should try to integrate
them in an evolvable, extensible manner that leverages
libraries that do things properly.

> [When Kurt first sent me the example, I was surprised that wget handled 
> it. I then checked, and wget < 1.10 does not check certificates at all.]
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Connections to https: URLs -- IE expert help needed

2007-01-05 Thread Prof Brian Ripley
On Mon, 1 Jan 2007, Duncan Temple Lang wrote:

> Kurt Hornik wrote:
>>> Duncan Temple Lang writes:
>>
>>> Prof Brian Ripley wrote:
>>>> I've added to R-devel the ability to use download.file() and url() to
>>>> https: URLs, *only* if --internet2 is used on Windows.
>>>>
>>>> This uses the Internet Explorer internals, and only works if the
>>>> certificate is accepted (so e.g. does not work for
>>>> https://svn.r-project.org).
>>>>
>>>> Now I use IE (and Windows for that matter) only when really necessary, and
>>>> Firefox has simple ways to permanently accept non-verifiable certificates.
>>>> I would be grateful if someone who is much more familiar with IE could
>>>> write a note explaining how to deal with this that we could add to the
>>>> rw-FAQ.
>>>>
>>>> To forestall the inevitable question: there are no plans to add https:
>>>> support on any other platform, but it is something that would make a nice
>>>> project for a user contribution.  The current internal code is based on
>>>> libxml2, and that AFAICS still does not have https: support.
>>>>
>>
>>> Generally (i.e. not in particular response to Brian but related to
>>> this thread)
>>
>> With a similar disclaimer: Brian's efforts were triggered by me asking
>> how to use url() to read R's mailing list archive files, such as
>>
>>   https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
>>
>> directly into R.  Turns out we cannot ... which, in a way, is a shame
>> ("R cannot read its own web pages") :-(
>
> Indeed, it is a shame.  Although, when I process mail messages,
> I use Perl's very rich collection of modules for processing
> mail in so many different formats. And then I use RSPerl
> to control this and get the data into R pretty quickly.
> So we can do it in R and probably the delegation to
> mail-processing software is a good idea given the number of special
> cases, etc.
>
> And even if we had HTTPS in R, we would still want to deal with
> the certificate on that page, which gets us to more details.
> Which is the reason I think leaving things to libcurl,
> libwww, etc. will be best as they continue to evolve
> to handle new protocols and settings.

The issue here is the same as it ever was, that of event-loops and not 
blocking the R process.  I think that is where the missing extensibility 
is, and it has been raised for at least 6 years now.

If I try to get that example URI with RCurl it

1) blocks the R process for a long time.
2) fails to retrieve the URI as it is unable to handle the certificate.

Can you please point us to an extension package that behaves better?

[When Kurt first sent me the example, I was surprised that wget handled 
it. I then checked, and wget < 1.10 does not check certificates at all.]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Connections to https: URLs -- IE expert help needed

2007-01-01 Thread Duncan Temple Lang
Kurt Hornik wrote:
> > Duncan Temple Lang writes:
> 
> > Prof Brian Ripley wrote:
> >> I've added to R-devel the ability to use download.file() and url() to 
> >> https: URLs, *only* if --internet2 is used on Windows.
> >> 
> >> This uses the Internet Explorer internals, and only works if the 
> >> certificate is accepted (so e.g. does not work for 
> >> https://svn.r-project.org).
> >> 
> >> Now I use IE (and Windows for that matter) only when really necessary, and 
> >> Firefox has simple ways to permanently accept non-verifiable certificates. 
> >> I would be grateful if someone who is much more familiar with IE could 
> >> write a note explaining how to deal with this that we could add to the 
> >> rw-FAQ.
> >> 
> >> To forestall the inevitable question: there are no plans to add https: 
> >> support on any other platform, but it is something that would make a nice 
> >> project for a user contribution.  The current internal code is based on 
> >> libxml2, and that AFAICS still does not have https: support.
> >> 
> 
> > Generally (i.e. not in particular response to Brian but related to
> > this thread)
> 
> With a similar disclaimer: Brian's efforts were triggered by me asking
> how to use url() to read R's mailing list archive files, such as
> 
>   https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
> 
> directly into R.  Turns out we cannot ... which, in a way, is a shame
> ("R cannot read its own web pages") :-(

Indeed, it is a shame.  Although, when I process mail messages,
I use Perl's very rich collection of modules for processing
mail in so many different formats. And then I use RSPerl
to control this and get the data into R pretty quickly.
So we can do it in R and probably the delegation to 
mail-processing software is a good idea given the number of special
cases, etc.

And even if we had HTTPS in R, we would still want to deal with
the certificate on that page, which gets us to more details.
Which is the reason I think leaving things to libcurl,
libwww, etc. will be best as they continue to evolve
to handle new protocols and settings.

 D.

> 
> Best
> -k
> 
> > An alternative is to use RCurl and leave HTTPS and a host of other
> > protocols and details to an external library (e.g. libcurl, libwww,
> > etc.) and an R package that interfaces to it.
> 
> > If we want the facilities to be accessible via the connections
> > interface, then we can make that API extensible by packages.  Jeff
> > Horner has a proposal on that.
> 
> > Generally, it is important if R is to continue to evolve that the R
> > internals become extensible by package developers so that we can do
> > some new experiments and provide alternative implementations of the
> > basic structures rather than being tied to the existing
> > representation.  An object oriented framework underlying the R source
> > code would enable this and would solve numerous problems that have
> > arisen recently and I strongly suspect many more that will arise.
> 
> >  D.
> 
> 

-- 
Duncan Temple Lang                    [EMAIL PROTECTED]
Department of Statistics  work:  (530) 752-4782
4210 Mathematical Sciences Bldg.  fax:   (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA







Re: [Rd] Connections to https: URLs -- IE expert help needed

2007-01-01 Thread Kurt Hornik
> Duncan Temple Lang writes:

> Prof Brian Ripley wrote:
>> I've added to R-devel the ability to use download.file() and url() to 
>> https: URLs, *only* if --internet2 is used on Windows.
>> 
>> This uses the Internet Explorer internals, and only works if the 
>> certificate is accepted (so e.g. does not work for 
>> https://svn.r-project.org).
>> 
>> Now I use IE (and Windows for that matter) only when really necessary, and 
>> Firefox has simple ways to permanently accept non-verifiable certificates. 
>> I would be grateful if someone who is much more familiar with IE could 
>> write a note explaining how to deal with this that we could add to the 
>> rw-FAQ.
>> 
>> To forestall the inevitable question: there are no plans to add https: 
>> support on any other platform, but it is something that would make a nice 
>> project for a user contribution.  The current internal code is based on 
>> libxml2, and that AFAICS still does not have https: support.
>> 

> Generally (i.e. not in particular response to Brian but related to
> this thread)

With a similar disclaimer: Brian's efforts were triggered by me asking
how to use url() to read R's mailing list archive files, such as

  https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz

directly into R.  Turns out we cannot ... which, in a way, is a shame
("R cannot read its own web pages") :-(
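
For illustration, a minimal sketch of the kind of read being asked for,
assuming an R build whose url() accepts https: (at the time of this thread,
only R-devel on Windows started with --internet2). The helper name
read_gz_lines() is hypothetical; gzcon() is used because the archive file is
gzip-compressed.

```r
# Hypothetical helper: read a gzip-compressed text file from a local path
# or a URL. Whether an https: URL works depends on the https: support of
# the R build in use.
read_gz_lines <- function(src) {
  is_remote <- grepl("^(https?|ftp)://", src)
  raw_con <- if (is_remote) url(src, open = "rb") else file(src, open = "rb")
  con <- gzcon(raw_con)   # decompress on the fly
  on.exit(close(con))     # closing the gzcon also closes the wrapped connection
  readLines(con)
}

# Usage (needs network access and https: support, so left commented out):
# msgs <- read_gz_lines("https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz")
# length(msgs)
```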

Best
-k

> An alternative is to use RCurl and leave HTTPS and a host of other
> protocols and details to an external library (e.g. libcurl, libwww,
> etc.) and an R package that interfaces to it.

> If we want the facilities to be accessible via the connections
> interface, then we can make that API extensible by packages.  Jeff
> Horner has a proposal on that.

> Generally, it is important if R is to continue to evolve that the R
> internals become extensible by package developers so that we can do
> some new experiments and provide alternative implementations of the
> basic structures rather than being tied to the existing
> representation.  An object oriented framework underlying the R source
> code would enable this and would solve numerous problems that have
> arisen recently and I strongly suspect many more that will arise.

>  D.




Re: [Rd] Connections to https: URLs -- IE expert help needed

2006-12-20 Thread Duncan Temple Lang
Prof Brian Ripley wrote:
> I've added to R-devel the ability to use download.file() and url() to 
> https: URLs, *only* if --internet2 is used on Windows.
> 
> This uses the Internet Explorer internals, and only works if the 
> certificate is accepted (so e.g. does not work for 
> https://svn.r-project.org).
> 
> Now I use IE (and Windows for that matter) only when really necessary, and 
> Firefox has simple ways to permanently accept non-verifiable certificates. 
> I would be grateful if someone who is much more familiar with IE could 
> write a note explaining how to deal with this that we could add to the 
> rw-FAQ.
> 
> To forestall the inevitable question: there are no plans to add https: 
> support on any other platform, but it is something that would make a nice 
> project for a user contribution.  The current internal code is based on 
> libxml2, and that AFAICS still does not have https: support.
> 

Generally (i.e. not in particular response to Brian but related to this
thread)

An alternative is to use RCurl and leave HTTPS and a host
of other protocols and details to an external library (e.g. libcurl,
libwww, etc.) and an R package that interfaces to it.

If we want the facilities to be accessible via the connections
interface, then we can make that API extensible by packages.
Jeff Horner has a proposal on that.

Generally, it is important if R is to continue to evolve that the R
internals become extensible by package developers so that we can do some
new experiments and provide alternative implementations of the basic
structures rather than being tied to the existing representation.
An object oriented framework underlying the R source code would enable
this and would solve numerous problems that have arisen recently
and I strongly suspect many more that will arise.

 D.


-- 
Duncan Temple Lang                    [EMAIL PROTECTED]
Department of Statistics  work:  (530) 752-4782
4210 Mathematical Sciences Building   fax:   (530) 752-7099
One Shields Ave.
University of California at Davis
Davis,
CA 95616,
USA


[Rd] Connections to https: URLs -- IE expert help needed

2006-12-20 Thread Prof Brian Ripley
I've added to R-devel the ability to use download.file() and url() to 
https: URLs, *only* if --internet2 is used on Windows.

This uses the Internet Explorer internals, and only works if the 
certificate is accepted (so e.g. does not work for 
https://svn.r-project.org).

Now I use IE (and Windows for that matter) only when really necessary, and 
Firefox has simple ways to permanently accept non-verifiable certificates. 
I would be grateful if someone who is much more familiar with IE could 
write a note explaining how to deal with this that we could add to the 
rw-FAQ.

To forestall the inevitable question: there are no plans to add https: 
support on any other platform, but it is something that would make a nice 
project for a user contribution.  The current internal code is based on 
libxml2, and that AFAICS still does not have https: support.
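
For readers who want to try the download.file() route, a minimal sketch
follows. It assumes an R build with https: support as described above; the
helper name fetch_to_file() is hypothetical and not part of R.

```r
# Hypothetical helper: download a URL to a local file and return the path.
# mode = "wb" keeps binary content (e.g. .gz files) intact on Windows.
fetch_to_file <- function(src, dest = tempfile()) {
  status <- download.file(src, destfile = dest, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  dest
}

# Usage (needs network access and https: support, so left commented out):
# f <- fetch_to_file("https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz")
# head(readLines(gzfile(f)))
```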

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
