Re: [Rd] Connections to https: URLs -- IE expert help needed
Prof Brian Ripley wrote:
> On Mon, 1 Jan 2007, Duncan Temple Lang wrote:
>
>> Kurt Hornik wrote:
>>> With a similar disclaimer: Brian's efforts were triggered by me asking
>>> how to use url() to read R's mailing list archive files, such as
>>>
>>> https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
>>>
>>> directly into R. Turns out we cannot ... which, in a way, is a shame
>>> ("R cannot read its own web pages") :-(
>>
>> And even if we had HTTPS in R, we would still want to deal with
>> the certificate on that page, which gets us to more details.
>> Which is the reason I think leaving things to libcurl,
>> libwww, etc. will be best as they continue to evolve
>> to handle new protocols and settings.
>
> The issue here is the same as it ever was, that of event-loops and not
> blocking the R process. I think that is where the missing extensibility
> is, and it has been raised for at least 6 years now.

Of course, that is one area where extensibility is needed. Attempts have
been made to address this generally over the last 6 years, but the
architecture of, and the focus on, the current numerous R front-ends is
not necessarily ideal for trying to solve this properly.

But your sentence suggests that the extensibility of the connection API
is not an issue. And we don't agree on that. I think both issues of
extensibility are relevant. Blocking is important, but not being able to
explore or add new facilities is fundamental and, I believe, of immense
importance. The lack of extensibility of the R engine at the system
level, rather than in the interpreted language, is a major impediment to
the evolution of R, IMHO.

> If I try to get that example URI with RCurl it
>
> 1) blocks the R process for a long time.
> 2) fails to retrieve the URI as it is unable to handle the certificate.

2) is, as you would put it, "user error" ;-) You need to tell libcurl
what options you want in the request. Whether to ignore certificates,
where the certificates are, etc. are query-specific options.

> Can you please point us to an extension package that behaves better?

Well, as regards point 1), libcurl does have facilities for non-blocking
calls, and so does RCurl via the multi_ interface of libcurl and the
function getURIAsynchronous() in RCurl and the lower-level functions.
And one could also merge the basic libcurl interface into our select
calls. I seem to recall libwww has features we can also manually
integrate into our event loop.

The key thing I am trying to get across is that if we are going to
include these things in R and we have to do things manually, then we
should try to integrate them in an evolvable, extensible manner that
leverages libraries that do things properly.

> [When Kurt first sent me the example, I was surprised that wget handled
> it. I then checked, and wget < 1.10 does not check certificates at all.]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
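For readers following the thread, the "query-specific options" and the non-blocking interface mentioned in the message above might look roughly like this in RCurl. This is only a sketch under assumptions, not code posted to the list: the helper names https_opts(), fetch_https() and fetch_many() are hypothetical, the option names ssl.verifypeer and cainfo are the RCurl spellings of the corresponding libcurl options, and nothing here has been run against the URLs in this thread.

```r
# Hypothetical helper: build per-request libcurl options for an HTTPS query.
#   verify - whether libcurl should verify the server certificate
#   cainfo - optional path to a CA bundle that can verify the certificate
https_opts <- function(verify = TRUE, cainfo = NULL) {
  opts <- list(ssl.verifypeer = verify)
  if (!is.null(cainfo)) opts$cainfo <- cainfo
  opts
}

# Hypothetical wrapper: blocking fetch of one URL with those options.
# Assumes the RCurl package is installed; it is only needed when called.
fetch_https <- function(url, ...) {
  RCurl::getURL(url, .opts = https_opts(...))
}

# Hypothetical wrapper: fetch several URLs through libcurl's multi
# interface, via the getURIAsynchronous() function named in the message.
fetch_many <- function(urls, ...) {
  RCurl::getURIAsynchronous(urls, .opts = https_opts(...))
}

# Not run (needs network access and RCurl):
# txt <- fetch_https("https://svn.r-project.org", verify = FALSE)
```

Turning verification off accepts any certificate, so pointing cainfo at a bundle that can actually verify the server is the safer of the two choices.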
Re: [Rd] Connections to https: URLs -- IE expert help needed
On Mon, 1 Jan 2007, Duncan Temple Lang wrote:

> Kurt Hornik wrote:
>> With a similar disclaimer: Brian's efforts were triggered by me asking
>> how to use url() to read R's mailing list archive files, such as
>>
>> https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
>>
>> directly into R. Turns out we cannot ... which, in a way, is a shame
>> ("R cannot read its own web pages") :-(
>
> Indeed, it is a shame. Although, when I process mail messages,
> I use Perl's very rich collection of modules for processing
> mail in so many different formats. And then I use RSPerl
> to control this and get the data into R pretty quickly.
> So we can do it in R and probably the delegation to
> mail-processing software is a good idea given the number of special
> cases, etc.
>
> And even if we had HTTPS in R, we would still want to deal with
> the certificate on that page, which gets us to more details.
> Which is the reason I think leaving things to libcurl,
> libwww, etc. will be best as they continue to evolve
> to handle new protocols and settings.

The issue here is the same as it ever was, that of event-loops and not
blocking the R process. I think that is where the missing extensibility
is, and it has been raised for at least 6 years now.

If I try to get that example URI with RCurl it

1) blocks the R process for a long time.
2) fails to retrieve the URI as it is unable to handle the certificate.

Can you please point us to an extension package that behaves better?

[When Kurt first sent me the example, I was surprised that wget handled
it. I then checked, and wget < 1.10 does not check certificates at all.]

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [Rd] Connections to https: URLs -- IE expert help needed
Kurt Hornik wrote:
> Duncan Temple Lang writes:
>
>> Generally (i.e. not in particular response to Brian but related to
>> this thread)
>
> With a similar disclaimer: Brian's efforts were triggered by me asking
> how to use url() to read R's mailing list archive files, such as
>
> https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz
>
> directly into R. Turns out we cannot ... which, in a way, is a shame
> ("R cannot read its own web pages") :-(

Indeed, it is a shame. Although, when I process mail messages,
I use Perl's very rich collection of modules for processing
mail in so many different formats. And then I use RSPerl
to control this and get the data into R pretty quickly.
So we can do it in R and probably the delegation to
mail-processing software is a good idea given the number of special
cases, etc.

And even if we had HTTPS in R, we would still want to deal with
the certificate on that page, which gets us to more details.
Which is the reason I think leaving things to libcurl,
libwww, etc. will be best as they continue to evolve
to handle new protocols and settings.

D.

> Best
> -k

-- 
Duncan Temple Lang                [EMAIL PROTECTED]
Department of Statistics          work: (530) 752-4782
4210 Mathematical Sciences Bldg.  fax:  (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
Re: [Rd] Connections to https: URLs -- IE expert help needed
Duncan Temple Lang writes:

> Prof Brian Ripley wrote:
>> I've added to R-devel the ability to use download.file() and url() to
>> https: URLs, *only* if --internet2 is used on Windows.

> Generally (i.e. not in particular response to Brian but related to
> this thread)

With a similar disclaimer: Brian's efforts were triggered by me asking
how to use url() to read R's mailing list archive files, such as

https://stat.ethz.ch/pipermail/r-help/2007-January.txt.gz

directly into R. Turns out we cannot ... which, in a way, is a shame
("R cannot read its own web pages") :-(

Best
-k
Re: [Rd] Connections to https: URLs -- IE expert help needed
Prof Brian Ripley wrote:
> I've added to R-devel the ability to use download.file() and url() to
> https: URLs, *only* if --internet2 is used on Windows.
>
> To forestall the inevitable question: there are no plans to add https:
> support on any other platform, but it is something that would make a nice
> project for a user contribution. The current internal code is based on
> libxml2, and that AFAICS still does not have https: support.

Generally (i.e. not in particular response to Brian but related to
this thread)

An alternative is to use RCurl and leave HTTPS and a host of other
protocols and details to an external library (e.g. libcurl, libwww,
etc.) and an R package that interfaces to it.

If we want the facilities to be accessible via the connections
interface, then we can make that API extensible by packages. Jeff
Horner has a proposal on that.

Generally, it is important if R is to continue to evolve that the R
internals become extensible by package developers so that we can do
some new experiments and provide alternative implementations of the
basic structures rather than being tied to the existing
representation. An object-oriented framework underlying the R source
code would enable this and would solve numerous problems that have
arisen recently and, I strongly suspect, many more that will arise.

D.

-- 
Duncan Temple Lang                   [EMAIL PROTECTED]
Department of Statistics             work: (530) 752-4782
4210 Mathematical Sciences Building  fax:  (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
[Rd] Connections to https: URLs -- IE expert help needed
I've added to R-devel the ability to use download.file() and url() to
https: URLs, *only* if --internet2 is used on Windows.

This uses the Internet Explorer internals, and only works if the
certificate is accepted (so e.g. does not work for
https://svn.r-project.org).

Now I use IE (and Windows for that matter) only when really necessary, and
Firefox has simple ways to permanently accept non-verifiable certificates.
I would be grateful if someone who is much more familiar with IE could
write a note explaining how to deal with this that we could add to the
rw-FAQ.

To forestall the inevitable question: there are no plans to add https:
support on any other platform, but it is something that would make a nice
project for a user contribution. The current internal code is based on
libxml2, and that AFAICS still does not have https: support.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595