Re: [Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-05 Thread Xinyi
Apologies for the typo in my original email. I meant “quiet=1”, and it was
working. I typed the log by hand rather than copying it, which is how the
typo crept in.

But as stated in my reasoning, it hides all the output, including other
useful information such as where the library is installed and the compile
details, which is useful for debugging. That is, quiet=1 can be a workaround
but not a real solution.
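
One way to keep the install log while silencing only the download step might
be to split the two: download quietly, then install from the local file. The
sketch below is an illustration only. install_with_quiet_download is a
made-up helper name, it handles source packages only, and it does not resolve
dependencies. Warnings or errors from a failed download may still contain
the URL.

```r
# Hypothetical helper: quiet download, verbose install.
install_with_quiet_download <- function(pkg, destdir = tempdir(), ...) {
  # `quiet = TRUE` is forwarded through `...` to download.file(), so the
  # "trying URL" line is not printed.
  dl <- utils::download.packages(pkg, destdir = destdir, type = "source",
                                 quiet = TRUE, ...)
  if (nrow(dl) == 0L) stop("download failed for ", pkg)
  # Column 2 holds the local file path. With repos = NULL the package is
  # installed from that file, so the usual build log is still shown, but
  # dependencies are NOT installed automatically.
  utils::install.packages(dl[, 2L], repos = NULL, type = "source")
  invisible(dl)
}
```

Usage would be install_with_quiet_download("zoo"), with getOption("repos")
pointing at the protected repository.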

I would have known it was a typo if I had actually tried “quite”, as it would
have printed an error telling me “quite” is not recognised ;)

Cheers,
Xinyi




Re: [Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-05 Thread Martin Maechler
> Simon Urbanek 
> on Sun, 4 Feb 2024 10:33:34 +1300 writes:

> Any reason why you didn't use quiet=TRUE to suppress that
> output?  

He wrote 'quite' instead of 'quiet' {see cited below '1. quite=1'}
and probably never tried the correct spelling ...

> There is no official API structure for
> credentials in R repositories, so R has no way of knowing
> which part of the URL are credentials as it is not under
> R's purview - it could be part of the path or anything, so
> there is no way R can reliably mask it. Hence it makes
> more sense for the user to suppress the output if they
> think it may contain sensitive information - and R
> supports that.

> If that's still not enough, then please make a concrete
> proposal that defines exactly what kind processing you'd
> like to see under what conditions - and how you think that
> will solve the problem.

> Cheers, Simon



>> On Feb 2, 2024, at 5:28 AM, Xinyi 
>> wrote:
>> 
>> Hi all,
>> 
>> When trying to install a package from R using
>> install.packages(), it will print out the full url
>> address (of the remote repository) it was trying to
>> access. A bit further digging shows it is from the
>> in_do_curlDownload method from R's libcurl
>> <https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c>:
>> install.packages() calls download.packages(), and
>> download.packages() calls download.file(), which uses
>> "libcurl" as its default method.
>> 
>> This line from R mirror
>> <https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c#L772>
>> ("if (!quiet) REprintf(_("trying URL '%s'\n"), url);")
>> prints the full url it is trying to access.
>> 
>> This is totally fine for public urls without credentials,
>> but in the case that a given url contains an API key, it
>> poses security issues. For example, if the
>> getOption("repos") has been overridden to a customized
>> repository (protected by API keys), then
>>> install.packages("zoo")
>> Installing packages into '--removed local directory path--'
>> trying URL 'https://--removed userid--:--removed
>> api-ke...@repository-addresss.com:4443/.../src/contrib/zoo_1.8-12.tar.gz'
>> Content type 'application/x-gzip' length 782344 bytes (764 KB)
>> ===
>> downloaded 764 KB
>>
>> * installing *source* package 'zoo' ...
>> -- further logs removed --
>>>
>> 
>> I also tried several other options:
>> 
>> 1. quite=1
>>> install.packages("zoo", quite=1)
>> It did hide the url, but it also hid all other useful
>> information.
>> 2. method="curl"
>>> install.packages("zoo", method="curl")
>> This does not print the url when the download is
>> successful, but if there were any errors, it still prints
>> the url with API key in it.
>> 3. method="wget"
>>> install.packages("zoo", method="wget")
>> This hides API key by *password*, but I wasn't able to
>> install packages with this method even with public repos,
>> with the error "Warning: unable to access index for
>> repository https://cloud.r-project.org/src/contrib/4.3:
>> 'wget' call had nonzero exit status"
>> 
>> 
>> In other dynamic languages' package managers like
>> Python's pip, API keys are hidden by default since pip
>> 18.x in 2018, and masked by "" from pip 19.x in 2019,
>> see below examples. Can we get a similar default
>> behaviour in R?
>> 
>> 1. with pip 10.x
>> $ pip install numpy -v # API key was not hided
>> Looking in indexes: https://--removed userid--:--removed
>> api-ke...@repository-addresss.com:4443/.../pypi/simple
>> 2. with pip 18.x # All credentials are removed by pip
>> $ pip install numpy -v
>> Looking in indexes: https://repository-addresss.com:4443/.../pypi/simple
>> 3. with pip 19.x onwards # userid is kept, API key is replaced by 
>> $ pip install numpy -v
>> Looking in indexes: https://userid:@repository-addresss.com:4443/.../pypi/simple
>> 
>> 
>> I was instructed by https://www.r-project.org/bugs.html
>> that I should get some discussion on r-devel before
>> filing a feature request. So looking forward to
>> comments/suggestions.
>> 

> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-04 Thread Kevin Ushey
For cases like these, I think it would be more useful to have some
mechanism for associating URLs / hosts with credentials, and have R use
those credentials by default whenever accessing those URLs. Since
download.file() now supports custom headers, this could be a mechanism for
setting headers to be used by default when downloading files from
particular URLs. Then, users could do something like:

options(download.file.headers = list(
example.org = c(Authorization = "<...>")
))

And those headers would be used automatically by download.file() whenever
talking to that server.
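
As a rough sketch of how such a lookup could behave today, the wrapper below
reads the proposed option and passes the matching headers to the existing
`headers` argument of download.file(). Base R does not consult a
download.file.headers option; that name comes from the proposal above, and
url_host(), default_headers() and download_file_auth() are hypothetical
helpers. The host parsing is deliberately simple and ignores IPv6 literals.

```r
# Hypothetical option (not read by base R): host name -> named headers.
options(download.file.headers = list(
  "example.org" = c(Authorization = "Bearer <token>")
))

# Extract the host from scheme://[userinfo@]host[:port]/path
url_host <- function(url) {
  rest      <- sub("^[A-Za-z][A-Za-z0-9+.-]*://", "", url)  # drop scheme
  authority <- sub("[/?#].*$", "", rest)                     # keep authority
  hostport  <- sub("^.*@", "", authority)                    # drop userinfo
  sub(":[0-9]*$", "", hostport)                              # drop port
}

# Headers registered for the URL's host, or NULL when there are none.
default_headers <- function(url) {
  getOption("download.file.headers", list())[[url_host(url)]]
}

# Wrapper: fill in `headers` only when the caller did not supply any.
download_file_auth <- function(url, destfile, ..., headers = NULL) {
  if (is.null(headers)) headers <- default_headers(url)
  utils::download.file(url, destfile, ..., headers = headers)
}
```

With the credentials carried in a header, the URL printed by download.file()
no longer contains anything secret.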

All of that to say -- I think the better way forward would be to make it
easier to safely use authentication credentials in download.file(), rather
than just tooling in support for suppressing specific types of output.

Best,
Kevin




Re: [Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-03 Thread Simon Urbanek
Any reason why you didn't use quiet=TRUE to suppress that output?

There is no official API structure for credentials in R repositories, so R has
no way of knowing which parts of the URL are credentials as it is not under R's
purview - it could be part of the path or anything, so there is no way R can
reliably mask it. Hence it makes more sense for the user to suppress the output
if they think it may contain sensitive information - and R supports that.

If that's still not enough, then please make a concrete proposal that defines
exactly what kind of processing you'd like to see under what conditions - and
how you think that will solve the problem.

Cheers,
Simon






Re: [Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-01 Thread Duncan Murdoch
I've just been reading 
https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication, and it 
states that putting userid:password in the URL is deprecated, but it 
does make sense that R should protect users who still use that scheme.
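
For the userid:password form specifically, masking could look roughly like
the sketch below. This is not what R does; mask_url_password() is a
hypothetical helper and the "****" mask is an assumed default. It only
recognises credentials in the userinfo part of the URL, so a key embedded in
the path or query string passes through unchanged.

```r
# Hypothetical helper: mask the password in scheme://user:password@host/...
mask_url_password <- function(url, mask = "****") {
  # \\1 keeps "scheme://user"; everything between ":" and "@" is replaced.
  # A host:port without userinfo does not match because no "@" follows.
  sub("^([A-Za-z][A-Za-z0-9+.-]*://[^/?#@:]*):[^/?#@]*@",
      paste0("\\1:", mask, "@"), url)
}

# Password in the userinfo part is masked.
mask_url_password("https://userid:secret@example.org:4443/src/contrib/zoo_1.8-12.tar.gz")

# A token in the path is not detected and is returned unchanged.
mask_url_password("https://example.org/secret-token/src/contrib")
```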


Duncan Murdoch



[Rd] [Feature Request] Hide API Key in download.file() / R's libcurl

2024-02-01 Thread Xinyi
Hi all,

When installing a package from R using install.packages(), R prints the
full URL (of the remote repository) it is trying to access. A bit of further
digging shows that this comes from the in_do_curlDownload method from R's
libcurl
<https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c>:
install.packages() calls download.packages(), and download.packages() calls
download.file(), which uses "libcurl" as its default method.

This line from the R mirror
<https://github.com/wch/r-source/blob/trunk/src/modules/internet/libcurl.c#L772>
("if (!quiet) REprintf(_("trying URL '%s'\n"), url);") prints the full URL
it is trying to access.

This is totally fine for public URLs without credentials, but when a URL
contains an API key, it poses a security issue. For example, if
getOption("repos") has been overridden to point to a custom repository
(protected by API keys), then
> install.packages("zoo")
Installing packages into '--removed local directory path--'
trying URL 'https://--removed userid--:--removed
api-ke...@repository-addresss.com:4443/.../src/contrib/zoo_1.8-12.tar.gz  '
Content type 'application/x-gzip' length 782344 bytes (764 KB)
===
downloaded 764 KB

* installing *source* package 'zoo' ...
-- further logs removed --
>

I also tried several other options:

1. quite=1
> install.packages("zoo", quite=1)
It did hide the url, but it also hid all other useful information.
2. method="curl"
> install.packages("zoo", method="curl")
This does not print the url when the download is successful, but if there
were any errors, it still prints the url with API key in it.
3. method="wget"
> install.packages("zoo", method="wget")
This masks the API key as *password*, but I wasn't able to install packages
with this method even from public repos; it failed with the error "Warning:
unable to access index for repository
https://cloud.r-project.org/src/contrib/4.3:
'wget' call had nonzero exit status"


In package managers for other dynamic languages, such as Python's pip, API
keys have been hidden by default since pip 18.x in 2018, and masked by ""
from pip 19.x in 2019; see the examples below. Can we get a similar default
behaviour in R?

1. with pip 10.x
$ pip install numpy -v # API key was not hidden
Looking in indexes:  https://--removed userid--:--removed
api-ke...@repository-addresss.com:4443/.../pypi/simple
2. with pip 18.x # All credentials are removed by pip
$ pip install numpy -v
Looking in indexes:  https://repository-addresss.com:4443/.../pypi/simple
3. with pip 19.x onwards # userid is kept, API key is replaced by 
$ pip install numpy -v
Looking in indexes:  https://userid:@repository-addresss.com:4443/.../pypi/simple


I was instructed by https://www.r-project.org/bugs.html that I should get
some discussion on r-devel before filing a feature request, so I am looking
forward to comments and suggestions.

Thanks,
Xinyi
