Re: [R-pkg-devel] CRAN rules re. web scraping?
Hi Spencer,

To add to what Roy has already provided: if you have tests that require Internet access, you should use skip_on_cran() for those tests, and in your examples use the \donttest{} tag, to prevent errors on the CRAN servers when Internet access is not available, the server is not responding, or the resource is unavailable.

Using tryCatch() will help the end-user experience, but it will not completely fix the issue that is being raised here.
Re: [R-pkg-devel] CRAN rules re. web scraping?
Thanks very much to Iñaki Ucar, Adam H Sparks, and Roy Mendelssohn for their replies, which helped me understand what I needed to do to fix the problems identified in the CRAN checks.

I believe that those problems are now fixed in the development version of Ecfun, available at "https://github.com/sbgraves237/Ecfun". The package still needs more work, but I will make Prof. Ripley's Feb. 4 deadline.

Thanks again,
Spencer Graves

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
Re: [R-pkg-devel] CRAN rules re. web scraping?
The message from Prof. Ripley is crystal-clear, and exposes two issues (Internet access, writing files) that have been discussed many times on this list. A quick scan of the CRAN policy [1] yields:

- Packages which use Internet resources should fail gracefully with an informative message if the resource is not available (and not give a check warning nor error).

- Packages should not write in the user’s home filespace (including clipboards), nor anywhere else on the file system apart from the R session’s temporary directory.

[1] https://cran.r-project.org/web/packages/policies.html

Iñaki

--
Iñaki Úcar
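The second policy item quoted above (write only to the R session's temporary directory) might look like this in practice. This is only a minimal sketch, not Ecfun's actual code: write_results() is a hypothetical helper, the toy data frame is made up, and the default file name is borrowed from the example in the check log.

```r
# Hypothetical helper (not part of Ecfun): the default output location is
# inside the R session's temporary directory, so nothing is written to the
# user's working directory unless the caller explicitly passes another path.
write_results <- function(results,
                          file = file.path(tempdir(), "testURLresults.csv")) {
  utils::write.csv(results, file = file, row.names = FALSE)
  invisible(file)  # return the path so the caller knows where the file went
}

# Toy data standing in for real URL test results
res <- data.frame(site = c("PVI", "house"), ok = c(TRUE, FALSE))

path <- write_results(res)     # written under tempdir()
dat  <- utils::read.csv(path)  # read back from the same path
```

Returning the path means a companion reader never has to guess where the file is, and a user who really wants the file elsewhere can ask for that by supplying the file argument.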
Re: [R-pkg-devel] CRAN rules re. web scraping?
Hi Spencer:

I think that message means what it says, and I read it as pretty straightforward and businesslike. The issue is not web scraping. There are two errors here:

1. You cannot write to the user's space without first explicitly asking permission of the user. The suggested policy is to write to a temp directory; R has tempdir() and related functions for this.

2. When accessing something over the Internet, failure of the access must be checked for, and the program must exit gracefully. The second error appears to be that at times on the builds the .csv file is not downloaded, but there is no check; an error is simply thrown. There are a number of ways to catch such errors, such as tryCatch(), which will solve this problem.

HTH,

-Roy

**
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov
www: https://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
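Roy's second point (check for failure of the access and exit gracefully) could be sketched roughly as follows. This is an illustration under assumptions, not the fix that went into Ecfun: safe_download() is a hypothetical helper, and the file:// URL below is a made-up path that does not exist, used only so the failure branch can be shown without a network connection.

```r
# Hypothetical helper (not part of Ecfun): download `url` to a temporary
# file and return the local path. If anything goes wrong, emit an
# informative message and return NULL instead of throwing an error.
safe_download <- function(url, destfile = tempfile(fileext = ".csv")) {
  status <- tryCatch(
    utils::download.file(url, destfile, quiet = TRUE),
    error = function(e) {
      message("Could not download ", url, ": ", conditionMessage(e))
      -1L
    },
    warning = function(w) {
      message("Problem downloading ", url, ": ", conditionMessage(w))
      -1L
    }
  )
  # download.file() returns 0 on success; also confirm the file is there.
  if (!isTRUE(status == 0) || !file.exists(destfile)) {
    return(invisible(NULL))
  }
  destfile
}

# A made-up local URL that does not exist stands in for an unavailable
# resource, so this call takes the failure branch.
path <- safe_download("file:///no/such/dir/results.csv")
if (is.null(path)) {
  message("Resource unavailable; skipping the rest of the example.")
} else {
  dat <- utils::read.csv(path)
}
```

The caller tests for NULL and stops quietly, so an unreachable server produces a message rather than a check error.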
[R-pkg-devel] CRAN rules re. web scraping?
Hello, All:

GOOD NEWS AND BAD NEWS:

* First the good news: I heard from Brian Ripley; see below. His web site says, "He retired in August 2014 on grounds of ill health." (http://www.stats.ox.ac.uk/~ripley/) I was pleased to see that he seems to be well enough to send me the email below.

* BAD NEWS: My Ecfun package is violating current CRAN rules regarding "not writing anywhere in the file space". (See below.)

QUESTION:

How do you suggest I respond to this?

It's hard for me to fix, because I cannot replicate the error and I don't understand the rules Prof. Ripley is trying to enforce. The "CRAN Package Check Results for" this package show an error on 1 platform (r-devel-linux-x86_64-fedora-gcc), NOTEs on 3 platforms (Fedora-clang and Debian), and "OK" on 9 others. I can program selected tests not to run on CRAN, e.g., with (!fda::CRAN()).

However, I suspect I should be able to do better than that.

Suggestions?

Thanks,
Spencer Graves

p.s. The development version of this package is available at "https://github.com/sbgraves237/Ecfun".

https://cloud.r-project.org/web/checks/check_results_Ecfun.html

Forwarded Message
Subject: CRAN package Ecfun
Date: Tue, 21 Jan 2020 21:26:02 +
From: Prof Brian Ripley
Reply-To: CRAN
To: Spencer Graves
CC: CRAN

This has been intermittently failing its checks for a week: different check runs failed (in the 24h prior to) the 14th, 15th, 17th and today. The current failure is

Check: examples
Result: ERROR
Running examples in ‘Ecfun-Ex.R’ failed
The error most likely occurred in:

> ### Name: read.testURLs
> ### Title: Read a file produced by testURLs
> ### Aliases: read.testURLs
> ### Keywords: IO
>
> ### ** Examples
>
> # Test only 2 web sites, not the default 4,
> # and test only twice, not the default 10 times:
> tst <- testURLs(c(
+ PVI="http://en.wikipedia.org/wiki/Cook_Partisan_Voting_Index",
+ house="http://house.gov/representatives"),
+ n=2, maxFail=2)
1
1579634784, PVI, TRUE 0.828
1579634785, house, FALSE 0.051
1579634785, house, FALSE 0.048
2
1579634785, PVI, TRUE 0.043
1579634785, house, FALSE 0.11
1579634785, house, FALSE 0.035
>
> # The above should have created a file 'testURLresults.csv'
> # in the working directory. Read it.
>
> dat <- read.testURLs()
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
Calls: read.testURLs -> read.csv -> read.table

That does not conform to the policy on Internet access, not least as no attempt is made to check if the file was created, let alone that it has the expected layout. Nor does it conform to the policy on not writing anywhere in the file space (and that shows on its CRAN results page too).

Please correct ASAP and before Feb 4 to safely retain the package on CRAN.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
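Prof. Ripley's remark that "no attempt is made to check if the file was created, let alone that it has the expected layout" suggests a more defensive reader. The following is only a sketch of that idea and not the real read.testURLs(): read_results() is a hypothetical function, and the expected column names are invented for illustration.

```r
# Hypothetical reader (not Ecfun's read.testURLs): return the data only if
# the file exists, parses, and has exactly the expected column names.
# Otherwise emit a message and return NULL instead of raising an error.
read_results <- function(file,
                         expected = c("time", "site", "ok", "elapsed")) {
  if (!file.exists(file)) {
    message("File not found: ", file)
    return(invisible(NULL))
  }
  dat <- tryCatch(
    utils::read.csv(file, stringsAsFactors = FALSE),
    error = function(e) {
      # e.g. "more columns than column names" from read.table()
      message("Could not parse ", file, ": ", conditionMessage(e))
      NULL
    }
  )
  if (is.null(dat)) {
    return(invisible(NULL))
  }
  if (!identical(names(dat), expected)) {
    message("Unexpected columns in ", file)
    return(invisible(NULL))
  }
  dat
}

# A file that was never created is reported, not treated as an error.
missing_dat <- read_results(file.path(tempdir(), "no-such-file.csv"))
```

With this shape, an example that depends on an earlier download can test for NULL and stop quietly when the file is absent or malformed.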