[Rd] write.csv performance improvements?

2023-03-29 Thread Toby Hocking
Dear R-devel, I did a systematic comparison of write.csv with similar functions, and observed two asymptotic inefficiencies that could be improved. 1. write.csv is quadratic time (N^2) in the number of columns N. Can write.csv be improved to use a linear time algorithm, so it can handle CSV files

[Rd] read.csv quadratic time in number of columns

2023-03-29 Thread Toby Hocking
Dear R-devel, A number of people have observed anecdotally that read.csv is slow for large number of columns, for example: https://stackoverflow.com/questions/7327851/read-csv-is-extremely-slow-in-reading-csv-files-with-large-numbers-of-columns I did a systematic comparison of read.csv with

Re: [Bioc-devel] httr::GET() problem downloading a ExperimentHub resource

2023-03-29 Thread Martin Morgan
Some more not-necessarily helpful observations. You can get verbose output with curl::curl_fetch_disk(url, tempfile(), handle = new_handle(verbose = TRUE)) and on the command line with curl -v -L … Also, it seems that other BAM files can be downloaded, e.g., from eh[["EH3502"]] (also

Re: [Bioc-devel] httr::GET() problem downloading a ExperimentHub resource

2023-03-29 Thread Robert Castelo
good catch, but really enigmatic, BAI files work, but BAM don't: dat <- read.csv("https://raw.githubusercontent.com/functionalgenomics/gDNAinRNAseqData/devel/inst/extdata/metadata_LiYu22subsetBAMfiles.csv") rdatapath <- strsplit(dat$RDataPath, ":") bamfiles <- unlist(rdatapath)[seq(1, 18, 2)]

Re: [Bioc-devel] httr::GET() problem downloading a ExperimentHub resource

2023-03-29 Thread Martin Morgan
Not really helpful but this could be simplified a bit by removing the redirect from experiment hub, and the layer from httr to curl, so url = "https://functionalgenomics.upf.edu/experimenthub/gdnainrnaseqdata/LiYu22subsetBAMfiles/s32gDNA0.bam" curl::curl_fetch_disk(url, tempfile()) Error in

[Bioc-devel] rowSums, colSums, rowMeans, colMeans generics moved from BiocGenerics to MatrixGenerics

2023-03-29 Thread Hervé Pagès
Hi developers, A couple of days ago I moved the rowSums, colSums, rowMeans, colMeans generics from *BiocGenerics* to *MatrixGenerics*, and this seems to break a lot of packages on today's build report for devel, sorry for that. I didn't have time to look closely at the damage caused by this

[Bioc-devel] httr::GET() problem downloading a ExperimentHub resource

2023-03-29 Thread Robert Castelo
hi, we recently added a few new ExperimentHub resources, consisting of BAM files and their corresponding BAI files and hosted in my own server. while it seems that they are accessible, they cannot be downloaded through the ExperimentHub API. the minimum example reproducing the problem is this

[Bioc-devel] Important Bioconductor Release Deadlines

2023-03-29 Thread Kern, Lori
Please remember, the Bioconductor 3.16 branch will be frozen Monday April 10th. After that date, no changes will be permitted ever on that branch. The deadline for devel Bioconductor 3.17 for packages to pass R CMD build and R CMD check is April 21st. While you will still be able to make

Re: [Rd] Incorrect behavior of ks.test and psmirnov functions with exact=TRUE

2023-03-29 Thread Kurt Hornik
> Alexey Sergushichev writes: Thanks. This is now fixed for the upcoming 4.3.0 release. Best -k > HI, > I've noticed what I think is an incorrect behavior of stats::psmirnov > function and consequently of ks.test when run in an exact mode. > For example: > psmirnov(1, sizes=c(50, 50),