that's great, thank you so much Lori!

robert.

On 4/5/24 15:02, Kern, Lori wrote:
> I found the bug. Testing and pushing up a fix.
>
> Cheers,
>
> Lori Shepherd - Kern
> Bioconductor Core Team
> Roswell Park Comprehensive Cancer Center
> Department of Biostatistics & Bioinformatics
> Elm & Carlton Streets
> Buffalo, New York 14263
>
> ------------------------------------------------------------------------
> *From:* Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
> Kern, Lori via Bioc-devel <bioc-devel@r-project.org>
> *Sent:* Friday, April 5, 2024 8:15 AM
> *To:* Robert Castelo <robert.cast...@upf.edu>;
> bioc-devel@r-project.org <bioc-devel@r-project.org>
> *Subject:* Re: [Bioc-devel] duplicated entries with
> 'ExperimentHub(localHub=TRUE)'
>
> I will have to look at how offline changes the loading of the files.
> That is an odd and unexpected behavior.
>
> They aren't actually duplicate files, what is happening is it is
> displaying the entry for the bam file (.bam) and the index file (.bai)
> as separate entries when offline instead of associating them as one entry.
>
> I'll investigate more.
>
> Lori Shepherd - Kern
>
> ________________________________
> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
> Robert Castelo <robert.cast...@upf.edu>
> Sent: Thursday, April 4, 2024 2:40 PM
> To: bioc-devel@r-project.org <bioc-devel@r-project.org>
> Subject: [Bioc-devel] duplicated entries with
> 'ExperimentHub(localHub=TRUE)'
>
> hi,
>
> I'm getting duplicated entries when loading **offline** previously
> cached ExperimentHub resources. This code reproduces the problem:
>
> 1. If in a fresh empty cache of ExperimentHub I download 9 resources
> through the gDNAinRNAseqData package:
>
> library(gDNAinRNAseqData)
>
> bamfiles <- LiYu22subsetBAMfiles()
> length(bamfiles)
> [1] 9
>
> 2. Try to load them again from the local cache either going offline or
> using the 'offline=TRUE' argument to the loader function, which sets
> 'localHub=TRUE' in the call to 'ExperimentHub()':
>
> bamfiles <- LiYu22subsetBAMfiles(offline=TRUE)
> Using 'localHub=TRUE'
> If offline, please also see BiocManager vignette section on offline use
> snapshotDate(): 2024-04-02
> see ?gDNAinRNAseqData and browseVignettes('gDNAinRNAseqData') for
> documentation
> loading from cache
> [...]
>
> length(bamfiles)
> [1] 18
>
> 3. If I examine the resources offline directly with 'ExperimentHub()' I
> see them duplicated with some IDs getting a '.1' suffix:
>
> library(ExperimentHub)
>
> eh <- ExperimentHub(localHub=TRUE)
> Using 'localHub=TRUE'
> If offline, please also see BiocManager vignette section on offline use
> snapshotDate(): 2024-04-02
> length(eh)
> [1] 18
> eh
> ExperimentHub with 18 records
> # snapshotDate(): 2024-04-02
> # $dataprovider: NGDC
> # $species: Homo sapiens
> # $rdataclass: BamFile
> # additional mcols(): taxonomyid, genome, description,
> #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
> #   rdatapath, sourceurl, sourcetype
> # retrieve records with, e.g., 'object[["EH8079"]]'
>
>            title
> EH8079   | RNA-seq data BAM file subset of HRR589632 contaminated with 0% gDNA
> EH8079.1 | RNA-seq data BAM file subset of HRR589632 contaminated with 0% gDNA
> EH8080   | RNA-seq data BAM file subset of HRR589633 contaminated with 0% gDNA
> EH8080.1 | RNA-seq data BAM file subset of HRR589633 contaminated with 0% gDNA
> EH8081   | RNA-seq data BAM file subset of HRR589634 contaminated with 0% gDNA
> ...        ...
> EH8085.1 | RNA-seq data BAM file subset of HRR589623 contaminated with 10% ...
> EH8086   | RNA-seq data BAM file subset of HRR589624 contaminated with 10% ...
> EH8086.1 | RNA-seq data BAM file subset of HRR589624 contaminated with 10% ...
> EH8087   | RNA-seq data BAM file subset of HRR589625 contaminated with 10% ...
> EH8087.1 | RNA-seq data BAM file subset of HRR589625 contaminated with 10% ...
>
> Does anybody have an idea what might be going on with
> 'ExperimentHub(localHub=TRUE)'?
>
> Thanks!
>
> robert.
>
> This email message may contain legally privileged and/or confidential
> information. If you are not the intended recipient(s), or the
> employee or agent responsible for the delivery of this message to the
> intended recipient(s), you are hereby notified that any disclosure,
> copying, distribution, or use of this email message is prohibited. If
> you have received this message in error, please notify the sender
> immediately by e-mail and delete this email message from your
> computer. Thank you.

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel