Re: [Bioc-devel] orthogene: killed while reading tree

2023-04-10 Thread Brian Schilder
Indeed, thanks!

___
Brian Schilder
PhD Candidate
UK Dementia Research Institute at Imperial College London
Faculty of Medicine, Department of Brain Sciences, Neurogenomics Lab
CV | https://bschilder.github.io/CV/CV<https://bschilder.github.io/CV/CV.html>
LinkedIn | 
linkedin.com/in/brian-schilder<https://www.linkedin.com/in/brian-schilder/>
Twitter | twitter.com/BMSchilder<http://www.twitter.com/BMSchilder>
Lab | neurogenomics.co.uk<http://neurogenomics.co.uk>
UK DRI | www.ukdri.ac.uk<http://www.ukdri.ac.uk/>

From: Kern, Lori 
Date: Monday, 10 April 2023 at 13:03
To: Brian Schilder , bioc-devel@r-project.org 

Subject: Re: orthogene: killed while reading tree
I'm not sure what caused it, but it looks like it was intermittent, as the build 
report is now showing all OK.


Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263

____
From: Bioc-devel  on behalf of Brian Schilder 

Sent: Wednesday, April 5, 2023 11:38 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] orthogene: killed while reading tree

Hi everyone,

After some pushes upstream to my package 
orthogene<https://bioconductor.org/packages/release/bioc/html/orthogene.html>, 
I noticed that the release version (3.16) is failing on the server because it 
was "Killed" midway through. But I can't see any details that explain why or 
how it was killed.
http://bioconductor.org/checkResults/release/bioc-LATEST/orthogene/nebbiolo2-checksrc.html

The only thing I know is that it occurs at the step where a large tree (147k 
species, stored as a 5.3 MB Newick file) is being read into R with ape::read.tree
https://github.com/neurogenomics/orthogene/blob/4e41e6322507a929562fc77cbdc602bc084a34bf/R/prepare_tree.R#L115

Oddly, this same code seems to work fine on devel (3.17):
http://bioconductor.org/packages/3.17/bioc/html/orthogene.html

Anyone know why this might be?

Thanks in advance,
Brian

[Bioc-devel] orthogene: killed while reading tree

2023-04-05 Thread Brian Schilder
Hi everyone,

After some pushes upstream to my package 
orthogene<https://bioconductor.org/packages/release/bioc/html/orthogene.html>, 
I noticed that the release version (3.16) is failing on the server because it 
was "Killed" midway through. But I can't see any details that explain why or 
how it was killed.
http://bioconductor.org/checkResults/release/bioc-LATEST/orthogene/nebbiolo2-checksrc.html

The only thing I know is that it occurs at the step where a large tree (147k 
species, stored as a 5.3 MB Newick file) is being read into R with ape::read.tree
https://github.com/neurogenomics/orthogene/blob/4e41e6322507a929562fc77cbdc602bc084a34bf/R/prepare_tree.R#L115
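
A bare "Killed" with no R error is often what a process looks like after the operating system stops it for using too much memory. That is only an assumption here, since the build report does not say. A minimal sketch for measuring how much memory the read step needs, using a tiny stand-in tree rather than the real file (peak_mem_mb is a hypothetical helper, not part of orthogene or ape):

```r
# Hypothetical helper: peak memory (in Mb) used by R while evaluating `expr`.
peak_mem_mb <- function(expr) {
  invisible(gc(reset = TRUE))             # reset the "max used" counters
  force(expr)                             # evaluate the lazy argument now
  g <- gc()
  i <- which(colnames(g) == "max used")   # the Mb column follows this one
  sum(g[, i + 1])                         # Ncells + Vcells, in Mb
}

# Toy stand-in for the real Newick file: a four-tip tree in a temp file.
newick <- tempfile(fileext = ".nwk")
writeLines("((A:1,B:1):1,(C:1,D:1):1);", newick)

if (requireNamespace("ape", quietly = TRUE)) {
  mb <- peak_mem_mb(tree <- ape::read.tree(newick))
  message("Tips: ", length(tree$tip.label), "; peak memory: ", mb, " Mb")
} else {
  # Fallback when ape is not installed: only measures reading the raw text.
  mb <- peak_mem_mb(txt <- readLines(newick))
  message("ape not installed; peak memory for readLines: ", mb, " Mb")
}
```

Pointing `newick` at the real tree file on a local machine would give a rough figure to compare against the memory available on the build server.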

Oddly, this same code seems to work fine on devel (3.17):
http://bioconductor.org/packages/3.17/bioc/html/orthogene.html

Anyone know why this might be?

Thanks in advance,
Brian

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Multiple projects for submission to Bioconductor

2023-01-03 Thread Brian Schilder
Hi Ali,

I'd just like to add that biocthis is great if you'd like to use a static 
workflow file. But if you're looking for something that is centrally maintained 
(i.e. an action), without having to update your workflow file each time 
there's an update or bug fix, you can use rworkflows:
https://github.com/neurogenomics/rworkflows/

Best,
Brian

From: Bioc-devel  on behalf of Ali Sajid 
Imami 
Date: Tuesday, 3 January 2023 at 04:07
To: Vincent Carey 
Cc: bioc-devel@r-project.org , Ali Sajid Imami 

Subject: Re: [Bioc-devel] Multiple projects for submission to Bioconductor
Hi,

Thank you for the detailed response. I'll make sure I dot all the I's and
cross all the T's. I'll also reach out to the author of biocthis to
cross-verify things.

Looking forward to the submission process and thank you and the
bioconductor team for this amazing service.

On Mon, Jan 2, 2023 at 8:12 PM Vincent Carey 
wrote:

>
>
> On Mon, Jan 2, 2023 at 7:51 PM Ali Sajid Imami 
> wrote:
>
>> Hi Colleagues,
>>
>> I am currently a PhD student in Bioinformatics at the University of
>> Toledo,
>> College of Medicine and Life Sciences. I am working in the cognitive
>> disorders research lab (https://cdrl-ut.org) and we have been actively
>> developing multiple packages. We are particularly interested in
>> building data and analysis packages for Kinase Activity analysis.
>>
>> We have a number of packages that we would like to submit to Bioconductor
>> for acceptance. I am currently in the process of streamlining the
>> packages,
>> making sure their dependencies are clearly defined and extracting large
>> datasets into their own data packages.
>>
>> I had a couple of questions regarding the submissions:
>>
>> 1. At our lab we rely heavily on Github Actions for continuous integration
>> and delivery. Are the actions outlined in the biocthis (
>> https://bioconductor.org/packages/release/bioc/html/biocthis.html)
>> package
>> up to date and something we can reliably build on?
>>
>
> Thank you for your note.
>
> I would say the answer here is "yes", but you could also check with the
> author of biocthis to see whether there are any concerns to be aware of.
>
>
>> 2. We have quite a few packages and there may be some packages that would
>> be better to go into the same release. Is it possible to submit them all
>> separately but have their review process be streamlined, collectively?
>>
>
> It is impossible to say without more information.  We have limited
> personnel
> for reviewing.  There is an approach for submitting related packages, see
> here
> 
> .
>
>> 3. Is Bioconductor's git infrastructure set up to update from our github
>> repo or is there a solution we can build to push our new "release" to the
>> bioconductor git infra after the integration tests in github actions are
>> completed?
>>
>
> I suspect you could accomplish update-git-on-push-to-github but perhaps
> further discussion is needed.  At this time our build system works with git
> repositories that are jointly managed by us and by the package maintainer.
> Maintainers are responsible for ensuring that the git repository is up to
> date.
>
> Sincerely,
> Vincent Carey
>
>
>> Thank you.
>>
>>
>> Regards,
>> Dr. Ali Sajid Imami


Re: [Bioc-devel] bioconductor package testing

2022-12-27 Thread Brian Schilder
Hi Adam, and to any other Bioc developers who have encountered this scenario,

I would humbly offer my new R package 
rworkflows<https://github.com/neurogenomics/rworkflows/> for this purpose.
I started this project for exactly the situation you’re describing: wanting to 
test your package with a fresh install on multiple OSes (Linux, Mac, Windows) 
before pushing the changes to Bioc. It sets up a GitHub 
Actions<https://github.com/features/actions> workflow that will do all Bioc 
(and/or CRAN) testing, render and launch a documentation website via GitHub 
Pages, and build a container hosted on DockerHub. You can set this all up using 
a single R function: rworkflows::use_workflow()

https://github.com/neurogenomics/rworkflows/

Accompanying preprint should be up soon. An early release is currently on CRAN, 
but I’d recommend using the devel version on Github atm as it has more workflow 
customisation features. Please let me know if you have any questions in the 
meantime 😊

Sincerely,
Brian
___
Brian Schilder
PhD Candidate
UK Dementia Research Institute at Imperial College London
Faculty of Medicine, Department of Brain Sciences, Neurogenomics Lab
Profile | bit.ly/imperial_profile<https://bit.ly/imperial_profile>
LinkedIn | 
linkedin.com/in/brian-schilder<https://www.linkedin.com/in/brian-schilder/>
Twitter | twitter.com/BMSchilder<http://www.twitter.com/BMSchilder>
Lab | neurogenomics.co.uk<http://neurogenomics.co.uk>
UK DRI | www.ukdri.ac.uk<http://www.ukdri.ac.uk/>

From: Bioc-devel  on behalf of Park, Adam 
Keebum 
Date: Sunday, 25 December 2022 at 05:45
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] bioconductor package testing
To whom it may concern,

I hope I am contacting the right person for my inquiry.

(1) I wonder how I can test a package in advance of submitting to bioconductor, 
 especially with respect to dependencies. Dozens of R libraries should be 
installed with my package.

That is, I would like to simulate BiocManager::install("my package") and check 
if a fresh new user can run tutorials and vignettes code without any problem.
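
One way to approximate what a new user would see is to install into an empty, throwaway library. A minimal sketch in base R, where with_fresh_library is a hypothetical helper and "myPackage" is a placeholder name. R's base and site-wide libraries remain visible, so this is not a fully clean machine:

```r
# Hypothetical helper: run `fun` with an empty temporary library first on
# the library search path, then restore the original library paths.
with_fresh_library <- function(fun) {
  lib <- tempfile("fresh-lib-")
  dir.create(lib)
  old <- .libPaths()
  on.exit(.libPaths(old), add = TRUE)
  .libPaths(lib)   # R still appends its site and base libraries
  fun(lib)
}

# Offline demonstration: confirm the temporary library comes first.
first_is_fresh <- with_fresh_library(function(lib) {
  .libPaths()[1] == normalizePath(lib, winslash = "/")
})
print(first_is_fresh)

# With network access, the same wrapper could install and then load a package.
# "myPackage" is a placeholder, and BiocManager itself must live in a library
# that is still visible inside the wrapper:
# with_fresh_library(function(lib) {
#   BiocManager::install("myPackage", lib = lib, ask = FALSE)
#   library("myPackage", character.only = TRUE)
# })
```

Because the user library is dropped from the search path, any dependency that is missing from the package's DESCRIPTION file should surface as an installation or load error.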

(2) Similarly, how could I simulate running vignette code in my local 
environment? As I understand it so far, code written in a vignette (.Rmd) will 
be executable after being converted to an HTML document and published on the 
Bioconductor website.

Sincerely,
Adam.



Re: [Bioc-devel] RELEASE_3_14 branch freeze at 1pm EDT today.

2022-04-11 Thread Brian Schilder
Hi Nitesh, 

I pushed a number of fixes to orthogene on Bioc 3.14 over the last couple of 
weeks, most recently last Friday (April 10th). But the VM report is saying 
my last commit was October 2021, which was v1.0.0, when it should be v1.0.2 now. 
As a consequence, orthogene is not passing checks. Do you know what might be 
going on here? I didn’t encounter any errors or warnings when I made the 
upstream pushes.

Best, 
Brian

> On 11 Apr 2022, at 17:00, Nitesh Turaga  wrote:
> 
> Dear Maintainers,
> 
> Please keep in mind that we are on schedule for our release as given in 
> http://bioconductor.org/developers/release-schedule/. 
> 
> Today (April 11th 2022) at 1pm EDT, I will freeze the commits to the 
> RELEASE_3_14 branch in Bioconductor. This is for all packages (software, 
> data-experiment and workflow).
> 
> After 1pm today, you will not be able to push to the RELEASE_3_14 branch. I 
> will send an email post freeze as well. 
> 
> Best regards,
> 
> Nitesh
> 
> 
> Nitesh Turaga
> Scientist II, Department of Data Science,
> Bioconductor Core Team Member
> Dana Farber Cancer Institute
> 


Re: [Bioc-devel] MAGMA executable

2021-12-13 Thread Brian Schilder
Thank you both for the helpful feedback. I’ll follow up with the developers of 
MAGMA for clarification on license.

Regarding installation, I agree Kasper, this is not an ideal solution. 
Installing MAGMA at the R package installation time would be ideal, but I’ve 
been unable to come up with a way to do this. 

I’ve been looking to Rsamtools <https://github.com/Bioconductor/Rsamtools> for 
inspiration, since it relies on multiple CLI tools (samtools, tabix). I’m 
unfamiliar with getting bash scripts to run while installing R packages, but 
I’m looking into this now.
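
One pattern that avoids writing to system locations is to look for the executable on the PATH first and otherwise in a directory reserved for the package's user data. A minimal sketch, assuming R >= 4.0 for tools::R_user_dir(). find_cli and the bin/ subdirectory are hypothetical, and a per-user data directory is a stand-in for, not the same as, installing into the package location at install time:

```r
# Hypothetical helper: return the path to a command-line tool, or NA if it
# cannot be found. Nothing is downloaded or written by this function.
find_cli <- function(name, pkg) {
  # 1. Prefer a copy the user already has on the PATH.
  on_path <- unname(Sys.which(name))
  if (!is.na(on_path) && nzchar(on_path)) {
    return(on_path)
  }
  # 2. Otherwise look in a per-user directory reserved for this package
  #    (the "bin" subdirectory is an assumption made for this sketch).
  candidate <- file.path(tools::R_user_dir(pkg, which = "data"), "bin", name)
  if (file.exists(candidate)) {
    return(candidate)
  }
  NA_character_
}

# A tool name that should not exist anywhere, to show the "not found" case.
missing_tool <- find_cli("no-such-tool-xyz", "MAGMA.Celltyping")
print(is.na(missing_tool))
```

A caller could then stop with an informative message when the result is NA, rather than attempting an installation when the package is loaded.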

Best, 
Brian

> On 13 Dec 2021, at 13:59, Kasper Daniel Hansen  
> wrote:
> 
> Ignoring the license issues (which may be significant), I strongly dislike 
> this installation strategy. It is (IMO) unreasonable that you potentially write 
> in system locations on package load. You're looking in 
>   /usr/local/bin
>   R.home/bin <- this makes no sense; this is the R home location, why should 
> anything else but R be here?
>   $HOME
>   working directory
> 
> In my opinion, if you want to do something like this, you need to do it at 
> installation time and you should install MAGMA in the package location.
> 
> 
> 
> On Thu, Dec 9, 2021 at 8:30 AM Vincent Carey  <mailto:st...@channing.harvard.edu>> wrote:
> I didn't find an obvious licensing statement at the MAGMA site.  I did see
> 
> (note that standard copyright applies; the MAGMA binaries and source code
> may not be distributed or modified)
> 
> the licensing situation would affect my advice on this process, but others
> may have other more
> specific advice
> 
> On Thu, Dec 9, 2021 at 7:37 AM Brian Schilder <
> brian_schil...@alumni.brown.edu <mailto:brian_schil...@alumni.brown.edu>> 
> wrote:
> 
> > Hi everyone,
> >
> > I’m a developer for the R package MAGMA.Celltyping
> > <https://github.com/neurogenomics/MAGMA_Celltyping/tree/bschilder_dev> (on
> > the bschilder_dev branch). It’s currently only distributed via GitHub but
> > I’m trying to get it on Bioc if possible. The dilemma is, it relies on
> > MAGMA <https://ctg.cncr.nl/software/magma>, which is only available as a
> > CLI program.
> >
> > I have everything passing CRAN/Bioc checks on my local machine, but the
> > final hurdle is installing MAGMA <https://ctg.cncr.nl/software/magma> on
> > other machines (e.g. via GitHub Actions checks) such that it can be called
> > from within R.
> >
> > Here are the steps I’ve taken:
> > 1. Upon .onLoad
> > <https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/zzz.R>
> > of MAGMA.Celltyping, magma_installed_version()
> > <https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/magma_installed_version.R>
> > checks whether MAGMA is installed. If not, it tries to install it via
> > magma_install()
> > <https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/magma_install.R>.
> > 2. magma_install() finds the latest version of MAGMA in their archives
> > <https://ctg.cncr.nl/software/MAGMA/prog/>, installs it wherever the user
> > has permissions (from a list of possible installation locations
> > <https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/find_install_dir.R>),
> > and sets up the symlink.
> > 3. It then checks that MAGMA is indeed installed and callable from within R
> > using functions like system("magma ….").
> >
> > This all seems to work fine locally, but when I launch to GitHub Actions,
> > MAGMA.Celltyping can’t seem to install or find MAGMA.
> >
> > Does anyone know of any solutions to this that are Bioc (or at least CRAN)
> > -compatible?
> >
> > Many thanks in advance,
> > Brian

[Bioc-devel] MAGMA executable

2021-12-09 Thread Brian Schilder
Hi everyone, 

I’m a developer for the R package MAGMA.Celltyping 
<https://github.com/neurogenomics/MAGMA_Celltyping/tree/bschilder_dev> (on the 
bschilder_dev branch). It’s currently only distributed via GitHub but I’m 
trying to get it on Bioc if possible. The dilemma is, it relies on MAGMA 
<https://ctg.cncr.nl/software/magma>, which is only available as a CLI program.

I have everything passing CRAN/Bioc checks on my local machine, but the final 
hurdle is installing MAGMA <https://ctg.cncr.nl/software/magma> on other 
machines (e.g. via GitHub Actions checks) such that it can be called from 
within R.

Here are the steps I’ve taken:
1. Upon .onLoad 
<https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/zzz.R> 
of MAGMA.Celltyping, magma_installed_version() 
<https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/magma_installed_version.R>
checks whether MAGMA is installed. If not, it tries to install it via 
magma_install() 
<https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/magma_install.R>.
2. magma_install() finds the latest version of MAGMA in their archives 
<https://ctg.cncr.nl/software/MAGMA/prog/>, installs it wherever the user has 
permissions (from a list of possible installation locations 
<https://github.com/neurogenomics/MAGMA_Celltyping/blob/bschilder_dev/R/find_install_dir.R>),
and sets up the symlink.
3. It then checks that MAGMA is indeed installed and callable from within R 
using functions like system("magma ….").

This all seems to work fine locally, but when I launch to GitHub Actions, 
MAGMA.Celltyping can’t seem to install or find MAGMA.

Does anyone know of any solutions to this that are Bioc (or at least CRAN) 
-compatible?

Many thanks in advance,
Brian


Re: [Bioc-devel] NCBI taxonomy annotation

2021-08-09 Thread Brian Schilder
Hi Levi, 

I recently put together a new package called orthogene 
<https://github.com/neurogenomics/orthogene> (currently under review by Bioc) 
that has a convenience function for flexibly mapping species identifiers to any 
ID type (including NCBI taxa IDs): map_species() 

It may not be as comprehensive as GenomeInfoDbData, but might still be useful. 
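
The lookup shown in the quoted session below can be reproduced on a toy table. A minimal sketch in base R: the rows are copied from the head(specData) output and the E. coli result in this thread, the empty strings for missing species are an assumption about how specData stores them, and lookup_taxid is a hypothetical helper, not a function from orthogene or GenomeInfoDbData:

```r
# Toy table shaped like specData (tax_id, genus, species).
spec <- data.frame(
  tax_id  = c(2L, 6L, 7L, 9L, 562L),
  genus   = c("Bacteria", "Azorhizobium", "Azorhizobium", "Buchnera",
              "Escherichia"),
  species = c("", "", "caulinodans", "aphidicola", "coli"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: taxonomy ID(s) matching a genus and species name.
lookup_taxid <- function(tbl, genus, species) {
  tbl$tax_id[tbl$genus == genus & tbl$species == species]
}

taxid <- lookup_taxid(spec, "Escherichia", "coli")
print(taxid)
```

A table with one column per taxonomic rank, as proposed in the quoted message, would let the same kind of filter work for ranks above genus.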

Best, 
Brian


> On 8 Aug 2021, at 19:10, Levi Waldron  wrote:
> 
> Does anyone else do mapping between NCBI taxids, names, and ranks? We do
> this in curatedMetagenomicData and soon other packages, currently using
> external files that lack provenance and versioning, so Ludwig Geistlinger
> was looking for Bioconductor annotation resources. The closest he found was
> in GenomeInfoDbData <https://bioconductor.org/packages/GenomeInfoDbData> but
> this has only genus and species, and some quirks like Bacteria being listed
> as a genus:
> 
>> library(GenomeInfoDbData)
>> data(specData)
>> head(specData)
>   tax_id        genus     species
> 1      1          all
> 2      1         root
> 3      2     Bacteria
> 4      6 Azorhizobium
> 5      7 Azorhizobium caulinodans
> 6      9     Buchnera  aphidicola
>> dim(specData)
> [1] 2521271       3
>> subset(specData, c(genus == "Escherichia" & species == "coli"))$tax_id
> [1] 562
> 
> Any thoughts from the GenomeInfoDbData maintainer ("Bioconductor Maintainer
> ") about a pull request either to a) update
> specData to add additional columns for all taxonomic levels, or b) creating
> a new object? Or, another approach altogether? See
> https://github.com/waldronlab/curatedMetagenomicData/issues/245.
> 
> --
> 
> Levi Waldron
> 
> Associate Professor
> 
> Department of Epidemiology and Biostatistics
> 
> CUNY Graduate School of Public Health and Health Policy
> 
> Institute for Implementation Science in Population Health
> 
> 55 W 125th St, New York NY 10035
> 
> https://waldronlab.io
> 
> Join the microbiome Virtual International Forum: https://microbiome-vif.org
> 