Re: [R-pkg-devel] handling documentation build tools

2024-05-21 Thread Vladimir Dergachev




On Tue, 21 May 2024, Boylan, Ross via R-package-devel wrote:


Thanks for the pointer.  You may have been thrown off by some goofs I made in 
the intro, which said

> I would like to build the automatically, with requiring either users or 
> repositories to have the tools.


The intended meaning, with corrections in **, was

> I would like to build the  *custom documentation* automatically, with*out* 
> requiring either users or repositories to have the tools.


So I want to build the document only locally, as you suggest, but am not sure 
how to accomplish that.


I usually just create a Makefile.

It can be something like this:


all: documentation.pdf

documentation.pdf: documentation.lyx
	lyx --export pdf4 documentation.lyx

Then, every time before you run R CMD build, you must run make in the directory 
with the Makefile.


best

Vladimir Dergachev



Regarding the trick, I'm puzzled by what it gains.  It seems like a complicated 
way to get the core pdf copied to inst/doc.

Also, my main concern was how to automate production of the "core" pdf, using 
the language of the blog post.

Ross



From: Dirk Eddelbuettel 
Sent: Tuesday, May 21, 2024 2:15 PM
To: Boylan, Ross
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] handling documentation build tools



As lyx is not listed in 'Writing R Extensions', the one (authoritative) manual
describing how to build packages for R, I would not assume it to be present
on every CRAN machine building packages. Also note that several users recently
had to ask here how to deal with less common fonts or style files for
(pdf)latex.

So I would recommend 'localising' the pdf creation to your own machine, and
to ship the resulting pdf. You can have pre-made pdfs as the core of a vignette,
a trick I quite like to make package building simpler and more robust.  See
https://www.r-bloggers.com/2019/01/add-a-static-pdf-vignette-to-an-r-package/
for details.

Cheers, Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





Re: [Bioc-devel] Question relating to extending a class and inclusion of data

2024-05-21 Thread Hervé Pagès
Hi,

On 5/21/24 01:58, Vilhelm Suksi wrote:
> Hi!
>
> Excuse the long email, but there are a number of things to be clarified in 
> preparation for submitting the notame package which I have been developing to 
> meet Bioconductor guidelines. As of now it passes almost all of the automatic 
> checks, with the exception of formatting and some functions that are over 50 
> lines long.
>
> Background 1:
> The notame package already has a significant following, and was published in 
> 2020 with an associated protocol article published in the "Metabolomics Data 
> Processing and Data Analysis—Current Best Practices" special issue of the 
> Metabolites journal (https://www.mdpi.com/2218-1989/10/4/135). The original 
> package relies on the MetaboSet container class, which extends ExpressionSet 
> with three slots, namely group_col, time_col and subject_col. These slots are 
> used to store the names of the corresponding sample data columns, and are 
> used as default arguments to most functions. This makes for a more 
> streamlined experience. However, the submission guidelines state that 
> existing classes should be preferred, such as SummarizedExperiment. We will 
> be implementing support for SummarizedExperiment over the summer. We have 
> included a MetaboSet - SummarizedExperiment converter for interoperability.
>
> Q1: Can an initial Bioconductor submission rely on the MetaboSet container 
> class? Support for MetaboSet would do well to be included anyway for 
> existing users until it is phased out.
Since you already have a user base, you will need a roadmap for the 
transition from MetaboSet to MetaboExperiment. Bioconductor has a 
6-month release cycle that facilitates this. More on this below.
> Q2: Is it ok to extend the SummarizedExperiment class to utilize the three 
> aforementioned slots? It could be called MetaboExperiment. Or should the 
> functions be modified such that said columns are specified explicitly, using 
> SummarizedExperiment?

It's better to define your own SummarizedExperiment extension with the 
three additional slots. This way you will have a container 
(MetaboExperiment) that is semantically equivalent (or close) to 
MetaboSet, which means that: (1) in principle you won't need to modify 
the interface of your existing functions, and (2) you'll be able to 
provide coercion methods to go back and forth between the 
MetaboExperiment and MetaboSet representations (see ?setAs). Overall 
this should make the transition from MetaboSet to MetaboExperiment 
easier/smoother.
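
A minimal sketch of what such an extension and coercion could look like 
(hedged: the slot names follow your description, and the MetaboSet 
accessors group_col(), time_col() and subject_col() are assumptions):

library(methods)
library(SummarizedExperiment)

## Hypothetical MetaboExperiment: a SummarizedExperiment that carries
## the three column-name slots over from MetaboSet.
setClass("MetaboExperiment",
    contains = "SummarizedExperiment",
    slots = c(group_col = "character",
              time_col = "character",
              subject_col = "character"))

## Coercion from MetaboSet, reusing the ExpressionSet ->
## SummarizedExperiment coercion provided by the SummarizedExperiment
## package. group_col() etc. are assumed MetaboSet accessors.
setAs("MetaboSet", "MetaboExperiment", function(from) {
    se <- as(from, "SummarizedExperiment")
    new("MetaboExperiment", se,
        group_col = group_col(from),
        time_col = time_col(from),
        subject_col = subject_col(from))
})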

This transition would roughly look something like this:

1. Submit the MetaboSet-based version of the package for inclusion in 
BioC 3.20.

2. After the 3.20 release (next Fall), make the following changes in the 
devel branch of the package:

- Implement the MetaboExperiment class + accessors (getters/setters) + 
constructor function(s) + show() method.

- Implement the coercion methods to go from MetaboSet to 
MetaboExperiment and vice-versa.

- Modify the implementation of all the functions that deal with 
MetaboSet objects to deal with MetaboExperiment objects. This will be 
the primary representation that they handle. If they receive a 
MetaboSet, they will immediately replace it with a MetaboExperiment 
using as(..., "MetaboExperiment").

- Modify all the documentation, unit tests, and serialized objects 
accordingly.

3. Now you are ready to deprecate the MetaboSet class. I recommend that 
you also do this in the devel branch, before the 3.21 release. There are 
no well-established guidelines for deprecating an S4 class. I recommend 
that you use .Deprecated() to display a deprecation message in its 
show() method, constructor function(s), getters/setters, and coercion 
method from MetaboExperiment to MetaboSet.

4. After the 3.21 release (Spring 2025), make the MetaboSet class 
defunct by replacing all the .Deprecated() calls with .Defunct() calls.
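
For step 3, the deprecation hook could look roughly like this (a sketch 
only; the message text is illustrative):

setMethod("show", "MetaboSet", function(object) {
    .Deprecated(msg = "MetaboSet is deprecated; use as(x, 'MetaboExperiment').")
    callNextMethod()  # fall back to the inherited ExpressionSet show()
})

The same .Deprecated() call goes at the top of the constructor(s) and 
getters/setters, and each call becomes .Defunct() in step 4.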

> Background 2:
> The notame package caters to untargeted LC-MS data analysis metabolic 
> profiling experiments, encompassing data pretreatment (quality control, 
> normalization, imputation and other steps leading up to feature selection) 
> and feature selection (univariate analysis and supervised learning). Raw data 
> preprocessing is not supported. Instead, the package offers utilities for 
> flexibly reading peak tables from an Excel file, resulting from various 
> point-and-click software such as MS-DIAL. As such, data in Excel format needs 
> to be included, but is not available in any Bioconductor package, although 
> such Excel data could be procured from existing data in Bioconductor. 
> However, existing untargeted LC-MS data in Bioconductor can not be used, as 
> is, to demonstrate the full functionality of the notame package. With regard 
> to feature data, there needs to be several analytical modes. Sample data 
> needs to include study group, time point, subject ID and several batches. 
> Blank samples would be good as well. Packages I have 

Re: [R-pkg-devel] handling documentation build tools

2024-05-21 Thread Simon Urbanek
Ross,

It's entirely up to you how you do this -- what you describe is something that 
has to happen before you run R CMD build, so it is not reflected in the 
package that you submit (and this has nothing to do with CRAN or R). There is 
nothing automatic here, as it requires you to do something in any case. Which 
tools you use before you call R CMD build comes down to personal preference. 
You have already described all that is needed: the simplest solution is to 
have either a Makefile or a script in your repository that does whatever you 
need (a Makefile gives you the automatic update for free), and you put those 
in .Rbuildignore so they will not be part of the package source you submit as 
the output of R CMD build. It is customary to keep the sources (here your 
.lyx) in the distributed package; "tools" is one place you could do it safely, 
but it is not well-defined.
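
For example, the repository-only files could be excluded with 
.Rbuildignore entries like these (Perl regexps matched against file 
paths; the file names are placeholders):

^Makefile$
^tools/build-docs\.sh$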

There is a slight variation on the above: you can (ab)use the "cleanup" script 
so that the process is actually run by R CMD build itself. E.g., the cleanup 
script could simply check if your Makefile is present (which it would be only in 
the repo, but not in the tarball) and simply run make in that case.
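
A minimal sketch of that trick (cleanup is an executable Bourne shell 
script in the package top level, and the Makefile is assumed to be the 
one excluded via .Rbuildignore above):

#!/bin/sh
## Regenerate the pdf only in the repo checkout: the Makefile is not in
## the built tarball, so there this is a no-op.
if [ -f Makefile ]; then
    make
fi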

That said, as Dirk mentioned, since your output is documentation, the 
semantically correct way would be to treat it as a vignette, which is 
well-documented and specifically provides a way to do what you described, and 
makes it clear that it is documentation and not just some random script.

Cheers,
Simon


> On 22/05/2024, at 9:01 AM, Boylan, Ross via R-package-devel 
>  wrote:
> 
> I have some documentation that requires external tools to build.  I would 
> like to build the automatically, with requiring either users or repositories 
> to have the tools.  What's the best way to accomplish that.
> 
> Specifically one document is written using the LyX word processor, so the 
> "source" is msep.lyx.  To convert that into something useful, a pdf, one must 
> run lyx, which in turn requires latex.
> 
> I'm looking for recommendations about how to approach this.
> 
> A purely manual approach would be to place msep.lyx in .Rbuildignore and 
> manually regenerate the pdf if I edit the file.
> 
> This has 2 drawbacks: first, it does not guarantee that the pdf is consistent 
> with the current source; second, if I want to generate plain text or html 
> versions of the document as well, the manual approach gets more tedious and 
> error prone.
> 
> Currently I use pkgbuild::build's option to run bootstrap.R to run the 
> lyx->pdf conversion automatically with each build.  The script is pretty 
> specific to my build environment, and I think if I uploaded the package CRAN 
> would end up trying to run bootstrap.R, which would fail.
> 
> Maybe the script should go in tools/? But then it won't run automatically.  
> Maybe the script in tools goes in .Rbuildignore and the bootstrap.R script 
> simply checks if the tools/ script exists and runs it if present.
> 
> Suggestions?
> 
> I'm also unsure what directory msep.lyx should go in, though that's a 
> secondary issue.  Currently it's in inst/doc, which led to problems with the 
> build system sometimes wiping it out.  I've solved that problem.
> 
> Thanks.
> Ross
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> 

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] handling documentation build tools

2024-05-21 Thread Boylan, Ross via R-package-devel
Thanks for the pointer.  You may have been thrown off by some goofs I made in 
the intro, which said
> I would like to build the automatically, with requiring either users or 
> repositories to have the tools.  

The intended meaning, with corrections in **, was
> I would like to build the  *custom documentation* automatically, with*out* 
> requiring either users or repositories to have the tools.

So I want to build the document only locally, as you suggest, but am not sure 
how to accomplish that.

Regarding the trick, I'm puzzled by what it gains.  It seems like a complicated 
way to get the core pdf copied to inst/doc.

Also, my main concern was how to automate production of the "core" pdf, using 
the language of the blog post.

Ross



From: Dirk Eddelbuettel 
Sent: Tuesday, May 21, 2024 2:15 PM
To: Boylan, Ross
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] handling documentation build tools



As lyx is not listed in 'Writing R Extensions', the one (authoritative) manual
describing how to build packages for R, I would not assume it to be present
on every CRAN machine building packages. Also note that several users recently
had to ask here how to deal with less common fonts or style files for
(pdf)latex.

So I would recommend 'localising' the pdf creation to your own machine, and
to ship the resulting pdf. You can have pre-made pdfs as the core of a vignette,
a trick I quite like to make package building simpler and more robust.  See
https://www.r-bloggers.com/2019/01/add-a-static-pdf-vignette-to-an-r-package/
for details.

Cheers, Dirk

--
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] handling documentation build tools

2024-05-21 Thread Dirk Eddelbuettel


As lyx is not listed in 'Writing R Extensions', the one (authoritative) manual
describing how to build packages for R, I would not assume it to be present
on every CRAN machine building packages. Also note that several users recently
had to ask here how to deal with less common fonts or style files for
(pdf)latex.

So I would recommend 'localising' the pdf creation to your own machine, and
to ship the resulting pdf. You can have pre-made pdfs as the core of a vignette,
a trick I quite like to make package building simpler and more robust.  See
https://www.r-bloggers.com/2019/01/add-a-static-pdf-vignette-to-an-r-package/
for details.

Cheers, Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Compile issues on r-devel-linux-x86_64-debian-clang with OpenMP

2024-05-21 Thread Dirk Eddelbuettel

Hi Michelle,

On 21 May 2024 at 13:46, Nixon, Michelle Pistner wrote:
| Hi all,
| 
| I'm running into build issues for my package (fido: 
https://github.com/jsilve24/fido) on the r-devel-linux-x86_64-debian-clang 
system on CRAN (full check log here: 
https://win-builder.r-project.org/incoming_pretest/fido_1.1.0_20240515_211644/Debian/00install.out).
 fido relies on several of the Rcpp packages, and I think the error is due to 
how OpenMP is set up in our package. The error in question states:
| 
| "Error: package or namespace load failed for �fido� in dyn.load(file, DLLpath 
= DLLpath, ...):
|  unable to load shared object 
'/home/hornik/tmp/R.check/r-devel-clang/Work/build/Packages/00LOCK-fido/00new/fido/libs/fido.so':
|   
/home/hornik/tmp/R.check/r-devel-clang/Work/build/Packages/00LOCK-fido/00new/fido/libs/fido.so:
 undefined symbol: omp_get_thread_num"
| 
| I've had a hard time recreating the error, as I can successfully get the 
package to build on other systems (GitHub action results here: 
https://github.com/jsilve24/fido/actions) including a system using the same 
version of R/clang as the failing CRAN check. Looking at the logs between the 
two, the major difference is the lack of -fopenmp in the compiling function on 
the CRAN version (which is there on the r-hub check version with the same 
specifications):
| 
| (From the CRAN version) clang++-18 -std=gnu++17 -shared 
-L/home/hornik/tmp/R-d-clang-18/lib -Wl,-O1 -o fido.so ConjugateLinearModel.o 
MaltipooCollapsed_LGH.o MaltipooCollapsed_Optim.o MatrixAlgebra.o 
PibbleCollapsed_LGH.o PibbleCollapsed_Optim.o PibbleCollapsed_Uncollapse.o 
PibbleCollapsed_Uncollapse_sigmaKnown.o RcppExports.o SpecialFunctions.o 
test_LaplaceApproximation.o test_MultDirichletBoot.o test_utils.o 
-L/home/hornik/tmp/R-d-clang-18/lib -lR
| 
| My initial thought was an issue in the configure scripts (which we borrowed 
heavily from RcppArmadillo but made slight changes to, which is the most likely 
cause if there is an issue here), or that there is some mismatch somewhere as to 
whether or not OpenMP is available, but there isn't an obvious bug to me.
| 
| Any guidance on how to debug would be greatly appreciated!

I seem to recall that that machine is 'known-bad' for OpenMP due to the
reliance on clang-18 which cannot (?) build with it.  Might be best to
contact Kurt Hornik (CC'ed) and/or CRAN.

Best, Dirk

 
| Thanks,
| Michelle
| 
| Michelle Nixon, PhD
| 
| Assistant Research Professor
| College of Information Sciences and Technology
| The Pennsylvania State University
| 
|   [[alternative HTML version deleted]]
| 
| __
| R-package-devel@r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-package-devel

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] handling documentation build tools

2024-05-21 Thread Boylan, Ross via R-package-devel
I have some documentation that requires external tools to build.  I would like 
to build the automatically, with requiring either users or repositories to have 
the tools.  What's the best way to accomplish that.

Specifically one document is written using the LyX word processor, so the 
"source" is msep.lyx.  To convert that into something useful, a pdf, one must 
run lyx, which in turn requires latex.

I'm looking for recommendations about how to approach this.

A purely manual approach would be to place msep.lyx in .Rbuildignore and 
manually regenerate the pdf if I edit the file.

This has 2 drawbacks: first, it does not guarantee that the pdf is consistent 
with the current source; second, if I want to generate plain text or html 
versions of the document as well, the manual approach gets more tedious and 
error prone.

Currently I use pkgbuild::build's option to run bootstrap.R to run the lyx->pdf 
conversion automatically with each build.  The script is pretty specific to my 
build environment, and I think if I uploaded the package CRAN would end up 
trying to run bootstrap.R, which would fail.

Maybe the script should go in tools/? But then it won't run automatically.  
Maybe the script in tools goes in .Rbuildignore and the bootstrap.R script 
simply checks if the tools/ script exists and runs it if present.

Suggestions?

I'm also unsure what directory msep.lyx should go in, though that's a secondary 
issue.  Currently it's in inst/doc, which led to problems with the build system 
sometimes wiping it out.  I've solved that problem.

Thanks.
Ross

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[Bioc-devel] Remote BigWig file access

2024-05-21 Thread Leonardo Collado Torres
Hi Bioc-devel,

As some of you are aware, rtracklayer::import() has long provided
access to import BigWig files. Those files can be shared on servers
and accessed remotely thanks to all the effort from many of you in
building and maintaining rtracklayer.

From my side, derfinder::loadCoverage() relies on
rtracklayer::import.bw(), and recount::expressed_regions() +
recount::coverage_matrix() use derfinder::loadCoverage().
recountWorkflow showcases those recount functions on larger datasets.
brainflowprobes by Amanda Price, Nina Rajpurohit and others also ends
up relying on rtracklayer::import.bw() through these functions.

At https://github.com/lawremi/rtracklayer/issues/83 I initially
reported some issues once our recount2/3 data host changed, but
previously Brian Schilder also reported that one could no longer read
remote files https://github.com/lawremi/rtracklayer/issues/73.
https://github.com/lawremi/rtracklayer/issues/63 and/or
https://github.com/lawremi/rtracklayer/issues/65 might have been
related.

Yesterday I updated
https://github.com/lawremi/rtracklayer/issues/83#issuecomment-2121313270
with a comment showing some small reproducible code, and that the
workaround of downloading the data first, then using
rtracklayer::import() on the local data does work. However, this
workaround does involve a lot of, hmm, wasteful data transfer.

On the recount vignette at some point I access just chrY of a bigWig
file that is about 1300 MB. On the recountWorkflow vignette I do
something similar for a 7GB bigWig file. Previously accessing just
chrY on these files was a small data transfer.
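
In code, the download-then-import workaround looks roughly like this (a 
sketch: the URL is a placeholder, and the chrY length is the hg38 one):

library(rtracklayer)

bw_url <- "https://example.org/sample.bw"      # hypothetical remote BigWig
local_bw <- file.path(tempdir(), "sample.bw")
download.file(bw_url, local_bw, mode = "wb")   # transfers the whole file

## Import only chrY, as the RleList representation derfinder expects.
cov <- import.bw(local_bw,
                 which = GRanges("chrY", IRanges(1, 57227415)),
                 as = "RleList")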

On recountWorkflow version 1.29.2
https://github.com/LieberInstitute/recountWorkflow, I've included
pre-computed results (~2 MB) to avoid downloading tons of data, though
the vignette code shows how to actually fully reproduce the results if
you don't mind downloading those large files. I also implemented some
workarounds on recount, though I haven't yet gone the full route of
including pre-computed results. I have yet to try implementing a
workaround for brainflowprobes.



My understanding is that rtracklayer's root issues lie elsewhere, and that
changes in rtracklayer's dependencies have likely created these problems.
These problems are not always in the control of the rtracklayer authors to
resolve, and they also create an unexpected burden on them.

If one considers alternatives to rtracklayer, I see that there's a new
package https://github.com/PoisonAlien/trackplot/tree/master that uses
bwtool (a system dependency), an older alternative 
https://github.com/andrelmartins/bigWig that hasn't had updates in 4
years, and a CRAN package
(https://cran.r-project.org/web/packages/wig/readme/README.html) that
recommends using rtracklayer for larger files. I guess that I could
also try using megadepth https://research.libd.org/megadepth/, though
derfinder::loadCoverage uses rtracklayer::import(as = "RleList") for
efficiency 
https://github.com/lcolladotor/derfinder/blob/f9cd986e0c1b9ea6551d0d8d2077d4501216a661/R/loadCoverage.R#L401
and lots of functions in that package were built for that structure
(RleList objects). I likely missed other alternatives.


My current line of thought is to keep implementing workarounds using
local data (sometimes with pre-computed results) for recount,
recountWorkflow, and brainflowprobes (derfinder only has tests with
local bigWig files) without really altering the internals of those
packages. That is, assume that the remote BigWig file access via
rtracklayer will indefinitely be suspended, though it could be
supported again at some point and when it does, those packages will
work again with remote BigWig files as if nothing ever happened. But I
wanted to check whether this is what others who use BigWig files are 
thinking of doing.

Thanks!

Best,
Leo


Leonardo Collado Torres, Ph. D.
Investigator, LIEBER INSTITUTE for BRAIN DEVELOPMENT
Assistant Professor, Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
855 N. Wolfe St., Room 382
Baltimore, MD 21205
lcolladotor.github.io
lcollado...@gmail.com

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[R-pkg-devel] Compile issues on r-devel-linux-x86_64-debian-clang with OpenMP

2024-05-21 Thread Nixon, Michelle Pistner
Hi all,

I'm running into build issues for my package (fido: 
https://github.com/jsilve24/fido) on the r-devel-linux-x86_64-debian-clang 
system on CRAN (full check log here: 
https://win-builder.r-project.org/incoming_pretest/fido_1.1.0_20240515_211644/Debian/00install.out).
 fido relies on several of the Rcpp packages, and I think the error is due to 
how OpenMP is set up in our package. The error in question states:

"Error: package or namespace load failed for �fido� in dyn.load(file, DLLpath = 
DLLpath, ...):
 unable to load shared object 
'/home/hornik/tmp/R.check/r-devel-clang/Work/build/Packages/00LOCK-fido/00new/fido/libs/fido.so':
  
/home/hornik/tmp/R.check/r-devel-clang/Work/build/Packages/00LOCK-fido/00new/fido/libs/fido.so:
 undefined symbol: omp_get_thread_num"

I've had a hard time recreating the error, as I can successfully get the 
package to build on other systems (GitHub action results here: 
https://github.com/jsilve24/fido/actions) including a system using the same 
version of R/clang as the failing CRAN check. Looking at the logs between the 
two, the major difference is the lack of -fopenmp in the compiling function on 
the CRAN version (which is there on the r-hub check version with the same 
specifications):

(From the CRAN version) clang++-18 -std=gnu++17 -shared 
-L/home/hornik/tmp/R-d-clang-18/lib -Wl,-O1 -o fido.so ConjugateLinearModel.o 
MaltipooCollapsed_LGH.o MaltipooCollapsed_Optim.o MatrixAlgebra.o 
PibbleCollapsed_LGH.o PibbleCollapsed_Optim.o PibbleCollapsed_Uncollapse.o 
PibbleCollapsed_Uncollapse_sigmaKnown.o RcppExports.o SpecialFunctions.o 
test_LaplaceApproximation.o test_MultDirichletBoot.o test_utils.o 
-L/home/hornik/tmp/R-d-clang-18/lib -lR

My initial thought was an issue in the configure scripts (which we borrowed 
heavily from RcppArmadillo but made slight changes to, which is the most likely 
cause if there is an issue here), or that there is some mismatch somewhere as to 
whether or not OpenMP is available, but there isn't an obvious bug to me.
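
For reference, the standard Writing R Extensions idiom for propagating 
OpenMP flags to both the compile and the link step via src/Makevars is 
(the generic pattern, not necessarily fido's actual Makevars):

# src/Makevars
PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS)

An undefined omp_get_thread_num at load time is the symptom one would 
expect when the object files reference OpenMP but that flag ends up 
missing from the link line, as in the CRAN log above.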

Any guidance on how to debug would be greatly appreciated!

Thanks,
Michelle

Michelle Nixon, PhD

Assistant Research Professor
College of Information Sciences and Technology
The Pennsylvania State University

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

2024-05-21 Thread Liu, Haibo
If you have many sequences to download, it can be very slow if you use 
Uniprot.ws to query the database. I used a workaround to download the 
proteome of a species using FTP in my dagLogo package: 
https://github.com/jianhong/dagLogo/blob/devel/R/prepareProteomeByFTP.R.

Haibo

-Original Message-
From: aditya.bhag...@uni-marburg.de 
Sent: Tuesday, May 21, 2024 9:20 AM
To: Liu, Haibo 
Cc: Vincent Carey ; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

Thank you Haibo and Vincent : )

Yes, UniProt.ws is certainly there and did serve me well earlier.
There are times when an offline solution is needed, when operations are done on 
larger sets.
Followed my habit to check first whether a relevant AnnotationHub record exists.
Given the absence of that, I downloaded the relevant fastafile manually from 
UniProt.
Read it in with Biostrings and then took it from there.
Which is quite hassle-free, so yes, I can see that there is no need and not 
much value to pull it into AnnotationHub.
Thankyou for sharing your experience : )

Cheers,

Aditya




> We use Uniprot.ws to access the Uniprot data on demand:
> https://bioconductor.org/packages/release/bioc/html/UniProt.ws.html.
>
> Haibo
>
> -Original Message-
> From: Bioc-devel  On Behalf Of
> Vincent Carey
> Sent: Tuesday, May 21, 2024 7:02 AM
> To: aditya.bhag...@uni-marburg.de
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] AnnotationHub: uniprot seqs ?
>
> On Tue, May 21, 2024 at 4:18 AM Aditya Bhagwat via Bioc-devel <
> bioc-devel@r-project.org> wrote:
>
>> Hey guys,
>>
>> Do we have Uniprot sequences in AnnotationHub ?
>>
>
> That does not seem practical.
>
> Please see
> https://bioconductor.org/packages/release/bioc/vignettes/UniProt.ws/inst/doc/UniProt.ws.html
>
> Let us know if that does not meet your need.
>
>
>> Not being able to find them.
>>
>
>
>
>>
>> Thankyouverymuch : )
>>
>> Aditya
>>
>> --
>> Aditya Bhagwat
>> Translational Proteomics ∙ Philipps-University Marburg
>>
>> ___
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
> --
> The information in this email is intended only for the p...{{dropped:15}}
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Aditya Bhagwat
Translational Proteomics ∙ Philipps-University Marburg
Biological Pharmacological Center ∙ Room A406
Tel.: +49 6421 28 27403

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

2024-05-21 Thread Aditya Bhagwat via Bioc-devel

Thank you Haibo and Vincent : )

Yes, UniProt.ws is certainly there and did serve me well earlier.
There are times when an offline solution is needed, when operations  
are done on larger sets.
Followed my habit to check first whether a relevant AnnotationHub  
record exists.
Given the absence of that, I downloaded the relevant fastafile  
manually from UniProt.

Read it in with Biostrings and then took it from there.
Which is quite hassle-free, so yes, I can see that there is no need  
and not much value to pull it into AnnotationHub.
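
In code, that route is simply (a sketch; the file name is a placeholder 
for whatever was downloaded from UniProt):

library(Biostrings)
## Read the manually downloaded UniProt FASTA (gzipped works too).
aa <- readAAStringSet("uniprot_proteome.fasta.gz")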

Thankyou for sharing your experience : )

Cheers,

Aditya




We use Uniprot.ws to access the Uniprot data on demand:  
https://bioconductor.org/packages/release/bioc/html/UniProt.ws.html.


Haibo

-Original Message-
From: Bioc-devel  On Behalf Of  
Vincent Carey

Sent: Tuesday, May 21, 2024 7:02 AM
To: aditya.bhag...@uni-marburg.de
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

On Tue, May 21, 2024 at 4:18 AM Aditya Bhagwat via Bioc-devel <  
bioc-devel@r-project.org> wrote:



Hey guys,

Do we have Uniprot sequences in AnnotationHub ?



That does not seem practical.

Please see
https://bioconductor.org/packages/release/bioc/vignettes/UniProt.ws/inst/doc/UniProt.ws.html

Let us know if that does not meet your need.



Not being able to find them.







Thankyouverymuch : )

Aditya

--
Aditya Bhagwat
Translational Proteomics ∙ Philipps-University Marburg

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
The information in this email is intended only for the...{{dropped:14}}


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

2024-05-21 Thread Liu, Haibo
We use Uniprot.ws to access the Uniprot data on demand: 
https://bioconductor.org/packages/release/bioc/html/UniProt.ws.html.

Haibo

-Original Message-
From: Bioc-devel  On Behalf Of Vincent Carey
Sent: Tuesday, May 21, 2024 7:02 AM
To: aditya.bhag...@uni-marburg.de
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

On Tue, May 21, 2024 at 4:18 AM Aditya Bhagwat via Bioc-devel < 
bioc-devel@r-project.org> wrote:

> Hey guys,
>
> Do we have Uniprot sequences in AnnotationHub ?
>

That does not seem practical.

Please see
https://bioconductor.org/packages/release/bioc/vignettes/UniProt.ws/inst/doc/UniProt.ws.html

Let us know if that does not meet your need.


> Not being able to find them.
>



>
> Thankyouverymuch : )
>
> Aditya
>
> --
> Aditya Bhagwat
> Translational Proteomics ∙ Philipps-University Marburg Biological
> Pharmacological Center ∙ Room A406
> Tel.: +49 6421 28 27403
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

--
The information in this email is intended only for the p...{{dropped:15}}

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] AnnotationHub: uniprot seqs ?

2024-05-21 Thread Vincent Carey
On Tue, May 21, 2024 at 4:18 AM Aditya Bhagwat via Bioc-devel <
bioc-devel@r-project.org> wrote:

> Hey guys,
>
> Do we have Uniprot sequences in AnnotationHub ?
>

That does not seem practical.

Please see
https://bioconductor.org/packages/release/bioc/vignettes/UniProt.ws/inst/doc/UniProt.ws.html

Let us know if that does not meet your need.


> Not being able to find them.
>



>
> Thankyouverymuch : )
>
> Aditya
>
> --
> Aditya Bhagwat
> Translational Proteomics ∙ Philipps-University Marburg
> Biological Pharmacological Center ∙ Room A406
> Tel.: +49 6421 28 27403
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
The information in this email is intended only for the p...{{dropped:15}}

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] confint Attempts to Use All Server CPUs by Default

2024-05-21 Thread Ivan Krylov via R-devel
On Tue, 21 May 2024 08:00:11 +,
Dario Strbenac via R-devel  wrote:

> Would a less resource-intensive value, such as 1, be a safer default
> CPU value for confint?

Which confint() method do you have in mind? There are at least four of
them by default in R, and many additional classes could make use of
stats:::confint.default by implementing vcov().

> Also, there is no mention of such parallel processing in ?confint, so
> it was not clear at first where to look for performance degradation.
> It could at least be described in the manual page so that users would
> know that export OPENBLAS_NUM_THREADS=1 is a solution.

There isn't much R can do about the behaviour of the BLAS, because
there is no standard interface to set the number of threads. Some BLASes
(like ATLAS) don't even offer it as a tunable number at all [*].

A system administrator could link the installation of R against
FlexiBLAS [**], provide safe defaults in the environment variables, and
educate the users about its tunables [***], but that's a choice, just
like it had been a choice to link R against a parallel variant of
OpenBLAS on a shared computer. This is described in R Installation and
Administration, section A.3.1 [****].
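
With FlexiBLAS, the thread count can also be adjusted from within R; a
sketch, assuming R is linked against FlexiBLAS and the CRAN package
'flexiblas' [***] is installed:

library(flexiblas)
flexiblas_get_num_threads()    # inspect the current BLAS thread count
flexiblas_set_num_threads(1)   # pin BLAS to one thread for this session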

-- 
Best regards,
Ivan

[*]
https://math-atlas.sourceforge.net/faq.html#tnum

[**]
https://www.mpi-magdeburg.mpg.de/projects/flexiblas

[***]
https://search.r-project.org/CRAN/refmans/flexiblas/html/flexiblas-threads.html

[****]
https://cran.r-project.org/doc/manuals/R-admin.html#BLAS

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Bioc-devel] Question relating to extending a class and inclusion of data

2024-05-21 Thread Vilhelm Suksi
Hi!

Excuse the long email, but there are a number of things to be clarified in 
preparation for submitting the notame package which I have been developing to 
meet Bioconductor guidelines. As of now it passes almost all of the automatic 
checks, with the exception of formatting and some functions that are over 50 
lines long.

Background 1:
The notame package already has a significant following, and was published in 
2020 with an associated protocol article published in the "Metabolomics Data 
Processing and Data Analysis—Current Best Practices" special issue of the 
Metabolites journal (https://www.mdpi.com/2218-1989/10/4/135). The original 
package relies on the MetaboSet container class, which extends ExpressionSet 
with three slots, namely group_col, time_col and subject_col. These slots are 
used to store the names of the corresponding sample data columns, and are used 
as default arguments to most functions. This makes for a more streamlined 
experience. However, the submission guidelines state that existing classes 
should be preferred, such as SummarizedExperiment. We will be implementing 
support for SummarizedExperiment over the summer. We have included a MetaboSet 
- SummarizedExperiment converter for interoperability. 

Q1: Can an initial Bioconductor submission rely on the MetaboSet container 
class? Support for MetaboSet would do well to be included anyway for existing 
users until it is phased out.

Q2: Is it ok to extend the SummarizedExperiment class to utilize the three 
aforementioned slots? It could be called MetaboExperiment. Or should the 
functions be modified such that said columns are specified explicitly, using 
SummarizedExperiment?

Background 2:
The notame package caters to data analysis of untargeted LC-MS metabolic profiling 
experiments, encompassing data pretreatment (quality control, normalization, 
imputation and other steps leading up to feature selection) and feature 
selection (univariate analysis and supervised learning). Raw data preprocessing 
is not supported. Instead, the package offers utilities for flexibly reading 
peak tables from an Excel file, resulting from various point-and-click software 
such as MS-DIAL. As such, data in Excel format needs to be included, but is not 
available in any Bioconductor package, although such Excel data could be 
procured from existing data in Bioconductor. However, existing untargeted LC-MS 
data in Bioconductor cannot be used, as is, to demonstrate the full 
functionality of the notame package. With regard to feature data, there needs 
to be several analytical modes. Sample data needs to include study group, time 
point, subject ID and several batches. Blank samples would be good as well. 
Packages I have checked for data with the above specifications include FaahKO, 
MetaMSdata, msdata, msqc1, mtbls2, pmp, PtH2O2lipids, and ropls. As of now, the 
example data is not realistic in that it is scrambled and I have not yet been 
informed of the origin and modification of the data. 

Q3: If I get access to information about the origin and modification of the now 
used data, can I further modify it to satisfy the needs of the package for an 
initial Bioconductor release? Or does it need to be realistic? Consider this 
the explicit pre-approval inquiry for including data in the notame package.

Q4: Do you think a separate ExperimentData package satisfying the 
specifications laid out in Background 2 is warranted? This could be included in 
a future version with SummarizedExperiment/MetaboExperiment support.

Q5: The instructions state that the data needs to be documented 
(https://contributions.bioconductor.org/docs.html#doc-inst-script). Is the 
availability of the original data strictly necessary?  I notice many packages 
don't include documentation on how the data was procured.

Thanks,
Vilhelm Suksi
Turku Data Science Group
vks...@utu.fi

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] AnnotationHub: uniprot seqs ?

2024-05-21 Thread Aditya Bhagwat via Bioc-devel

Hey guys,

Do we have Uniprot sequences in AnnotationHub ?
Not being able to find them.

Thankyouverymuch : )

Aditya

--
Aditya Bhagwat
Translational Proteomics ∙ Philipps-University Marburg
Biological Pharmacological Center ∙ Room A406
Tel.: +49 6421 28 27403

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Rd] confint Attempts to Use All Server CPUs by Default

2024-05-21 Thread Dario Strbenac via R-devel
Hello,

Would a less resource-intensive value, such as 1, be a safer default CPU value 
for confint? I noticed excessive CPU usage on an I.T. administrator-managed 
server, three-quarters of which was being used by another staff member, when 
the confidence interval calculation in an R Markdown document suddenly went 
from two seconds to ninety seconds because of competition for CPUs between 
users. Also, there is no mention of such parallel processing in ?confint, so it 
was not clear at first where to look for the performance degradation. It could 
at least be described in the manual page, so that users would know that export 
OPENBLAS_NUM_THREADS=1 is a solution.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel