[Rdkit-discuss] use cases for weighted sampling of a compound library

Christopher Mayer-Bacon Sun, 11 Dec 2022 09:25:06 -0800

Hello all,

I’m starting a project that explores sampling from a large compound
library.  My question is less about how to do something and more about
the specific use cases for weighted sampling from a compound library.


Given a large compound library and a smaller, reference library, I want to
take random samples from the large library such that the samples resemble
the reference library in some way.  At the moment I’m focused on element
composition (% of carbon atoms, % of oxygen atoms, etc.), but I’m open to
using other features in the future.
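
To make the setup above concrete, here is a minimal sketch of one possible way to do composition-weighted sampling. It is not necessarily the approach I have in mind, and every name in it (element_fractions, mean_fractions, weight, weighted_sample, the temperature parameter) is hypothetical. It assumes each compound is given as a simple molecular formula string with no parentheses or charges (with RDKit, such formulas could presumably come from rdMolDescriptors.CalcMolFormula), measures resemblance as the L1 distance to the reference library's mean element fractions, and draws without replacement using Efraimidis-Spirakis style random keys.

```python
import math
import random
import re
from collections import Counter


def element_fractions(formula):
    """Return {element: fraction of atoms} for a simple formula such as 'C2H6O'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    total = sum(counts.values())
    return {el: n / total for el, n in counts.items()}


def mean_fractions(formulas):
    """Average the per-compound element fractions over a (reference) library."""
    totals = Counter()
    for formula in formulas:
        for el, frac in element_fractions(formula).items():
            totals[el] += frac
    return {el: s / len(formulas) for el, s in totals.items()}


def weight(formula, target, temperature=0.1):
    """Weight in (0, 1]: 1.0 when the composition equals the target exactly."""
    fracs = element_fractions(formula)
    elements = set(fracs) | set(target)
    distance = sum(abs(fracs.get(el, 0.0) - target.get(el, 0.0)) for el in elements)
    return math.exp(-distance / temperature)


def weighted_sample(library, reference, k, temperature=0.1, rng=random):
    """Draw k distinct compounds, favouring those that resemble the reference."""
    target = mean_fractions(reference)
    keyed = []
    for formula in library:
        # Guard against underflow to 0.0 when temperature is very small.
        w = max(weight(formula, target, temperature), 1e-300)
        # Assumed technique: key = u ** (1 / w); the k largest keys form a
        # weighted sample without replacement.
        keyed.append((rng.random() ** (1.0 / w), formula))
    keyed.sort(reverse=True)
    return [formula for _, formula in keyed[:k]]


# Toy data: glycine and alanine as the reference library.
reference = ["C2H5NO2", "C3H7NO2"]
library = ["C6H6", "C2H6O", "C3H7NO3", "C4H10", "C5H11NO2"]
print(weighted_sample(library, reference, k=2, rng=random.Random(0)))
```

A smaller temperature makes the samples track the reference composition more tightly, while a larger one approaches uniform random sampling. Other features could be swapped in by replacing element_fractions with a different descriptor vector.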

I have an idea of how to perform this sampling; my question for this
community concerns a possible use case.  What would be the benefit of
sampling from a compound library such that the samples resemble another
library in some way?  I can think of a use case for my specific research
niche (adaptive properties of the canonical amino acid alphabet), but I
can’t think of another potential use case.  The RDKit community has
a wide variety of backgrounds and expertise, which is why I wanted to pose
this question to you all.

-Chris

-- 
-Christopher Mayer-Bacon (*he/him/his*)
PhD student
Department of Biological Sciences
University of Maryland, Baltimore County
