Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures? --> We have just published a preprint to this!

2020-05-15 Thread Bennion, Brian via Rdkit-discuss
Thank you for the link. I will look at it! On May 14, 2020 at 11:25:49 PM PDT, Tuan Le wrote: Hi Brian, I was working on a study to deduce molecular structures given ECFP fingerprints and came across your open

[Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures? --> We have just published a preprint to this!

2020-05-15 Thread Tuan Le
Hi Brian, I was working on a study to deduce molecular structures given ECFP fingerprints and came across your open question on the rdkit mailing-list (https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg07851.html). I really enjoyed reading the discussion in the mailing

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Nathan Brown
From: David Cosgrove <davidacosgrov...@gmail.com> Date: Monday, 23 April 2018 at 17:28 To: Brice Hoffmann <brice.hoffm...@iktos.com> Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structu

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread David Cosgrove
Hi all, I’ve just had the attached from Roger Sayle, which might be of interest. Dave Hi Andrew and Dave, John (Mayfield) has just pointed me at the very interesting discussion raging on sourceforge. Alas, I've no idea how to post/tweet/snapchat a reply, but thought I'd at least contribute this

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Brice Hoffmann
Hi, Another option is to use generative models that use fingerprints as input (e.g., https://arxiv.org/abs/1701.01329, https://pubs.acs.org/doi/10.1021/acs.molpharmaceut.7b00346). If you score the generated molecules by their Tanimoto distance to a given fingerprint, you can often
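
For readers unfamiliar with the scoring function mentioned in this message, here is a minimal sketch of Tanimoto similarity and distance. It assumes a fingerprint is held as a Python set of on-bit indices; that representation and the function names are illustrative choices, not RDKit's API.

```python
def tanimoto_similarity(fp_a, fp_b):
    """Number of shared on-bits divided by the number of bits set in either fingerprint."""
    union = fp_a | fp_b
    if not union:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    return len(fp_a & fp_b) / len(union)


def tanimoto_distance(fp_a, fp_b):
    """Distance used as a score: 0.0 means the bit sets are identical."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)


target = {1, 5, 9, 12}      # hypothetical target fingerprint
candidate = {1, 5, 7}       # hypothetical generated molecule
print(tanimoto_distance(candidate, target))  # 2 shared bits, 5 in the union -> 0.6
```

A distance of 0.0 only says the bit sets match; it does not by itself prove the two structures are the same.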

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Andrew Dalke
On Apr 23, 2018, at 14:54, Brian Cole wrote: > Unfortunately it doesn't work on circular/ECFP-like fingerprints. To be fair, you didn't mention that was a requirement. ;) > It has the requirement that the fingerprint be a substructure fingerprint as > you described. Could

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Brian Cole
Thanks Andrew, very interesting and useful script! Unfortunately it doesn't work on circular/ECFP-like fingerprints. It has the requirement that the fingerprint be a substructure fingerprint as you described. It seems the evolutionary/genetic algorithm approach is the current state-of-the-art for

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Maciek Wójcikowski
> > > which could of course also be changed to something expensive to > calculate. > Yes, that could be possible. Abstractly, let the first 20 bytes of each > fingerprint be a salt, and use something like bcrypt so each fingerprint > test requires that the query structure be re-fingerprinted for

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-22 Thread Andrew Dalke
On Apr 22, 2018, at 20:22, Nils Weskamp wrote: > Actually, I *was* also thinking about your use cases 2 and 3 since you > also need some form of hash function to map substructures to bit > numbers. This is normally a rather simple function / pseudo random > generator,

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-22 Thread Nils Weskamp
Hi Andrew, On 22.04.2018 at 19:35, Andrew Dalke wrote: > I think of what I did here as a bit more elegant than that. ;) I should have looked at the code more carefully before commenting. ;) Nevertheless, you will probably still need many steps for complex structures - although not as many

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-22 Thread Andrew Dalke
On Apr 22, 2018, at 08:42, Nils Weskamp wrote: > Nice work. If brute-force approaches like this (or methods based on > genetic algorithms etc.) are the only way to reverse a fingerprint, one > could probably come up with a fingerprint that allows for pretty secure >

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-22 Thread Nils Weskamp
On 22.04.2018 at 03:04, Andrew Dalke wrote: > Here's an implementation of that sketch, applied to the RDKit hash > fingerprint: Nice work. If brute-force approaches like this (or methods based on genetic algorithms etc.) are the only way to reverse a fingerprint, one could probably come up with

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-21 Thread Andrew Dalke
On Apr 21, 2018, at 01:55, Andrew Dalke wrote: > Hand-waving sketch: start with a carbon. Generate fingerprint. It should pass > the screening test. If not, the structure contains no carbons, so repeat with > other elements until you find an atom which passes.
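
The hand-waving sketch quoted above can be illustrated with a toy model. This is not the script posted to the list. It assumes "molecules" are linear chains written as strings of element letters, and it uses a made-up substructure fingerprint (the set of every run of 1 to 3 atoms) in place of a real hashed fingerprint. The names toy_fingerprint, passes_screen and grow are hypothetical.

```python
def toy_fingerprint(chain, max_len=3):
    """Toy substructure fingerprint: every run of 1..max_len atoms in a linear chain."""
    return {chain[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(chain) - n + 1)}


def passes_screen(candidate, target_fp):
    """Screening test: every feature of the candidate must also be set in the target."""
    return toy_fingerprint(candidate) <= target_fp


def grow(target_fp, alphabet="CNO", max_atoms=10):
    """Start from one atom that passes the screen, then append atoms while it still passes."""
    chain = next((a for a in alphabet if passes_screen(a, target_fp)), None)
    if chain is None:
        return None  # no element in the alphabet is present in the target
    while len(chain) < max_atoms:
        for atom in alphabet:
            if passes_screen(chain + atom, target_fp):
                chain += atom
                break
        else:
            break  # no extension passes the screen, so stop growing
    return chain


target_fp = toy_fingerprint("CCOCN")
print(grow(target_fp))  # CCOCN
```

The greedy extension never backtracks, so on other targets it can stop early or settle on a different chain with the same features. For instance, "CCCC" and "CCCCC" have identical toy fingerprints at max_len=3, so the search cannot tell them apart and simply runs until max_atoms.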

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Andrew Dalke
On Apr 20, 2018, at 19:03, jeff godden wrote: > > Long ago molecular fingerprints were referred to in the literature as > molecular hash functions. (y'know, those crazy mathematical algorithms which > permitted rapid lookup of some string in a lookup table) Do you have a

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread David Cosgrove
Hi Jeff, What you say is theoretically correct, in that it is probably not possible to go from the fingerprint directly to a structure. However, it is possible to generate structures and rapidly compare them to the target fingerprint. The fingerprints are of course able to tell you how close your

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread jeff godden
(getting dangerously old fart chatty here but) we crafted an in-house molecular fingerprint once which was designed to hash out whether a compound would've pissed off the high-throughput/organic-chemists or not. (essentially anything with "exotic atoms" (like Boron?) or strained bonds (like less

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Peter S. Shenkin
Well, @jeff, there's no law saying that hashes must collide, and in fact some are designed to make collision extremely unlikely (can you say "SHA-2"?). But the ones in question here do collide relatively frequently, for at least some molecular fingerprint types. An interesting question (maybe

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread jeff godden
Long ago molecular fingerprints were referred to in the literature as molecular hash functions. (y'know, those crazy mathematical algorithms which permitted rapid lookup of some string in a lookup table) As such, we expected there to be the associated hash collisions (

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Peter S. Shenkin
Isn't it the case that more than one molecule can share an identical fingerprint? (Depending on the specific fingerprint.) Think p-biphenyl, extended to triphenyl, tetraphenyl, etc. Still, a GA or SA method could keep going and come up with multiple matches, plus multiple near-misses. -P. On

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread David Cosgrove
Hi Brian, Dave Weininger once showed a fairly simple genetic algorithm (GA) that could generally deduce a structure from a Daylight fingerprint by using SMILES strings as the chromosomes and Tanimoto distance to the target fingerprint as the fitness function. He may have given a talk about it for MUG or conceivably
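
The general shape of such a search can be sketched as follows. This is not Weininger's implementation, which is not shown in the thread. It is a simplified, mutation-only evolutionary loop (no crossover), and it assumes toy linear chains of element letters as chromosomes in place of SMILES strings and a made-up feature-set fingerprint in place of a Daylight fingerprint. All function names and parameters are hypothetical.

```python
import random


def features(chain, max_len=3):
    """Toy fingerprint: every run of 1..max_len atoms in a linear chain."""
    return {chain[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(chain) - n + 1)}


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity; maximising it is the same as minimising Tanimoto distance."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 1.0


def mutate(chain, alphabet, rng):
    """Randomly insert, replace, or delete one atom; chains never become empty."""
    op = rng.choice(("insert", "replace", "delete"))
    if op == "insert":
        i = rng.randrange(len(chain) + 1)
        return chain[:i] + rng.choice(alphabet) + chain[i:]
    i = rng.randrange(len(chain))
    if op == "replace":
        return chain[:i] + rng.choice(alphabet) + chain[i + 1:]
    return chain[:i] + chain[i + 1:] if len(chain) > 1 else chain


def evolve(target_fp, alphabet="CNO", pop_size=30, generations=200, seed=0):
    """Keep the fitter half of each generation and refill with mutated copies."""
    rng = random.Random(seed)

    def fitness(chain):
        return tanimoto(features(chain), target_fp)

    population = [rng.choice(alphabet) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == 1.0:
            break  # a chain with an identical fingerprint was found
        survivors = population[:pop_size // 2]
        children = [mutate(rng.choice(survivors), alphabet, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


target_fp = features("CCOCN")
best = evolve(target_fp)
print(best, tanimoto(features(best), target_fp))
```

Because the search is stochastic, the result depends on the seed and the generation budget, and a perfect score only means the fingerprints match, not that the original chain was recovered.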

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Nils Weskamp
Hi Brian, in general, it might be difficult to come up with a deterministic algorithm that generates exactly one structure for a given fingerprint due to many ambiguities in the process. If you are happy with a more "fuzzy" (approximate / probabilistic) approach, you might want to take a look at

[Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Brian Cole
Hi Chem-informaticians: I know it has been talked about in the community that fingerprints are not a way to obfuscate molecules for security, but I don't recall a paper actually demonstrating the reverse engineering of a fingerprint into a chemical structure. Does anyone know if such a paper