Hi all,

I have a question about how it is verified that the fingerprinters, as
well as the UniversalIsomorphismTester, actually work correctly.

The question is related to the CDK-based project I'm working on, which I
will "officially release" once I believe it is usable enough.

I use the UniversalIsomorphismTester (UIT) for subgraph matching and the
ExtendedFingerprinter for screening. I had the feeling that the fingerprint
wasn't especially good, at least for the dataset used (part of subset 13 of
ZINC), so I wanted to try out the PubchemFingerprinter. I did, but now I
get a different number of search hits than before; see the tables below.
I'm now wondering whether the bug is on my part or in the fingerprinters
and/or UIT. How can I determine the actually correct result? Especially
since the reference also disagrees with UIT.
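
One property can be checked without any reference program: fingerprint
screening should only discard molecules that cannot match, so a confirmed
UIT match should never be rejected by the screen. The sketch below
illustrates that invariant with plain java.util.BitSet fingerprints. The
class and method names are hypothetical, the bit patterns are made up, and
no CDK classes are involved; in a real run the bit sets would come from the
fingerprinter and the match flags from the unscreened UIT search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Sketch of the fingerprint screening invariant (hypothetical helper, no CDK calls). */
final class ScreeningInvariant {

    /** True if every bit set in the query fingerprint is also set in the target. */
    static boolean passesScreen(BitSet query, BitSet target) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // bits required by the query but absent in the target
        return missing.isEmpty();
    }

    /**
     * Returns the indices of confirmed matches that the screen would reject.
     * Every entry is a screening false negative, which should not occur.
     */
    static List<Integer> falseNegatives(BitSet query, List<BitSet> targets, List<Boolean> isMatch) {
        List<Integer> lost = new ArrayList<>();
        for (int i = 0; i < targets.size(); i++) {
            if (isMatch.get(i) && !passesScreen(query, targets.get(i))) {
                lost.add(i);
            }
        }
        return lost;
    }

    /** Convenience builder for the made-up fingerprints below. */
    static BitSet bits(int... positions) {
        BitSet b = new BitSet();
        for (int p : positions) {
            b.set(p);
        }
        return b;
    }

    public static void main(String[] args) {
        BitSet query = bits(1, 4);
        List<BitSet> targets = Arrays.asList(bits(1, 4, 7), bits(1, 7), bits(4));
        // Made-up ground truth standing in for an unscreened subgraph test.
        List<Boolean> isMatch = Arrays.asList(true, true, false);
        System.out.println("Lost matches at indices: " + falseNegatives(query, targets, isMatch));
    }
}
```

Any index reported here points at a molecule worth inspecting by hand:
either the fingerprint dropped a bit it should have set, or the subgraph
test reported a match that is not one.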

PubchemFingerprinter:

SMILES                    Screening Hits    Hits
CCC(C)C(C)C(C)C                     8599     344
O(C)C(C)C(C)C(C)C                    938      28
CCCCCC(C)CC                         9227    1547
N(C)(C)CC(C)C                      15861    8893
O(CC)C(N(C)C)C                      1365      83
CC(C)C(C)C(C(C)C)C(C)C              8599       0

ExtendedFingerprinter:

SMILES                    Screening Hits    Hits
CCC(C)C(C)C(C)C                    22488     429
O(C)C(C)C(C)C(C)C                   9398      77
CCCCCC(C)CC                         3955    1603
N(C)(C)CC(C)C                      88301   10917
O(CC)C(N(C)C)C                      1588      74
CC(C)C(C)C(C(C)C)C(C)C             22488       0

No Screening, just UIT:

SMILES                        Hits
CCC(C)C(C)C(C)C                436
O(C)C(C)C(C)C(C)C               77
CCCCCC(C)CC                   2171
N(C)(C)CC(C)C                11412
O(CC)C(N(C)C)C                 139
CC(C)C(C)C(C(C)C)C(C)C           0

As a reference, the same searches were run in ChemFinder over the same
data set:

SMILES                    Hits Found in ChemFinder
CCC(C)C(C)C(C)C                                427
O(C)C(C)C(C)C(C)C                               77
CCCCCC(C)CC                                   1825
N(C)(C)CC(C)C                                11412
O(CC)C(N(C)C)C                                 109
CC(C)C(C)C(C(C)C)C(C)C                           0
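
The four sets of counts can be compared directly. The sketch below uses
only the numbers quoted in this message and subtracts each other count from
the unscreened UIT count. Assuming every screened hit is also an unscreened
hit, a positive difference in the first two columns is the number of
matches lost by screening. The class name and output layout are
hypothetical; nothing chemical is computed.

```java
/** Tabulates the hit counts quoted in this message (hypothetical helper class). */
final class HitCountComparison {

    static final String[] QUERIES = {
        "CCC(C)C(C)C(C)C", "O(C)C(C)C(C)C(C)C", "CCCCCC(C)CC",
        "N(C)(C)CC(C)C", "O(CC)C(N(C)C)C", "CC(C)C(C)C(C(C)C)C(C)C"
    };

    // Confirmed hits after screening with each fingerprinter.
    static final int[] PUBCHEM = {344, 28, 1547, 8893, 83, 0};
    static final int[] EXTENDED = {429, 77, 1603, 10917, 74, 0};
    // Hits with no screening, and hits reported by ChemFinder.
    static final int[] UIT_ONLY = {436, 77, 2171, 11412, 139, 0};
    static final int[] CHEMFINDER = {427, 77, 1825, 11412, 109, 0};

    /** Element-wise a[i] - b[i]. */
    static int[] difference(int[] a, int[] b) {
        int[] d = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            d[i] = a[i] - b[i];
        }
        return d;
    }

    public static void main(String[] args) {
        int[] lostPubchem = difference(UIT_ONLY, PUBCHEM);
        int[] lostExtended = difference(UIT_ONLY, EXTENDED);
        int[] vsChemFinder = difference(UIT_ONLY, CHEMFINDER);

        System.out.printf("%-24s %8s %9s %11s%n",
                "SMILES", "-Pubchem", "-Extended", "-ChemFinder");
        for (int i = 0; i < QUERIES.length; i++) {
            System.out.printf("%-24s %8d %9d %11d%n",
                    QUERIES[i], lostPubchem[i], lostExtended[i], vsChemFinder[i]);
        }
    }
}
```

Rows where the first two columns are non-zero are the queries for which
screening and the unscreened search disagree, so those are the ones to
investigate first.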

Best Regards,

Joos