Hi CDK users,
I've released chemfp 5.0, my Python package for cheminformatics
fingerprint generation, search, and analysis. You can install it on
Linux-based OSes using:
python -m pip install chemfp -i https://chemfp.com/packages/
(Append "--upgrade" if you have already installed it.)
For a description of the changes since 4.2 see
https://chemfp.com/docs/whats_new_in_50.html .
The highlights are:
• Update the FPB format to handle over 1 billion fingerprints.
• New chemfp shardsearch command-line tool, which does similarity
search across multiple target files and merges the results.
- Tested with the 977 million structures in GDB-13
• New chemfp simhistogram / chemfp simhist command-line tool and
corresponding chemfp.simhistogram() high-level API function
to create a histogram of similarity scores.
• Initial support for count fingerprints:
- new text-based FPC format based on the FPS format
- rdkit2fpc tool which uses RDKit's sparse fingerprint generators
- fpc2fps tool with various methods to convert sparse count
fingerprints to binary fingerprints
• Fast implementations of the 4860-bit Klekota-Roth fingerprint
for the OpenEye and RDKit toolkits.
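To give a rough idea of what "search across multiple target files and
merge the results" means, here is a minimal pure-Python sketch. It is
not chemfp's implementation or API. The function names (tanimoto,
search_shard, shard_search) are hypothetical, fingerprints are assumed
to be plain Python integers used as bit sets, and each "shard" is an
in-memory list of (id, fingerprint) pairs rather than an FPB file.

```python
import heapq
from itertools import islice


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def search_shard(query, shard, k):
    """Return the k best (score, id) hits in one shard, best first.

    `shard` is an iterable of (id, fingerprint) pairs (illustrative
    stand-in for one target file).
    """
    scored = ((tanimoto(query, fp), id_) for id_, fp in shard)
    return heapq.nlargest(k, scored)


def shard_search(query, shards, k):
    """Search each shard independently, then merge the sorted hit lists."""
    per_shard = [search_shard(query, shard, k) for shard in shards]
    # Each per-shard list is sorted from highest to lowest score, so a
    # k-way merge yields the global ranking without re-sorting everything.
    merged = heapq.merge(*per_shard, reverse=True)
    return list(islice(merged, k))


# Tiny made-up data set: two shards with two fingerprints each.
query = 0b1111
shard_a = [("a1", 0b1111), ("a2", 0b0011)]
shard_b = [("b1", 0b0111), ("b2", 0b1000)]

hits = shard_search(query, [shard_a, shard_b], k=3)
for score, id_ in hits:
    print(id_, score)
```

Only the top k hits from each shard need to be kept, because a hit
outside a shard's own top k cannot appear in the global top k.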
Chemfp can generate CDK fingerprints using the JPype bridge. For details see
https://chemfp.com/docs/installing.html#installing-cdk-and-jpype
The CDK-specific changes in chemfp 5.0 support new features added in
CDK 2.10 and 2.11, and add a new "prepare" option which, if True (the
default), identifies rings and perceives aromaticity when reading a
structure file.
For more details see https://chemfp.com/docs/whats_new_in_50.html#cdk
Cheers,
Andrew Dalke
_______________________________________________
Cdk-user mailing list
https://lists.sourceforge.net/lists/listinfo/cdk-user