I do appreciate that you're willing to help out with this, but I would go
for proposal 4. No API change required, zero maintainence, and it will
never be incorrect. We can always provide information after a release that
the fingerprint didn't change between versions.

Regards,
- Noel

On Wed, 9 Jan 2019 at 20:23, Andrew Dalke <da...@dalkescientific.com> wrote:

> On Jan 9, 2019, at 15:45, Noel O'Boyle <baoille...@gmail.com> wrote:
> > Making such an API addition adds a maintainence would commit us to
> correctly maintaining the underlying information, and that's a maintainence
> task I'm not willing to take on.
>
> Certainly, and I appreciate that maintenance is an important factor.
>
> Let me try this again.
>
> Hi! Open Babel supports the FPS output format. What can I do to help
> maintain that plugin?
>
> When Open Babel decided to support the FPS format, I realized they took on
> a maintenance task. For example, the FPS format at that time, at
> https://code.google.com/archive/p/chem-fingerprints/wikis/FPS.wiki , says
> "The version MUST change whenever the underlying fingerprint algorithm
> changes."
>
> I noticed that the implementations in version control have changed. I want
> to help update the version numbers so they will be ready for the next
> release.
>
> I have a lot of Python experience but haven't really used C++ in 20 years
> - and even then without shared library experience - so I'm afraid I don't
> really know what I'm doing with that language. But I'll try!
>
> My proposals to simply update the version number to "/3", and to change
> the plugin API to allow version numbers, don't seem like they were
> acceptable. What about changing the plugin so it is structured like this:
>
>   const char *plugin_version = "1";
>   const char *plugin_id = _pFP->GetID();
>
>   if (!strcmp(plugin_id, "MACCS"))) {
>     plugin_version = "3";
>   }
>   ...
>         << "#type=OpenBabel-" << plugin_id << "/" << plugin_version << '\n'
>
> so that it is possible to have different version numbers for each
> fingerprint type, embedded as logic inside of the plugin?
>
> If this third alternative seems like it might be acceptable, then I can
> work on it further and submit a pull request for a more detailed
> examination.
>
> > In addition, I don't personally think that this information adds
> anything beyond reporting the version number.
>
> And I personally think it does. I know I've liked it when chemfp reported
> mismatched versions when I accidentally used an outdated file.
>
> But that's okay. Open Babel doesn't need to follow my feelings. I can
> totally see that if the overall fingerprint generation process in Open
> Babel effectively changes for every release, then the type version adds
> nothing.
>
> In that case, here's a fourth proposal. Change the FPS plugin so the
> output "#type" includes the Babel version, that is:
>
>         << "#type=OpenBabel-" << _pFP->GetID() << "/" <<  BABEL_VERSION <<
> '\n'
>
> The output would look like something like "OpenBabel-MACCS/2.4.1", which
> is a valid FPS fingerprint type string.
>
> This type string provides a hint to all downstream users that they need to
> re-build their fingerprints after every Open Babel upgrade.
>
> If this seems like it might be acceptable, then I can submit a pull
> request for it.
>
>
> For some background on why there is a "#type" line with a fingerprint type
> version, and a "#software" line with a version of the software, that's
> because other tools have different needs than Open Babel. I might have my
> own tool built on Open Babel which generates fingerprints. The releases are:
>
>   1.0 - initial release, v1 of the fingerprint
>
>   1.1 - performance improvements, fingerprints don't change
>
>   1.2 - fix a bug in my implementation, call the new version v2,
>      but allow v1 to be selected as a command-line option
>
>   1.3 - port to MS Windows, otherwise unchanged
>
>   1.4 - change to require C++14 instead of C++03, otherwise unchanged
>
>   1.5 - switch to faster PRNG as the default, call it v3,
>      and allow v1 and v2 to be selected as a command-line option.
>
> That is, version 1.5 of my program can generate versions 1, 2, and 3 of
> the fingerprint type I created, selectable at run-time. These fingerprint
> types might be available for reproducibility or contractual reasons, among
> others.
>
> In this case, the program version cannot be used to distinguish between
> the fingerprint types, which is why there is a distinct type.
>
> As another example, RDKit's and OpenEye's fingerprints have not changed
> for many years, though they both have had multiple releases during that
> time. Releases for those toolkits typically add features elsewhere. There
> is no reason for users of those toolkits to re-build their fingerprint data
> sets after each release.
>
> My hope is to simplify and standardize the mechanism used to inform users
> that there may be a problem. For example, a web server might use a
> component, installed with 'pip', which has a hard-coded set of
> fingerprints. Users might not even be aware that that FPS file exists. To
> help reduce silent errors, that component might generate a log message
> about a potential problem if it sees a version skew.
>
> That's why chemfp includes a function to compare two FPS headers and
> generate a list of errors (like comparing 1024 vs. 2048 bit fingerprints),
> warnings (e.g., different fingerprint types) and "info"-level messages
> (e.g., different program versions) for things that typically aren't
> important - and lets the calling software decide what to do with each
> severity level and message type.
>
> Best regards,
>
>
>                                 Andrew
>                                 da...@dalkescientific.com
>
>
>
>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Reply via email to